qbit/go - go - Tape:neT

mirror of https://github.com/golang/go synced 2024-11-20 03:04:40 -07:00

Author	SHA1	Message	Date
Dmitry Vyukov	ee0305e036	runtime: remove dead code runtime.free has long gone. Change-Id: I058f69e6481b8fa008e1951c29724731a8a3d081 Reviewed-on: https://go-review.googlesource.com/16593 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com>	2015-11-03 19:20:21 +00:00
Austin Clements	b6c0934a9b	runtime: cache two workbufs to reduce contention Currently the gcWork abstraction caches a single work buffer. As a result, if a worker is putting and getting pointers right at the boundary of a work buffer, it can flap between work buffers and (potentially significantly) increase contention on the global work buffer lists. This change modifies gcWork to instead cache two work buffers and switch off between them. This introduces one buffers' worth of hysteresis and eliminates the above performance worst case by amortizing the cost of getting or putting a work buffer over at least one buffers' worth of work. In practice, it's difficult to trigger this worst case with reasonably large work buffers. On the garbage benchmark, this reduces the max writes/sec to the global work list from 32K to 25K and the median from 6K to 5K. However, if a workload were to trigger this worst case behavior, it could significantly drive up this contention. This has negligible effects on the go1 benchmarks and slightly speeds up the garbage benchmark. 
name old time/op new time/op delta XBenchGarbage-12 5.90ms ± 3% 5.83ms ± 4% -1.18% (p=0.011 n=18+18) name old time/op new time/op delta BinaryTree17-12 3.22s ± 4% 3.17s ± 3% -1.57% (p=0.009 n=19+20) Fannkuch11-12 2.44s ± 1% 2.53s ± 4% +3.78% (p=0.000 n=18+19) FmtFprintfEmpty-12 50.2ns ± 2% 50.5ns ± 5% ~ (p=0.631 n=19+20) FmtFprintfString-12 167ns ± 1% 166ns ± 1% ~ (p=0.141 n=20+20) FmtFprintfInt-12 162ns ± 1% 159ns ± 1% -1.80% (p=0.000 n=20+20) FmtFprintfIntInt-12 277ns ± 2% 263ns ± 1% -4.78% (p=0.000 n=20+18) FmtFprintfPrefixedInt-12 240ns ± 1% 232ns ± 2% -3.25% (p=0.000 n=20+20) FmtFprintfFloat-12 311ns ± 1% 315ns ± 2% +1.17% (p=0.000 n=20+20) FmtManyArgs-12 1.05µs ± 2% 1.03µs ± 2% -1.72% (p=0.000 n=20+20) GobDecode-12 8.65ms ± 1% 8.71ms ± 2% +0.68% (p=0.001 n=19+20) GobEncode-12 6.51ms ± 1% 6.54ms ± 1% +0.42% (p=0.047 n=20+19) Gzip-12 318ms ± 2% 315ms ± 2% -1.20% (p=0.000 n=19+19) Gunzip-12 42.2ms ± 2% 42.1ms ± 1% ~ (p=0.667 n=20+19) HTTPClientServer-12 62.5µs ± 1% 62.4µs ± 1% ~ (p=0.110 n=20+18) JSONEncode-12 16.8ms ± 1% 16.8ms ± 2% ~ (p=0.569 n=19+20) JSONDecode-12 60.8ms ± 2% 59.8ms ± 1% -1.69% (p=0.000 n=19+19) Mandelbrot200-12 3.87ms ± 1% 3.85ms ± 0% -0.61% (p=0.001 n=20+17) GoParse-12 3.76ms ± 2% 3.76ms ± 1% ~ (p=0.698 n=20+20) RegexpMatchEasy0_32-12 100ns ± 2% 101ns ± 2% ~ (p=0.065 n=19+20) RegexpMatchEasy0_1K-12 342ns ± 2% 333ns ± 1% -2.82% (p=0.000 n=20+19) RegexpMatchEasy1_32-12 83.3ns ± 2% 83.2ns ± 2% ~ (p=0.692 n=20+19) RegexpMatchEasy1_1K-12 498ns ± 2% 490ns ± 1% -1.52% (p=0.000 n=18+20) RegexpMatchMedium_32-12 131ns ± 2% 131ns ± 2% ~ (p=0.464 n=20+18) RegexpMatchMedium_1K-12 39.3µs ± 2% 39.6µs ± 1% +0.77% (p=0.000 n=18+19) RegexpMatchHard_32-12 2.04µs ± 2% 2.06µs ± 1% +0.69% (p=0.009 n=19+20) RegexpMatchHard_1K-12 61.4µs ± 2% 62.1µs ± 1% +1.21% (p=0.000 n=19+20) Revcomp-12 534ms ± 1% 529ms ± 1% -0.97% (p=0.000 n=19+16) Template-12 70.4ms ± 2% 70.0ms ± 1% ~ (p=0.070 n=19+19) TimeParse-12 359ns ± 3% 344ns ± 1% -4.15% (p=0.000 n=19+19) TimeFormat-12 
357ns ± 1% 361ns ± 2% +1.05% (p=0.002 n=20+20) [Geo mean] 62.4µs 62.0µs -0.56% name old speed new speed delta GobDecode-12 88.7MB/s ± 1% 88.1MB/s ± 2% -0.68% (p=0.001 n=19+20) GobEncode-12 118MB/s ± 1% 117MB/s ± 1% -0.42% (p=0.046 n=20+19) Gzip-12 60.9MB/s ± 2% 61.7MB/s ± 2% +1.21% (p=0.000 n=19+19) Gunzip-12 460MB/s ± 2% 461MB/s ± 1% ~ (p=0.661 n=20+19) JSONEncode-12 116MB/s ± 1% 115MB/s ± 2% ~ (p=0.555 n=19+20) JSONDecode-12 31.9MB/s ± 2% 32.5MB/s ± 1% +1.72% (p=0.000 n=19+19) GoParse-12 15.4MB/s ± 2% 15.4MB/s ± 1% ~ (p=0.653 n=20+20) RegexpMatchEasy0_32-12 317MB/s ± 2% 315MB/s ± 2% ~ (p=0.141 n=19+20) RegexpMatchEasy0_1K-12 2.99GB/s ± 2% 3.07GB/s ± 1% +2.86% (p=0.000 n=20+19) RegexpMatchEasy1_32-12 384MB/s ± 2% 385MB/s ± 2% ~ (p=0.672 n=20+19) RegexpMatchEasy1_1K-12 2.06GB/s ± 2% 2.09GB/s ± 1% +1.54% (p=0.000 n=18+20) RegexpMatchMedium_32-12 7.62MB/s ± 2% 7.63MB/s ± 2% ~ (p=0.800 n=20+18) RegexpMatchMedium_1K-12 26.0MB/s ± 1% 25.8MB/s ± 1% -0.77% (p=0.000 n=18+19) RegexpMatchHard_32-12 15.7MB/s ± 2% 15.6MB/s ± 1% -0.69% (p=0.010 n=19+20) RegexpMatchHard_1K-12 16.7MB/s ± 2% 16.5MB/s ± 1% -1.19% (p=0.000 n=19+20) Revcomp-12 476MB/s ± 1% 481MB/s ± 1% +0.97% (p=0.000 n=19+16) Template-12 27.6MB/s ± 2% 27.7MB/s ± 1% ~ (p=0.071 n=19+19) [Geo mean] 99.1MB/s 99.3MB/s +0.27% Change-Id: I68bcbf74ccb716cd5e844a554f67b679135105e6 Reviewed-on: https://go-review.googlesource.com/16042 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-03 19:12:10 +00:00
Dmitry Vyukov	bf606094ee	runtime: fix finalization and profiling of tiny allocations Handling of special records for tiny allocations has two problems: 1. Once we queue a finalizer we mark the object. As a result, any subsequent finalizers for the same object will not be queued during this GC cycle. If we have 16 finalizers set up (the worst case), finalization will take 16 GC cycles. This is what caused tinyfin.go to misbehave. The actual flakiness was caused by the fact that fing is asynchronous and doesn't always run before the check. 2. If a tiny block has both finalizer and profile specials, it is possible that we queue the finalizer, keep the object live, and free the profile record all at once. As a result, the heap profile can be skewed. Fix both issues by analyzing all special records for a single object at once. Also, make the tinyfin test stricter and remove its reliance on real time. Also, add a test for problem 2. Currently the heap profile misses about half of live memory. Fixes #13100 Change-Id: I9ae4dc1c44893724138a4565ca5cae29f2e97544 Reviewed-on: https://go-review.googlesource.com/16591 Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Dmitry Vyukov <dvyukov@google.com>	2015-11-03 18:57:18 +00:00
Ilya Tocar	95333aea53	strings: add asm version of Index() for short strings on amd64 Currently we have a special case for 1-byte strings. This extends it to strings shorter than 32 bytes on amd64. Results (broadwell): name old time/op new time/op delta IndexRune-4 57.4ns ± 0% 57.5ns ± 0% +0.10% (p=0.000 n=20+19) IndexRuneFastPath-4 20.4ns ± 0% 20.4ns ± 0% ~ (all samples are equal) Index-4 21.0ns ± 0% 21.8ns ± 0% +3.81% (p=0.000 n=20+20) LastIndex-4 7.07ns ± 1% 6.98ns ± 0% -1.21% (p=0.000 n=20+16) IndexByte-4 18.3ns ± 0% 18.3ns ± 0% ~ (all samples are equal) IndexHard1-4 1.46ms ± 0% 0.39ms ± 0% -73.06% (p=0.000 n=16+16) IndexHard2-4 1.46ms ± 0% 0.30ms ± 0% -79.55% (p=0.000 n=18+18) IndexHard3-4 1.46ms ± 0% 0.66ms ± 0% -54.68% (p=0.000 n=19+19) LastIndexHard1-4 1.46ms ± 0% 1.46ms ± 0% -0.01% (p=0.036 n=18+20) LastIndexHard2-4 1.46ms ± 0% 1.46ms ± 0% ~ (p=0.588 n=19+19) LastIndexHard3-4 1.46ms ± 0% 1.46ms ± 0% ~ (p=0.283 n=17+20) IndexTorture-4 11.1µs ± 0% 11.1µs ± 0% +0.01% (p=0.000 n=18+17) Change-Id: I892781549f558f698be4e41f9f568e3d0611efb5 Reviewed-on: https://go-review.googlesource.com/16430 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>	2015-11-03 16:04:28 +00:00
Austin Clements	1870572180	runtime: enlarge GC work buffer size Currently the GC work buffers are only 256 bytes and hence can record only 24 64-bit pointers. They were reduced from 4K in commits `db7fd1c` and `a15818f` as a way to minimize the amount of work the per-P workbuf caches could "hide" from the mark phase and carry into the mark termination phase. However, this approach wasn't very robust and we later added a "mark 2" phase to address this problem head-on. Because of mark 2, there's now no benefit to having very small work buffers. But there are plenty of downsides: small work buffers increase contention on the work lists, increase the frequency and hence net overhead of acquiring and releasing work buffers, and somewhat increase memory overhead of the GC. This commit expands work buffers back to 4K (504 64-bit pointers). This reduces the rate of writes to work.full in the garbage benchmark from a peak of ~780,000 writes/sec to a peak of ~32,000 writes/sec. This has negligible effect on the go1 benchmarks. It slightly slows down the garbage benchmark. name old time/op new time/op delta XBenchGarbage-12 5.37ms ± 5% 5.60ms ± 2% +4.37% (p=0.000 n=20+20) Change-Id: Ic9cc28e7a125d23d9faf4f5e690fb8aa9bcdfb28 Reviewed-on: https://go-review.googlesource.com/15893 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-03 15:53:38 +00:00
Austin Clements	456528304d	runtime: make assists preemptible Currently, assists are non-preemptible, which means a heavily assisting G can block other Gs from running. At the beginning of a GC cycle, it can also delay scang, which will spin until the assist is done. Since scanning is currently done sequentially, this can seriously extend the length of the scan phase. Fix this by making assists preemptible. Since the assist holds work buffers and runs on the system stack, this must be done cooperatively: we make gcDrainN return on preemption, and make the assist return from the system stack and voluntarily Gosched. This is prerequisite to enlarging the work buffers. Without this change, the delays and spinning in scang increase significantly. This has no effect on the go1 benchmarks. name old time/op new time/op delta XBenchGarbage-12 5.72ms ± 4% 5.37ms ± 5% -6.11% (p=0.000 n=20+20) Change-Id: I829e732a0f23b126da633516a1a9ec1a508fdbf1 Reviewed-on: https://go-review.googlesource.com/15894 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-03 15:53:31 +00:00
Austin Clements	15aa6bbd5a	runtime: replace assist sleep loop with park/ready GC assists must block until the assist can be satisfied (either through stealing credit or doing work) or the GC cycle ends. Currently, this is implemented as a retry loop with a 100 µs delay. This obviously isn't ideal, as it wastes CPU and delays mutator execution. It also has the somewhat peculiar downside that sleeping a G requires allocation, and this requires working around recursive allocation. Replace this timed delay with a proper scheduling queue. When an assist can't be satisfied immediately, it adds the allocating G to a queue and parks it. Any time background scan credit is flushed, it consults this queue, directly satisfies the debt of queued assists, and wakes up satisfied assists before flushing any remaining credit to the background credit pool. No effect on the go1 benchmarks. Slightly speeds up the garbage benchmark. name old time/op new time/op delta XBenchGarbage-12 5.81ms ± 1% 5.72ms ± 4% -1.65% (p=0.011 n=20+20) Updates #12041. Change-Id: I8ee3b6274dd097b12b10a8030796a958a4b0e7b7 Reviewed-on: https://go-review.googlesource.com/15890 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-03 15:53:25 +00:00
Austin Clements	0ca4488cc1	runtime: change p.runq from []*g to []guintptr This eliminates many write barriers in the scheduler code that are unnecessary and will interfere with upcoming changes where the garbage collector will have to invoke run queue functions in contexts that must not have write barriers. Change-Id: I702d0ac99cfd00ffff406e7362917db6a43e7e55 Reviewed-on: https://go-review.googlesource.com/16556 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2015-11-03 15:53:18 +00:00
Todd Neal	e3e0122ae2	test: use go:noinline consistently Replace various implementations of inlining prevention with "go:noinline" Change-Id: Iac90895c3a62d6f4b7a6c72e11e165d15a0abfa4 Reviewed-on: https://go-review.googlesource.com/16510 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Todd Neal <todd@tneal.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-03 02:01:34 +00:00
Ilya Tocar	0e23ca41d9	bytes: speed up Compare() on amd64 Use AVX2 if available. Results (haswell), below: name old time/op new time/op delta BytesCompare1-6 11.4ns ± 0% 11.4ns ± 0% ~ (all samples are equal) BytesCompare2-6 11.4ns ± 0% 11.4ns ± 0% ~ (all samples are equal) BytesCompare4-6 11.4ns ± 0% 11.4ns ± 0% ~ (all samples are equal) BytesCompare8-6 9.29ns ± 2% 8.76ns ± 0% -5.72% (p=0.000 n=16+17) BytesCompare16-6 9.29ns ± 2% 9.20ns ± 0% -1.02% (p=0.000 n=20+16) BytesCompare32-6 11.4ns ± 1% 11.4ns ± 0% ~ (p=0.191 n=20+20) BytesCompare64-6 14.4ns ± 0% 13.1ns ± 0% -8.68% (p=0.000 n=20+20) BytesCompare128-6 20.2ns ± 0% 18.5ns ± 0% -8.27% (p=0.000 n=16+20) BytesCompare256-6 29.3ns ± 0% 24.5ns ± 0% -16.38% (p=0.000 n=16+16) BytesCompare512-6 46.8ns ± 0% 37.1ns ± 0% -20.78% (p=0.000 n=18+16) BytesCompare1024-6 82.9ns ± 0% 62.3ns ± 0% -24.86% (p=0.000 n=20+14) BytesCompare2048-6 155ns ± 0% 112ns ± 0% -27.74% (p=0.000 n=20+20) CompareBytesEqual-6 10.1ns ± 1% 10.0ns ± 1% ~ (p=0.527 n=20+20) CompareBytesToNil-6 10.0ns ± 2% 9.4ns ± 0% -6.57% (p=0.000 n=20+17) CompareBytesEmpty-6 8.76ns ± 0% 8.76ns ± 0% ~ (all samples are equal) CompareBytesIdentical-6 8.76ns ± 0% 8.76ns ± 0% ~ (all samples are equal) CompareBytesSameLength-6 10.6ns ± 1% 10.6ns ± 1% ~ (p=0.240 n=20+20) CompareBytesDifferentLength-6 10.6ns ± 0% 10.6ns ± 1% ~ (p=1.000 n=20+20) CompareBytesBigUnaligned-6 132µs ± 1% 105µs ± 1% -20.61% (p=0.000 n=20+18) CompareBytesBig-6 125µs ± 1% 105µs ± 1% -16.31% (p=0.000 n=20+20) CompareBytesBigIdentical-6 8.13ns ± 0% 8.13ns ± 0% ~ (all samples are equal) name old speed new speed delta CompareBytesBigUnaligned-6 7.94GB/s ± 1% 10.01GB/s ± 1% +25.96% (p=0.000 n=20+18) CompareBytesBig-6 8.38GB/s ± 1% 10.01GB/s ± 1% +19.48% (p=0.000 n=20+20) CompareBytesBigIdentical-6 129TB/s ± 0% 129TB/s ± 0% +0.01% (p=0.003 n=17+19) Change-Id: I820f31bab4582dd4204b146bb077c0d2f24cd8f5 Reviewed-on: https://go-review.googlesource.com/16434 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> 
Reviewed-by: Klaus Post <klauspost@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2015-11-02 18:39:38 +00:00
Michael Hudson-Doyle	35d71d6727	cmd/go, runtime: define GOBUILDMODE_shared rather than shared when dynamically linking To avoid collisions with what existing code may already be doing. Change-Id: Ice639440aafc0724714c25333d90a49954372230 Reviewed-on: https://go-review.googlesource.com/16503 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-11-01 19:52:33 +00:00
Austin Clements	fbf273250f	runtime: perform mark 2 root re-scanning in GC workers This moves another root scanning task out of the GC coordinator and parallelizes it on the GC workers. This has negligible effect on the go1 benchmarks and the garbage benchmark. name old time/op new time/op delta XBenchGarbage-12 5.24ms ± 1% 5.26ms ± 1% +0.30% (p=0.007 n=18+17) name old time/op new time/op delta BinaryTree17-12 3.20s ± 5% 3.21s ± 5% ~ (p=0.264 n=20+18) Fannkuch11-12 2.46s ± 1% 2.54s ± 2% +3.09% (p=0.000 n=18+20) FmtFprintfEmpty-12 49.9ns ± 4% 50.0ns ± 5% ~ (p=0.356 n=20+20) FmtFprintfString-12 170ns ± 1% 170ns ± 2% ~ (p=0.815 n=19+20) FmtFprintfInt-12 160ns ± 1% 159ns ± 1% -0.63% (p=0.003 n=18+19) FmtFprintfIntInt-12 270ns ± 1% 267ns ± 1% -1.00% (p=0.000 n=19+18) FmtFprintfPrefixedInt-12 238ns ± 1% 232ns ± 1% -2.28% (p=0.000 n=19+19) FmtFprintfFloat-12 310ns ± 2% 313ns ± 2% +0.93% (p=0.000 n=19+19) FmtManyArgs-12 1.06µs ± 1% 1.04µs ± 1% -1.93% (p=0.000 n=20+19) GobDecode-12 8.63ms ± 1% 8.70ms ± 1% +0.81% (p=0.001 n=20+19) GobEncode-12 6.52ms ± 1% 6.56ms ± 1% +0.66% (p=0.000 n=20+19) Gzip-12 318ms ± 1% 319ms ± 1% ~ (p=0.405 n=17+18) Gunzip-12 42.1ms ± 2% 42.0ms ± 1% ~ (p=0.771 n=20+19) HTTPClientServer-12 62.6µs ± 1% 62.9µs ± 1% +0.41% (p=0.038 n=20+20) JSONEncode-12 16.9ms ± 1% 16.9ms ± 1% ~ (p=0.077 n=18+20) JSONDecode-12 60.7ms ± 1% 62.3ms ± 1% +2.73% (p=0.000 n=20+20) Mandelbrot200-12 3.86ms ± 1% 3.85ms ± 1% ~ (p=0.084 n=19+20) GoParse-12 3.75ms ± 2% 3.73ms ± 1% ~ (p=0.107 n=20+19) RegexpMatchEasy0_32-12 100ns ± 2% 101ns ± 2% +0.97% (p=0.001 n=20+19) RegexpMatchEasy0_1K-12 342ns ± 2% 332ns ± 2% -2.86% (p=0.000 n=19+19) RegexpMatchEasy1_32-12 83.2ns ± 2% 82.8ns ± 2% ~ (p=0.108 n=19+20) RegexpMatchEasy1_1K-12 495ns ± 2% 490ns ± 2% -1.04% (p=0.000 n=18+19) RegexpMatchMedium_32-12 130ns ± 2% 131ns ± 2% ~ (p=0.291 n=20+20) RegexpMatchMedium_1K-12 39.3µs ± 1% 39.9µs ± 1% +1.54% (p=0.000 n=18+20) RegexpMatchHard_32-12 2.02µs ± 1% 2.05µs ± 2% +1.19% (p=0.000 n=19+19) 
RegexpMatchHard_1K-12 60.9µs ± 1% 61.5µs ± 1% +0.99% (p=0.000 n=18+18) Revcomp-12 535ms ± 1% 531ms ± 1% -0.82% (p=0.000 n=17+17) Template-12 73.0ms ± 1% 74.1ms ± 1% +1.47% (p=0.000 n=20+20) TimeParse-12 356ns ± 2% 348ns ± 1% -2.30% (p=0.000 n=20+20) TimeFormat-12 347ns ± 1% 353ns ± 1% +1.68% (p=0.000 n=19+20) [Geo mean] 62.3µs 62.4µs +0.12% name old speed new speed delta GobDecode-12 88.9MB/s ± 1% 88.2MB/s ± 1% -0.81% (p=0.001 n=20+19) GobEncode-12 118MB/s ± 1% 117MB/s ± 1% -0.66% (p=0.000 n=20+19) Gzip-12 60.9MB/s ± 1% 60.8MB/s ± 1% ~ (p=0.409 n=17+18) Gunzip-12 461MB/s ± 2% 462MB/s ± 1% ~ (p=0.765 n=20+19) JSONEncode-12 115MB/s ± 1% 115MB/s ± 1% ~ (p=0.078 n=18+20) JSONDecode-12 32.0MB/s ± 1% 31.1MB/s ± 1% -2.65% (p=0.000 n=20+20) GoParse-12 15.5MB/s ± 2% 15.5MB/s ± 1% ~ (p=0.111 n=20+19) RegexpMatchEasy0_32-12 318MB/s ± 2% 314MB/s ± 2% -1.27% (p=0.000 n=20+19) RegexpMatchEasy0_1K-12 2.99GB/s ± 1% 3.08GB/s ± 2% +2.94% (p=0.000 n=19+19) RegexpMatchEasy1_32-12 385MB/s ± 2% 386MB/s ± 2% ~ (p=0.105 n=19+20) RegexpMatchEasy1_1K-12 2.07GB/s ± 1% 2.09GB/s ± 2% +1.06% (p=0.000 n=18+19) RegexpMatchMedium_32-12 7.64MB/s ± 2% 7.61MB/s ± 1% ~ (p=0.179 n=20+20) RegexpMatchMedium_1K-12 26.1MB/s ± 1% 25.7MB/s ± 1% -1.52% (p=0.000 n=18+20) RegexpMatchHard_32-12 15.8MB/s ± 1% 15.6MB/s ± 2% -1.18% (p=0.000 n=19+19) RegexpMatchHard_1K-12 16.8MB/s ± 2% 16.6MB/s ± 1% -0.90% (p=0.000 n=19+18) Revcomp-12 475MB/s ± 1% 479MB/s ± 1% +0.83% (p=0.000 n=17+17) Template-12 26.6MB/s ± 1% 26.2MB/s ± 1% -1.45% (p=0.000 n=20+20) [Geo mean] 99.0MB/s 98.7MB/s -0.32% Change-Id: I6ea44d7a59aaa6851c64695277ab65645ff9d32e Reviewed-on: https://go-review.googlesource.com/16070 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2015-10-30 22:46:39 +00:00
Austin Clements	82d14d77da	runtime: perform concurrent scan in GC workers Currently the concurrent root scan is performed in its entirety by the GC coordinator before entering concurrent mark (which enables GC workers). This scan is done sequentially, which can prolong the scan phase, delay the mark phase, and means that the scan phase does not obey the 25% CPU goal. Furthermore, there's no need to complete the root scan before starting marking (in fact, we already allow GC assists to happen during the scan phase), so this acts as an unnecessary barrier between root scanning and marking. This change shifts the root scan work out of the GC coordinator and into the GC workers. The coordinator simply sets up the scan state and enqueues the right number of root scan jobs. The GC workers then drain the root scan jobs prior to draining heap scan jobs. This parallelizes the root scan process, makes it obey the 25% CPU goal, and effectively eliminates root scanning as an isolated phase, allowing the system to smoothly transition from root scanning to heap marking. This also eliminates a major non-STW responsibility of the GC coordinator, which will make it easier to switch to a decentralized state machine. Finally, it puts us in a good position to perform root scanning in assists as well, which will help satisfy assists at the beginning of the GC cycle. This is mostly straightforward. One tricky aspect is that we have to deal with preemption deadlock: where two non-preemptible goroutines are trying to preempt each other to perform a stack scan. Given the context where this happens, the only instance of this is two background workers trying to scan each other. We avoid this by simply not scanning the stacks of background workers during the concurrent phase; this is safe because we'll scan them during mark termination (and their stacks are very small and should not contain any new pointers). 
This change also switches the root marking during mark termination to use the same gcDrain-based code path as concurrent mark. This shouldn't affect performance because STW root marking was already parallel and tasks switched to heap marking immediately when no more root marking tasks were available. However, it simplifies the code and unifies these code paths. This has negligible effect on the go1 benchmarks. It slightly slows down the garbage benchmark, possibly by making GC run slightly more frequently. name old time/op new time/op delta XBenchGarbage-12 5.10ms ± 1% 5.24ms ± 1% +2.87% (p=0.000 n=18+18) name old time/op new time/op delta BinaryTree17-12 3.25s ± 3% 3.20s ± 5% -1.57% (p=0.013 n=20+20) Fannkuch11-12 2.45s ± 1% 2.46s ± 1% +0.38% (p=0.019 n=20+18) FmtFprintfEmpty-12 49.7ns ± 3% 49.9ns ± 4% ~ (p=0.851 n=19+20) FmtFprintfString-12 170ns ± 2% 170ns ± 1% ~ (p=0.775 n=20+19) FmtFprintfInt-12 161ns ± 1% 160ns ± 1% -0.78% (p=0.000 n=19+18) FmtFprintfIntInt-12 267ns ± 1% 270ns ± 1% +1.04% (p=0.000 n=19+19) FmtFprintfPrefixedInt-12 238ns ± 2% 238ns ± 1% ~ (p=0.133 n=18+19) FmtFprintfFloat-12 311ns ± 1% 310ns ± 2% -0.35% (p=0.023 n=20+19) FmtManyArgs-12 1.08µs ± 1% 1.06µs ± 1% -2.31% (p=0.000 n=20+20) GobDecode-12 8.65ms ± 1% 8.63ms ± 1% ~ (p=0.377 n=18+20) GobEncode-12 6.49ms ± 1% 6.52ms ± 1% +0.37% (p=0.015 n=20+20) Gzip-12 319ms ± 3% 318ms ± 1% ~ (p=0.975 n=19+17) Gunzip-12 41.9ms ± 1% 42.1ms ± 2% +0.65% (p=0.004 n=19+20) HTTPClientServer-12 61.7µs ± 1% 62.6µs ± 1% +1.40% (p=0.000 n=18+20) JSONEncode-12 16.8ms ± 1% 16.9ms ± 1% ~ (p=0.239 n=20+18) JSONDecode-12 58.4ms ± 1% 60.7ms ± 1% +3.85% (p=0.000 n=19+20) Mandelbrot200-12 3.86ms ± 0% 3.86ms ± 1% ~ (p=0.092 n=18+19) GoParse-12 3.75ms ± 2% 3.75ms ± 2% ~ (p=0.708 n=19+20) RegexpMatchEasy0_32-12 100ns ± 1% 100ns ± 2% +0.60% (p=0.010 n=17+20) RegexpMatchEasy0_1K-12 341ns ± 1% 342ns ± 2% ~ (p=0.203 n=20+19) RegexpMatchEasy1_32-12 82.5ns ± 2% 83.2ns ± 2% +0.83% (p=0.007 n=19+19) RegexpMatchEasy1_1K-12 495ns ± 1% 
495ns ± 2% ~ (p=0.970 n=19+18) RegexpMatchMedium_32-12 130ns ± 2% 130ns ± 2% +0.59% (p=0.039 n=19+20) RegexpMatchMedium_1K-12 39.2µs ± 1% 39.3µs ± 1% ~ (p=0.214 n=18+18) RegexpMatchHard_32-12 2.03µs ± 2% 2.02µs ± 1% ~ (p=0.166 n=18+19) RegexpMatchHard_1K-12 61.0µs ± 1% 60.9µs ± 1% ~ (p=0.169 n=20+18) Revcomp-12 533ms ± 1% 535ms ± 1% ~ (p=0.071 n=19+17) Template-12 68.1ms ± 2% 73.0ms ± 1% +7.26% (p=0.000 n=19+20) TimeParse-12 355ns ± 2% 356ns ± 2% ~ (p=0.530 n=19+20) TimeFormat-12 357ns ± 2% 347ns ± 1% -2.59% (p=0.000 n=20+19) [Geo mean] 62.1µs 62.3µs +0.31% name old speed new speed delta GobDecode-12 88.7MB/s ± 1% 88.9MB/s ± 1% ~ (p=0.377 n=18+20) GobEncode-12 118MB/s ± 1% 118MB/s ± 1% -0.37% (p=0.015 n=20+20) Gzip-12 60.9MB/s ± 3% 60.9MB/s ± 1% ~ (p=0.944 n=19+17) Gunzip-12 464MB/s ± 1% 461MB/s ± 2% -0.64% (p=0.004 n=19+20) JSONEncode-12 115MB/s ± 1% 115MB/s ± 1% ~ (p=0.236 n=20+18) JSONDecode-12 33.2MB/s ± 1% 32.0MB/s ± 1% -3.71% (p=0.000 n=19+20) GoParse-12 15.5MB/s ± 2% 15.5MB/s ± 2% ~ (p=0.702 n=19+20) RegexpMatchEasy0_32-12 320MB/s ± 1% 318MB/s ± 2% ~ (p=0.094 n=18+20) RegexpMatchEasy0_1K-12 3.00GB/s ± 1% 2.99GB/s ± 1% ~ (p=0.194 n=20+19) RegexpMatchEasy1_32-12 388MB/s ± 2% 385MB/s ± 2% -0.83% (p=0.008 n=19+19) RegexpMatchEasy1_1K-12 2.07GB/s ± 1% 2.07GB/s ± 1% ~ (p=0.964 n=19+18) RegexpMatchMedium_32-12 7.68MB/s ± 1% 7.64MB/s ± 2% -0.57% (p=0.020 n=19+20) RegexpMatchMedium_1K-12 26.1MB/s ± 1% 26.1MB/s ± 1% ~ (p=0.211 n=18+18) RegexpMatchHard_32-12 15.8MB/s ± 1% 15.8MB/s ± 1% ~ (p=0.180 n=18+19) RegexpMatchHard_1K-12 16.8MB/s ± 1% 16.8MB/s ± 2% ~ (p=0.236 n=20+19) Revcomp-12 477MB/s ± 1% 475MB/s ± 1% ~ (p=0.071 n=19+17) Template-12 28.5MB/s ± 2% 26.6MB/s ± 1% -6.77% (p=0.000 n=19+20) [Geo mean] 100MB/s 99.0MB/s -0.82% Change-Id: I875bf6ceb306d1ee2f470cabf88aa6ede27c47a0 Reviewed-on: https://go-review.googlesource.com/16059 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	
2015-10-30 22:46:31 +00:00
Austin Clements	4cca1cc05e	runtime: consolidate "out of GC work" checks We already have gcMarkWorkAvailable, but the check for GC mark work is open-coded in several places. Generalize gcMarkWorkAvailable slightly and replace these open-coded checks with calls to gcMarkWorkAvailable. In addition to cleaning up the code, this puts us in a better position to make this check slightly more complicated. Change-Id: I1b29883300ecd82a1bf6be193e9b4ee96582a860 Reviewed-on: https://go-review.googlesource.com/16058 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-30 22:46:22 +00:00
Russ Cox	bf1de1b141	runtime: introduce GOTRACEBACK=single, now the default Abandon (but still support) the old numbering system. GOTRACEBACK=none is old 0 GOTRACEBACK=single is the new behavior GOTRACEBACK=all is old 1 GOTRACEBACK=system is old 2 GOTRACEBACK=crash is unchanged See doc comment change in runtime1.go for details. Filed #13107 to decide whether to change default back to GOTRACEBACK=all for Go 1.6 release. If you run into programs where printing only the current goroutine omits needed information, please add details in a comment on that issue. Fixes #12366. Change-Id: I82ca8b99b5d86dceb3f7102d38d2659d45dbe0db Reviewed-on: https://go-review.googlesource.com/16512 Reviewed-by: Austin Clements <austin@google.com>	2015-10-30 18:43:44 +00:00
Michael Hudson-Doyle	c9b8cab16c	cmd/internal/obj, cmd/link, runtime: handle TLS more like a platform linker on ppc64 On ppc64x, the thread pointer, held in R13, points 0x7000 bytes past where thread-local storage begins (presumably to maximize the amount of storage that can be accessed with a 16-bit signed displacement). The relocations used to indicate thread-local storage to the platform linker account for this, so to be able to support external linking we need to change things so the linker applies this offset instead of the runtime assembly. Change-Id: I2556c249ab2d802cae62c44b2b4c5b44787d7059 Reviewed-on: https://go-review.googlesource.com/14233 Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2015-10-29 22:24:29 +00:00
Michael Hudson-Doyle	8537ff8a39	runtime/cgo: export _cgo_reginit on ppc64x This is needed to make external linking work. Change-Id: I4cf7edb4ea318849cab92a697952f8745eed40c4 Reviewed-on: https://go-review.googlesource.com/14237 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-29 00:38:43 +00:00
Ian Lance Taylor	f6fd086d5e	runtime: add missing word in comment Change-Id: Iffe27445e35ec071cf0920a05c81b8a97a3ed712 Reviewed-on: https://go-review.googlesource.com/16431 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-28 23:09:44 +00:00
David Crawshaw	73ff7cb1ed	runtime: c-shared entrypoint for linux/arm64 Change-Id: I7dab124842f5209097a8d5a802fcbdde650654fa Reviewed-on: https://go-review.googlesource.com/16395 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-28 21:21:33 +00:00
Hyang-Ah Hana Kim	dfc8649854	runtime, cmd: TLS setup for android/amd64. The Android linker does not handle TLS for us. Instead, we set up the TLS slot for g ourselves, as darwin/386 and darwin/amd64 do. This is disgusting and fragile. We will eventually fix this ugly hack by taking advantage of the recent TLS IE model implementation. (Instead of referencing a GOT entry, make the code sequence look into the TLS variable that holds the offset.) The TLS slot for g in android/amd64 assumes a fixed offset from %fs. See runtime/cgo/gcc_android_amd64.c for details. For golang/go#10743 Change-Id: I1a3fc207946c665515f79026a56ea19134ede2dd Reviewed-on: https://go-review.googlesource.com/15991 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-10-28 20:54:28 +00:00
Michael Hudson-Doyle	72180c3b82	cmd/internal/obj, cmd/link, runtime: native-ish support for tls on arm64 Fixes #10560 Change-Id: Iedffd9c236c4fbb386c3afc52c5a1457f96ef122 Reviewed-on: https://go-review.googlesource.com/13991 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-10-28 19:51:05 +00:00
David du Colombier	31430bda09	runtime: don't use FP when calling nextSample in the Plan 9 sighandler In the Go signal handler on Plan 9, when a signal with the _SigThrow flag is received, we call startpanic before printing the stack trace. The startpanic function calls systemstack which calls startpanic_m. In the startpanic_m function, we call allocmcache to allocate _g_.m.mcache. The problem is that allocmcache calls nextSample, which does a floating point operation to return a sampling point for heap profiling. However, Plan 9 doesn't support floating point in the signal handler. This change adds a new function nextSampleNoFP, only called when in the Plan 9 signal handler, which is similar to nextSample, but avoids floating point. Change-Id: Iaa30437aa0f7c8c84d40afbab7567ad3bd5ea2de Reviewed-on: https://go-review.googlesource.com/16307 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-28 05:45:24 +00:00
Michael Hudson-Doyle	bc3f14fd2a	runtime: invoke vsyscall helper via TCB when dynamic linking on linux/386 The dynamic linker on linux/386 stores the address of the vsyscall helper at a fixed offset from the %gs register on linux/386 for easy access from PIC code. Change-Id: I635305cfecceef2289985d62e676e16810ed6b94 Reviewed-on: https://go-review.googlesource.com/16346 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-28 01:36:25 +00:00
Matthew Dempsky	4ff231bca1	runtime: eliminate some unnecessary uintptr conversions arena_{start,used,end} are already uintptr, so no need to convert them to uintptr, much less to convert them to unsafe.Pointer and then to uintptr. No binary change to pkg/linux_amd64/runtime.a. Change-Id: Ia4232ed2a724c44fde7eba403c5fe8e6dccaa879 Reviewed-on: https://go-review.googlesource.com/16339 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>	2015-10-27 02:53:04 +00:00
David du Colombier	d093bf489b	runtime: handle abort note on Plan 9 Implement an abort note on Plan 9, as an equivalent of the SIGABRT signal on other operating systems. Updates #11975. Change-Id: I010c9b10f2fbd2471aacd1d073368d975a2f0592 Reviewed-on: https://go-review.googlesource.com/16300 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: David du Colombier <0intro@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-26 22:12:30 +00:00
Matthew Dempsky	d18167fefe	runtime: fix tiny allocator When a new tiny block is allocated because we're allocating an object that won't fit into the current block, mallocgc saves the new block if it has more space leftover than the old block. However, the logic for this was subtly broken in golang.org/cl/2814, resulting in never saving (or consequently reusing) a tiny block. Change-Id: Ib5f6769451fb82877ddeefe75dfe79ed4a04fd40 Reviewed-on: https://go-review.googlesource.com/16330 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-26 21:14:15 +00:00
Austin Clements	d3df04cd8c	runtime: partition data and BSS root marking Currently data and BSS root marking are each a single markroot job. This makes them difficult to load balance, which can draw out mark termination time if they are large. Fix this by splitting both in to 256K chunks. While we're putting in the infrastructure for dynamic roots, we also replace the fixed sharding of the span roots with sharding in to fixed sizes. In addition to helping balance root marking, this also paves the way to parallelizing concurrent scan and to letting assists help with root marking. Updates #10345. This fixes the data and BSS aspects of that bug; it does not partition scanning of large heap objects. This has negligible effect on either the go1 benchmarks or the garbage benchmark: name old time/op new time/op delta XBenchGarbage-12 4.90ms ± 1% 4.91ms ± 2% ~ (p=0.058 n=17+16) name old time/op new time/op delta BinaryTree17-12 3.11s ± 4% 3.12s ± 4% ~ (p=0.512 n=20+20) Fannkuch11-12 2.53s ± 2% 2.47s ± 2% -2.28% (p=0.000 n=20+18) FmtFprintfEmpty-12 49.1ns ± 1% 50.0ns ± 4% +1.68% (p=0.008 n=18+20) FmtFprintfString-12 170ns ± 0% 172ns ± 1% +1.05% (p=0.000 n=14+19) FmtFprintfInt-12 174ns ± 1% 162ns ± 1% -6.81% (p=0.000 n=18+17) FmtFprintfIntInt-12 284ns ± 1% 277ns ± 1% -2.42% (p=0.000 n=20+19) FmtFprintfPrefixedInt-12 252ns ± 1% 244ns ± 1% -2.84% (p=0.000 n=18+20) FmtFprintfFloat-12 317ns ± 0% 311ns ± 0% -1.95% (p=0.000 n=19+18) FmtManyArgs-12 1.08µs ± 1% 1.11µs ± 1% +3.43% (p=0.000 n=18+19) GobDecode-12 8.56ms ± 1% 8.61ms ± 1% +0.50% (p=0.020 n=20+20) GobEncode-12 6.58ms ± 1% 6.57ms ± 1% ~ (p=0.792 n=20+19) Gzip-12 317ms ± 3% 317ms ± 2% ~ (p=0.840 n=19+19) Gunzip-12 41.6ms ± 0% 41.6ms ± 0% +0.07% (p=0.027 n=18+15) HTTPClientServer-12 62.2µs ± 1% 62.3µs ± 1% ~ (p=0.283 n=19+20) JSONEncode-12 16.5ms ± 2% 16.5ms ± 1% ~ (p=0.857 n=20+19) JSONDecode-12 58.5ms ± 1% 61.3ms ± 1% +4.67% (p=0.000 n=18+17) Mandelbrot200-12 3.84ms ± 0% 3.84ms ± 0% ~ (p=0.259 n=17+17) GoParse-12 3.70ms ± 
2% 3.74ms ± 2% +0.96% (p=0.009 n=19+20) RegexpMatchEasy0_32-12 100ns ± 1% 100ns ± 0% +0.31% (p=0.040 n=19+15) RegexpMatchEasy0_1K-12 340ns ± 1% 340ns ± 1% ~ (p=0.411 n=17+19) RegexpMatchEasy1_32-12 82.7ns ± 2% 82.3ns ± 1% ~ (p=0.456 n=20+19) RegexpMatchEasy1_1K-12 498ns ± 2% 495ns ± 0% ~ (p=0.108 n=19+17) RegexpMatchMedium_32-12 130ns ± 1% 130ns ± 2% ~ (p=0.405 n=18+19) RegexpMatchMedium_1K-12 39.4µs ± 2% 39.1µs ± 1% -0.64% (p=0.002 n=20+19) RegexpMatchHard_32-12 2.03µs ± 2% 2.02µs ± 0% ~ (p=0.561 n=20+17) RegexpMatchHard_1K-12 61.1µs ± 2% 60.8µs ± 1% ~ (p=0.615 n=19+18) Revcomp-12 532ms ± 2% 531ms ± 1% ~ (p=0.470 n=19+19) Template-12 68.5ms ± 1% 69.1ms ± 1% +0.87% (p=0.000 n=17+17) TimeParse-12 344ns ± 2% 344ns ± 1% +0.25% (p=0.032 n=19+18) TimeFormat-12 347ns ± 1% 362ns ± 1% +4.27% (p=0.000 n=17+19) [Geo mean] 62.3µs 62.3µs -0.04% name old speed new speed delta GobDecode-12 89.6MB/s ± 1% 89.2MB/s ± 1% -0.50% (p=0.019 n=20+20) GobEncode-12 117MB/s ± 1% 117MB/s ± 1% ~ (p=0.797 n=20+19) Gzip-12 61.3MB/s ± 3% 61.2MB/s ± 2% ~ (p=0.834 n=19+19) Gunzip-12 467MB/s ± 0% 466MB/s ± 0% -0.07% (p=0.027 n=18+15) JSONEncode-12 117MB/s ± 2% 117MB/s ± 1% ~ (p=0.851 n=20+19) JSONDecode-12 33.2MB/s ± 1% 31.7MB/s ± 1% -4.47% (p=0.000 n=18+17) GoParse-12 15.6MB/s ± 2% 15.5MB/s ± 2% -0.95% (p=0.008 n=19+20) RegexpMatchEasy0_32-12 321MB/s ± 2% 320MB/s ± 1% -0.57% (p=0.002 n=17+17) RegexpMatchEasy0_1K-12 3.01GB/s ± 1% 3.01GB/s ± 1% ~ (p=0.132 n=17+18) RegexpMatchEasy1_32-12 387MB/s ± 2% 389MB/s ± 1% ~ (p=0.423 n=20+19) RegexpMatchEasy1_1K-12 2.05GB/s ± 2% 2.06GB/s ± 0% ~ (p=0.129 n=19+17) RegexpMatchMedium_32-12 7.64MB/s ± 1% 7.66MB/s ± 1% ~ (p=0.258 n=18+19) RegexpMatchMedium_1K-12 26.0MB/s ± 2% 26.2MB/s ± 1% +0.64% (p=0.002 n=20+19) RegexpMatchHard_32-12 15.7MB/s ± 2% 15.8MB/s ± 1% ~ (p=0.510 n=20+17) RegexpMatchHard_1K-12 16.8MB/s ± 2% 16.8MB/s ± 1% ~ (p=0.603 n=19+18) Revcomp-12 477MB/s ± 2% 479MB/s ± 1% ~ (p=0.470 n=19+19) Template-12 28.3MB/s ± 1% 28.1MB/s ± 1% -0.85% (p=0.000 
n=17+17) [Geo mean] 100MB/s 100MB/s -0.26% Change-Id: Ib0bfe0145675ce88c5a8791752f7486ac98805b4 Reviewed-on: https://go-review.googlesource.com/16043 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-26 15:42:44 +00:00
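The chunking described in d3df04cd8c can be illustrated with a small sketch. The 256K chunk size comes from the commit message; the names `rootBlockBytes`, `rootBlocks`, and `rootBlockRange` are chosen for this example and are not claimed to match the runtime source. It shows how a segment of arbitrary size maps to a number of equally sized jobs, with a shorter final chunk.

```go
package main

import "fmt"

// rootBlockBytes is the chunk size named in the commit message (256K).
const rootBlockBytes = 256 << 10

// rootBlocks returns how many chunk-sized jobs cover a segment of n bytes.
func rootBlocks(n uintptr) int {
	return int((n + rootBlockBytes - 1) / rootBlockBytes)
}

// rootBlockRange returns the half-open byte range [lo, hi) covered by
// chunk i of a segment of n bytes. The last chunk may be short.
func rootBlockRange(n uintptr, i int) (lo, hi uintptr) {
	lo = uintptr(i) * rootBlockBytes
	hi = lo + rootBlockBytes
	if hi > n {
		hi = n
	}
	return lo, hi
}

func main() {
	const segment = 600 << 10 // a hypothetical 600K data segment
	for i := 0; i < rootBlocks(segment); i++ {
		lo, hi := rootBlockRange(segment, i)
		fmt.Printf("job %d scans bytes [%d, %d)\n", i, lo, hi)
	}
}
```

Because each job is bounded in size, workers can pick up jobs independently, which is what makes the load easier to balance than one job per segment.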
David Crawshaw	21f35b33c2	runtime: use a 64kb system stack on arm I went looking for an arm system whose stacks are by default smaller than 64KB. In fact the smallest common linux target I could find was Android, which like iOS uses 1MB stacks. Fixes #11873 Change-Id: Ieeb66ad095b3da18d47ba21360ea75152a4107c6 Reviewed-on: https://go-review.googlesource.com/14602 Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com> Reviewed-by: Minux Ma <minux@golang.org>	2015-10-26 15:10:34 +00:00
Caleb Spare	fb7178e7cc	runtime: copy sqrt normalization bugfix from math This copies the change from CL 16158 (applied as `22d4c8bf13`). Updates #13013 Change-Id: Id7d02e63d92806f06a4e064a91b2fb6574fe385f Reviewed-on: https://go-review.googlesource.com/16291 Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-23 23:43:47 +00:00
Matthew Dempsky	8ee0fd8623	runtime: replace is{plan9,solaris,windows} with GOOS tests Change-Id: I27589395f547c5837dc7536a0ab5bc7cc23a4ff6 Reviewed-on: https://go-review.googlesource.com/10872 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-23 18:11:17 +00:00
Alex Brainman	6410e67a1e	runtime: account for cpu affinity in windows NumCPU Fixes #11671 Change-Id: Ide1f8d92637dad2a2faed391329f9b6001789b76 Reviewed-on: https://go-review.googlesource.com/14742 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Alex Brainman <alex.brainman@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-23 07:54:42 +00:00
Austin Clements	beedb1ec33	runtime: add pcvalue cache to improve stack scan speed The cost of scanning large stacks is currently dominated by the time spent looking up and decoding the pcvalue table. However, large stacks are usually large not because they contain calls to many different functions, but because they contain many calls to the same, small set of recursive functions. Hence, walking large stacks tends to make the same pcvalue queries many times. Based on this observation, this commit adds a small, very simple, and fast cache in front of pcvalue lookup. We thread this cache down from operations that make many pcvalue calls, such as gentraceback, stack scanning, and stack adjusting. This simple cache works well because it has minimal overhead when it's not effective. I also tried a hashed direct-map cache, CLOCK-based replacement, round-robin replacement, and round-robin with lookups disabled until there had been at least 16 probes, but none of these approaches had obvious wins over the random replacement policy in this commit. 
This nearly doubles the overall performance of the deep stack test program from issue #10898: name old time/op new time/op delta Issue10898 16.5s ±12% 9.2s ±12% -44.37% (p=0.008 n=5+5) It's a very slight win on the garbage benchmark: name old time/op new time/op delta XBenchGarbage-12 4.92ms ± 1% 4.89ms ± 1% -0.75% (p=0.000 n=18+19) It's a wash (but doesn't harm performance) on the go1 benchmarks, which don't have particularly deep stacks: name old time/op new time/op delta BinaryTree17-12 3.11s ± 2% 3.20s ± 3% +2.83% (p=0.000 n=17+20) Fannkuch11-12 2.51s ± 1% 2.51s ± 1% -0.22% (p=0.034 n=19+18) FmtFprintfEmpty-12 50.8ns ± 3% 50.6ns ± 2% ~ (p=0.793 n=20+20) FmtFprintfString-12 174ns ± 0% 174ns ± 1% +0.17% (p=0.048 n=15+20) FmtFprintfInt-12 177ns ± 0% 165ns ± 1% -6.99% (p=0.000 n=17+19) FmtFprintfIntInt-12 283ns ± 1% 284ns ± 0% +0.22% (p=0.000 n=18+15) FmtFprintfPrefixedInt-12 243ns ± 1% 244ns ± 1% +0.40% (p=0.000 n=20+19) FmtFprintfFloat-12 318ns ± 0% 319ns ± 0% +0.27% (p=0.001 n=19+20) FmtManyArgs-12 1.12µs ± 0% 1.14µs ± 0% +1.74% (p=0.000 n=19+20) GobDecode-12 8.69ms ± 0% 8.73ms ± 1% +0.46% (p=0.000 n=18+18) GobEncode-12 6.64ms ± 1% 6.61ms ± 1% -0.46% (p=0.000 n=20+20) Gzip-12 323ms ± 2% 319ms ± 1% -1.11% (p=0.000 n=20+20) Gunzip-12 42.8ms ± 0% 42.9ms ± 0% ~ (p=0.158 n=18+20) HTTPClientServer-12 63.3µs ± 1% 63.1µs ± 1% -0.35% (p=0.011 n=20+20) JSONEncode-12 16.9ms ± 1% 17.3ms ± 1% +2.84% (p=0.000 n=19+20) JSONDecode-12 59.7ms ± 0% 58.5ms ± 0% -2.05% (p=0.000 n=19+17) Mandelbrot200-12 3.92ms ± 0% 3.91ms ± 0% -0.16% (p=0.003 n=19+19) GoParse-12 3.79ms ± 2% 3.75ms ± 2% -0.91% (p=0.005 n=20+20) RegexpMatchEasy0_32-12 102ns ± 1% 101ns ± 1% -0.80% (p=0.001 n=14+20) RegexpMatchEasy0_1K-12 337ns ± 1% 346ns ± 1% +2.90% (p=0.000 n=20+19) RegexpMatchEasy1_32-12 84.4ns ± 2% 84.3ns ± 2% ~ (p=0.743 n=20+20) RegexpMatchEasy1_1K-12 502ns ± 1% 505ns ± 0% +0.64% (p=0.000 n=20+20) RegexpMatchMedium_32-12 133ns ± 1% 132ns ± 1% -0.85% (p=0.000 n=20+19) RegexpMatchMedium_1K-12 40.1µs 
± 1% 39.8µs ± 1% -0.77% (p=0.000 n=18+18) RegexpMatchHard_32-12 2.08µs ± 1% 2.07µs ± 1% -0.55% (p=0.001 n=18+19) RegexpMatchHard_1K-12 62.4µs ± 1% 62.0µs ± 1% -0.74% (p=0.000 n=19+19) Revcomp-12 545ms ± 2% 545ms ± 3% ~ (p=0.771 n=19+20) Template-12 73.7ms ± 1% 72.0ms ± 0% -2.33% (p=0.000 n=20+18) TimeParse-12 358ns ± 1% 351ns ± 1% -2.07% (p=0.000 n=20+20) TimeFormat-12 369ns ± 1% 356ns ± 0% -3.53% (p=0.000 n=20+18) [Geo mean] 63.5µs 63.2µs -0.41% name old speed new speed delta GobDecode-12 88.3MB/s ± 0% 87.9MB/s ± 0% -0.43% (p=0.000 n=18+17) GobEncode-12 116MB/s ± 1% 116MB/s ± 1% +0.47% (p=0.000 n=20+20) Gzip-12 60.2MB/s ± 2% 60.8MB/s ± 1% +1.13% (p=0.000 n=20+20) Gunzip-12 453MB/s ± 0% 453MB/s ± 0% ~ (p=0.160 n=18+20) JSONEncode-12 115MB/s ± 1% 112MB/s ± 1% -2.76% (p=0.000 n=19+20) JSONDecode-12 32.5MB/s ± 0% 33.2MB/s ± 0% +2.09% (p=0.000 n=19+17) GoParse-12 15.3MB/s ± 2% 15.4MB/s ± 2% +0.92% (p=0.004 n=20+20) RegexpMatchEasy0_32-12 311MB/s ± 1% 314MB/s ± 1% +0.78% (p=0.000 n=15+19) RegexpMatchEasy0_1K-12 3.04GB/s ± 1% 2.95GB/s ± 1% -2.90% (p=0.000 n=19+19) RegexpMatchEasy1_32-12 379MB/s ± 2% 380MB/s ± 2% ~ (p=0.779 n=20+20) RegexpMatchEasy1_1K-12 2.04GB/s ± 1% 2.02GB/s ± 0% -0.62% (p=0.000 n=20+20) RegexpMatchMedium_32-12 7.46MB/s ± 1% 7.53MB/s ± 1% +0.86% (p=0.000 n=20+19) RegexpMatchMedium_1K-12 25.5MB/s ± 1% 25.7MB/s ± 1% +0.78% (p=0.000 n=18+18) RegexpMatchHard_32-12 15.4MB/s ± 1% 15.5MB/s ± 1% +0.62% (p=0.000 n=19+19) RegexpMatchHard_1K-12 16.4MB/s ± 1% 16.5MB/s ± 1% +0.82% (p=0.000 n=20+19) Revcomp-12 466MB/s ± 2% 466MB/s ± 3% ~ (p=0.765 n=19+20) Template-12 26.3MB/s ± 1% 27.0MB/s ± 0% +2.38% (p=0.000 n=20+18) [Geo mean] 97.8MB/s 98.0MB/s +0.23% Change-Id: I281044ae0b24990ba46487cacbc1069493274bc4 Reviewed-on: https://go-review.googlesource.com/13614 Reviewed-by: Keith Randall <khr@golang.org>	2015-10-22 17:48:13 +00:00
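The kind of cache described in beedb1ec33 can be sketched roughly as below. Everything here is illustrative: the type names, the 16-entry size, the key fields, and the `slow` callback standing in for the real table decode are assumptions, not the runtime's definitions. The sketch shows a small linearly scanned cache with random replacement, where a nil cache simply falls through to the slow path.

```go
package main

import (
	"fmt"
	"math/rand"
)

// cacheKey is a hypothetical lookup key: a target PC and a table offset.
type cacheKey struct {
	targetPC uintptr
	off      int32
}

type cacheEnt struct {
	key   cacheKey
	val   int32
	valid bool
}

// pcCache is a small fixed-size cache. The size 16 is an assumption.
type pcCache struct {
	entries      [16]cacheEnt
	hits, misses int
}

// lookup returns the value for k, consulting the cache first. A nil
// cache is allowed and always takes the slow path.
func (c *pcCache) lookup(k cacheKey, slow func(cacheKey) int32) int32 {
	if c != nil {
		for i := range c.entries {
			e := &c.entries[i]
			if e.valid && e.key == k {
				c.hits++
				return e.val
			}
		}
	}
	v := slow(k)
	if c != nil {
		c.misses++
		// Random replacement: overwrite an arbitrary slot.
		c.entries[rand.Intn(len(c.entries))] = cacheEnt{key: k, val: v, valid: true}
	}
	return v
}

func main() {
	decodes := 0
	slow := func(k cacheKey) int32 { decodes++; return int32(k.targetPC) + k.off }
	c := &pcCache{}
	// A deep recursive stack repeats the same few queries many times.
	for frame := 0; frame < 1000; frame++ {
		c.lookup(cacheKey{targetPC: uintptr(frame % 2), off: 8}, slow)
	}
	fmt.Println("slow decodes:", decodes, "hits:", c.hits, "misses:", c.misses)
}
```

Random replacement needs no per-entry bookkeeping, which fits the message's point that the cache should cost almost nothing when it is not effective.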
Matthew Dempsky	1652a2c316	runtime: add mSpanList type to represent lists of mspans This CL introduces a new mSpanList type to replace the empty mspan variables that were previously used as list heads. To be type safe, the previous circular linked list data structure is now a tail queue instead. One complication of this is mSpanList_Remove needs to know the list a span is being removed from, but this appears to be computable in all circumstances. As a temporary sanity check, mSpanList_Insert and mSpanList_InsertBack record the list that an mspan has been inserted into so that mSpanList_Remove can verify that the correct list was specified. Whereas mspan is 112 bytes on amd64, mSpanList is only 16 bytes. This shrinks the size of mheap from 50216 bytes to 12584 bytes. Change-Id: I8146364753dbc3b4ab120afbb9c7b8740653c216 Reviewed-on: https://go-review.googlesource.com/15906 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Austin Clements <austin@google.com>	2015-10-22 17:12:06 +00:00
Aaron Jacobs	151f4ec95d	runtime: remove unused printpc and printbyte functions Change-Id: I40e338f6b445ca72055fc9bac0f09f0dca904e3a Reviewed-on: https://go-review.googlesource.com/16191 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-22 15:02:44 +00:00
Matthew Dempsky	5a68eb9f25	runtime: prune some dead variables Change-Id: I7a1c3079b433c4e30d72fb7d59f9594e0d5efe47 Reviewed-on: https://go-review.googlesource.com/16178 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Andrew Gerrand <adg@golang.org>	2015-10-22 03:56:19 +00:00
Matthew Dempsky	29330c118d	runtime: change fixalloc's chunk field to unsafe.Pointer It's never used as a *byte anyway, so might as well just make it an unsafe.Pointer instead. Change-Id: I68ee418781ab2fc574eeac0498f2515b5561b7a8 Reviewed-on: https://go-review.googlesource.com/16175 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-22 01:14:23 +00:00
Shenghou Ma	1948aef6e3	runtime: fix typos Change-Id: Iffc25fc80452baf090bf8ef15ab798cfaa120b8e Reviewed-on: https://go-review.googlesource.com/16154 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-22 00:40:48 +00:00
Matthew Dempsky	58e3ae2fae	runtime: split plan9 and solaris's m fields into new embedded mOS type Reduces the size of m by ~8% on linux/amd64 (1040 bytes -> 960 bytes). There are also windows-specific fields, but they're currently referenced in OS-independent source files (but only when GOOS=="windows"). Change-Id: I13e1471ff585ccced1271f74209f8ed6df14c202 Reviewed-on: https://go-review.googlesource.com/16173 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-22 00:04:52 +00:00
Matthew Dempsky	7df8ba136c	runtime: replace unsafe pointer arithmetic with array indexing Change-Id: I313819abebd4cda4a6c30fd0fd6f44cb1d09161f Reviewed-on: https://go-review.googlesource.com/16167 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-21 23:22:20 +00:00
Matthew Dempsky	84afa1be76	runtime: make iface/eface handling more type safe Change compiler-invoked interface functions to directly take iface/eface parameters instead of fInterface/interface{} to avoid needing to always convert. For the handful of functions that legitimately need to take an interface{} parameter, add efaceOf to type-safely convert interface{} to eface. Change-Id: I8928761a12fd3c771394f36adf93d3006a9fcf39 Reviewed-on: https://go-review.googlesource.com/16166 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-21 23:08:22 +00:00
Ian Lance Taylor	73f329f472	runtime, syscall: add calls to msan functions Add explicit memory sanitizer instrumentation to the runtime and syscall packages. The compiler does not instrument the runtime package. It does instrument the syscall package, but we need to add a couple of cases that it can't see. Change-Id: I2d66073f713fe67e33a6720460d2bb8f72f31394 Reviewed-on: https://go-review.googlesource.com/16164 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-10-21 19:17:46 +00:00
Matthew Dempsky	c279250946	runtime: change functype's in and out fields to []_type Allows removing a few gratuitous unsafe.Pointer conversions and parallels the type of reflect.funcType's in and out fields ([]rtype). Change-Id: Ie5ca230a94407301a854dfd8782a3180d5054bc4 Reviewed-on: https://go-review.googlesource.com/16163 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-21 18:37:45 +00:00
Ian Lance Taylor	5174df9087	runtime, runtime/msan: add msan runtime support These are the runtime support functions for letting Go code interoperate with the C/C++ memory sanitizer. Calls to msanread/msanwrite are now inserted by the compiler with the -msan option. Calls to msanmalloc/msanfree will be from other runtime functions in a subsequent CL. Change-Id: I64fb061b38cc6519153face242eccd291c07d1f2 Reviewed-on: https://go-review.googlesource.com/16162 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-21 17:50:39 +00:00
Austin Clements	a42f668654	runtime: eliminate unused _GCstw phase Change-Id: Ie94cd17e1975fdaaa418fa6a7b2d3b164fedc135 Reviewed-on: https://go-review.googlesource.com/16057 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-21 16:26:34 +00:00
Austin Clements	28f458ce5b	runtime: eliminate unnecessary ragged barrier The ragged barrier after entering the concurrent mark phase is vestigial. This used to be the point where we enabled write barriers, so it was necessary to synchronize all Ps to ensure write barriers were enabled before any marking occurred. However, we've long since switched to enabling write barriers during the concurrent scan phase, so the start-the-world at the beginning of the concurrent scan phase ensures that all Ps have enabled the write barrier. Hence, we can eliminate the old "install write barrier" phase. Fixes #11971. Change-Id: I8cdcb84b5525cef19927d51ea11ba0a4db991ea8 Reviewed-on: https://go-review.googlesource.com/16044 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-21 16:26:25 +00:00
Matthew Dempsky	d4a7ea1b71	runtime: add stringStructOf helper function Instead of open-coding conversions from string to unsafe.Pointer then to stringStruct, add a helper function to add some type safety. Bonus: This caught two *string values being converted to stringStruct in heapdump.go. While here, get rid of the redundant _string type, but add in a stringStructDWARF type used for generating DWARF debug info. Change-Id: I8882f8cca66ac45190270f82019a5d85db023bd2 Reviewed-on: https://go-review.googlesource.com/16131 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-20 23:13:27 +00:00
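A helper in the spirit of d4a7ea1b71 can be sketched as below. The `stringStruct` layout here (data pointer followed by length) is an assumption about how the gc toolchain represents strings, and the code is an illustration rather than a copy of the runtime source. The point it shows is that taking a `*string` parameter lets the compiler reject a value of the wrong type, which a bare `unsafe.Pointer` conversion would accept silently.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringStruct mirrors the assumed two-word layout of a string header.
type stringStruct struct {
	str unsafe.Pointer
	len int
}

// stringStructOf converts a *string to a *stringStruct. Because the
// parameter is typed, passing a **string or a *[]byte fails to compile.
func stringStructOf(sp *string) *stringStruct {
	return (*stringStruct)(unsafe.Pointer(sp))
}

func main() {
	s := "hello"
	ss := stringStructOf(&s)
	fmt.Println("length seen through the header:", ss.len)
}
```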
Aaron Jacobs	ef986fa3fc	runtime: change odd 'print1_write' file names The '1' part is left over from the C conversion, but no longer makes sense given that print1.go no longer exists. Change-Id: Iec171251370d740f234afdbd6fb1a4009fde6696 Reviewed-on: https://go-review.googlesource.com/16036 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-20 23:03:06 +00:00
Hyang-Ah Hana Kim	30ee5919bd	runtime: add syscalls needed for android/amd64 logging. access, connect, socket. In Android-L, logging is done by writing the log messages to the logd process through a unix domain socket. Also, changed the arg types of those syscall stubs to match linux programming APIs. For golang/go#10743 Change-Id: I66368a03316e253561e9e76aadd180c2cd2e48f3 Reviewed-on: https://go-review.googlesource.com/15993 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-10-20 16:56:58 +00:00
Aaron Jacobs	3bc0601742	runtime: rename _func.frame to make it clear it's deprecated and unused. When I saw that it was labelled "legacy", I went looking for users of it to see how it was still used. But there aren't any. Save the next person the trouble. Change-Id: I921dd6c57b60331c9816542272555153ac133c02 Reviewed-on: https://go-review.googlesource.com/16035 Reviewed-by: Dave Cheney <dave@cheney.net> Run-TryBot: Dave Cheney <dave@cheney.net> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2015-10-20 03:16:09 +00:00
Michael Hudson-Doyle	c5856cfdb6	runtime: tweaks to allow -buildmode=shared to work Building Go shared libraries requires that all functions that have declarations without bodies have implementations and vice versa, so remove the implementation of call16 and add a stub implementation of sigreturn. Change-Id: I4d5a30c8637a5da7991054e151a536611d5bea46 Reviewed-on: https://go-review.googlesource.com/15966 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-19 21:23:36 +00:00
Austin Clements	3cd56b4dca	runtime: combine gcResetGState and gcResetMarkState These functions are always called together and perform logically related state resets, so combine them into just gcResetMarkState. Fixes #11427. Change-Id: I06c17ef65f66186494887a767b3993126955b5fe Reviewed-on: https://go-review.googlesource.com/16041 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-19 18:38:07 +00:00
Austin Clements	b0d5e5c500	runtime: consolidate gcResetGState calls Currently gcResetGState is called by func gcscan_m for concurrent GC and directly by func gc for STW GC. Simplify this by consolidating these two calls into one call by func gc above where it splits for concurrent and STW GC. As a consequence, gcResetGState and gcResetMarkState are always called together, so the next commit will consolidate these. Change-Id: Ib62d404c7b32b28f7d3080d26ecf3966cbc4aca0 Reviewed-on: https://go-review.googlesource.com/16040 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-19 18:38:00 +00:00
Austin Clements	feb92a8e8c	runtime: remove work.partial queue This work queue is no longer used (there are many reads of work.partial, but the only write is in putpartial, which is never called). Fixes #11922. Change-Id: I08b76c0c02a0867a9cdcb94783e1f7629d44249a Reviewed-on: https://go-review.googlesource.com/15892 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-19 18:37:54 +00:00
Aaron Jacobs	5d88323fa6	runtime: remove a redundant nil pointer check. It appears this was made possible by commit 89f185f; before that, g was not dereferenced above. Change-Id: I70bc571d924b36351392fd4c13d681e938cfb573 Reviewed-on: https://go-review.googlesource.com/16033 Reviewed-by: Andrew Gerrand <adg@golang.org>	2015-10-19 09:58:15 +00:00
Nodir Turakulov	386fa03609	runtime: merge proc1.go -> proc.go from proc1.go to proc.go: * prepend header comment explaining "Goroutine scheduler" * insert m0 and g0 var defs after the comment * append the rest Updates #12952 Change-Id: I35ee9ae3287675cde0c1b6aeaca0a460393f2354 Reviewed-on: https://go-review.googlesource.com/16024 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2015-10-19 01:11:00 +00:00
Nodir Turakulov	243757576d	runtime: merge race1.go -> race.go * append contents of race1.go to race.go * delete "Implementation of the race detector API." comment from race1.go Updates #12952 Change-Id: Ibdd9c4dc79a63c3bef69eade9525578063c86c1c Reviewed-on: https://go-review.googlesource.com/16023 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2015-10-18 23:48:22 +00:00
Michael Hudson-Doyle	6deb3c0619	runtime, runtime/cgo: conform to PIC register use rules in ppc64 asm PIC code on ppc64le uses R2 as a TOC pointer and when calling a function through a function pointer must ensure the function pointer is in R12. These rules are easy enough to follow unconditionally in our assembly, so do that. Change-Id: Icfc4e47ae5dfbe15f581cbdd785cdeed6e40bc32 Reviewed-on: https://go-review.googlesource.com/15526 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-18 23:36:39 +00:00
Michael Hudson-Doyle	b8f8969fbd	reflect, runtime, runtime/cgo: use ppc64 asm constant for fixed frame size Shared libraries on ppc64le will require a larger minimum stack frame (because the ABI mandates that the TOC pointer is available at 24(R1)). Part 3 of that is using a #define in the ppc64 assembly to refer to the size of the fixed part of the stack (finding all these took me about a week!). Change-Id: I50f22fe1c47af1ec59da1bd7ea8f84a4750df9b7 Reviewed-on: https://go-review.googlesource.com/15525 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-18 23:15:26 +00:00
Michael Hudson-Doyle	a4855812e2	runtime: add a constant for the smallest possible stack frame Shared libraries on ppc64le will require a larger minimum stack frame (because the ABI mandates that the TOC pointer is available at 24(R1)). So to prepare for this, make a constant for the fixed part of a stack and use that where necessary. Change-Id: I447949f4d725003bb82e7d2cf7991c1bca5aa887 Reviewed-on: https://go-review.googlesource.com/15523 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-18 22:14:00 +00:00
Michael Hudson-Doyle	45c06b27a4	cmd/internal/obj, runtime: add NOFRAME flag to suppress stack frame set up on ppc64x Replace the confusing game where a frame size of $-8 would suppress the implicit setting up of a stack frame with a nice explicit flag. The code to set up the function prologue is still a little confusing but better than it was. Change-Id: I1d49278ff42c6bc734ebfb079998b32bc53f8d9a Reviewed-on: https://go-review.googlesource.com/15670 Reviewed-by: Minux Ma <minux@golang.org>	2015-10-18 22:13:30 +00:00
Nodir Turakulov	db2e73faeb	runtime: merge stack{1,2}.go -> stack.go * rename stack1.go -> stack.go * prepend contents of stack2.go to stack.go Updates #12952 Change-Id: I60d409af37162a5a7596c678dfebc2cea89564ff Reviewed-on: https://go-review.googlesource.com/16008 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-17 20:52:22 +00:00
Matthew Dempsky	4562784bae	runtime: remove some unnecessary unsafe code in mfixalloc Change-Id: Ie9ea4af4315a4d0eb69d0569726bb3eca2b397af Reviewed-on: https://go-review.googlesource.com/16005 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2015-10-17 00:26:26 +00:00
Nodir Turakulov	9358f7fa61	runtime: merge panic1.go into panic.go A TODO to merge is removed from panic1.go. The rest is appended to panic.go Updates #12952 Change-Id: Ied4382a455abc20bc2938e34d031802e6b4baf8b Reviewed-on: https://go-review.googlesource.com/15905 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>	2015-10-16 15:51:49 +00:00
Nodir Turakulov	d72d299f3e	runtime: rename print1.go -> print.go It seems that it was called print1.go mistakenly: print.go was deleted in the same commit: https://go.googlesource.com/go/+/597b266eafe7d63e9be8da1c1b4813bd2998a11c Updates #12952 Change-Id: I371e59d6cebc8824857df3f3ee89101147dfffc0 Reviewed-on: https://go-review.googlesource.com/15950 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>	2015-10-16 15:51:30 +00:00
Nodir Turakulov	881b0e7880	runtime: merge string1.go into string.go string1.go contents are appended to string.go as is Updates #12952 Change-Id: I30083ba7fdd362d4421e964a494c76ca865bedc2 Reviewed-on: https://go-review.googlesource.com/15951 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-16 15:46:02 +00:00
Michael Hudson-Doyle	42c7929c04	runtime, runtime/debug: access unexported runtime functions with //go:linkname, not assembly stubs Change-Id: I88f80f5914d6e4c179f3d28aa59fc29b7ef0cc66 Reviewed-on: https://go-review.googlesource.com/15960 Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-16 09:14:25 +00:00
Michael Hudson-Doyle	0b8d583320	runtime, os/signal: use //go:linkname instead of assembly stubs to get access to runtime functions os/signal depends on a few unexported runtime functions. This removes the assembly stubs it used to get access to these in favour of using //go:linkname in runtime to make the functions accessible to os/signal. This is motivated by ppc64le shared libraries, where you cannot BR to a symbol defined in a shared library (only BL), but it seems like an improvement anyway. Change-Id: I09361203ce38070bd3f132f6dc5ac212f2dc6f58 Reviewed-on: https://go-review.googlesource.com/15871 Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Minux Ma <minux@golang.org> Reviewed-by: Dave Cheney <dave@cheney.net>	2015-10-16 07:11:04 +00:00
Matthew Dempsky	4c2465d47d	runtime: use unsafe.Pointer(x) instead of (unsafe.Pointer)(x) This isn't C anymore. No binary change to pkg/linux_amd64/runtime.a. Change-Id: I24d66b0f5ac888f432b874aac684b1395e7c8345 Reviewed-on: https://go-review.googlesource.com/15903 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2015-10-15 21:48:37 +00:00
Raul Silvera	1d765b77a0	runtime: Reduce testing for fastlog2 implementation The current fastlog2 testing checks all 64M values in the domain of interest, which is too much for platforms with no native floating point. Reduce testing under testing.Short() to speed up builds for those platforms. Related to #12620 Change-Id: Ie5dcd408724ba91c3b3fcf9ba0dddedb34706cd1 Reviewed-on: https://go-review.googlesource.com/15830 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Joel Sing <jsing@google.com> Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-14 04:54:33 +00:00
Ian Lance Taylor	2961cab965	runtime: remove _Kind constants The duplication of _Kind and kind constants is a legacy of the conversion from C. Change-Id: I368b35a41f215cf91ac4b09dac59699edb414a0e Reviewed-on: https://go-review.googlesource.com/15800 Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2015-10-13 00:15:36 +00:00
Austin Clements	65aa2da617	runtime: assist before allocating Currently, when the mutator allocates, the runtime first allocates the memory and then, if that G has done "enough" allocation, the runtime checks whether the G has assist debt to pay off and, if so, pays it off. This approach leads to under-assisting, where a G can allocate a large region (or many small regions) before paying for it, or can even exit with outstanding debt. This commit flips this around so that a G always acquires enough credit for an allocation before it can perform that allocation. We continue to amortize the cost of assists by requiring that they over-assist when triggered to build up credit for many allocations. Fixes #11967. Change-Id: Idac9f11133b328535667674d837be72c23ebd899 Reviewed-on: https://go-review.googlesource.com/15409 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2015-10-09 19:39:03 +00:00
Austin Clements	89c341c5e9	runtime: directly track GC assist balance Currently we track the per-G GC assist balance as two monotonically increasing values: the bytes allocated by the G this cycle (gcalloc) and the scan work performed by the G this cycle (gcscanwork). The assist balance is hence assistRatio*gcalloc - gcscanwork. This works, but has two important downsides: 1) It requires floating-point math to figure out if a G is in debt or not. This makes it inappropriate to check for assist debt in the hot path of mallocgc, so we only do this when a G allocates a new span. As a result, Gs can operate "in the red", leading to under-assist and extended GC cycle length. 2) Revising the assist ratio during a GC cycle can lead to an "assist burst". If you think of plotting the scan work performed versus heap size, the assist ratio controls the slope of this line. However, in the current system, the target line always passes through 0 at the heap size that triggered GC, so if the runtime increases the assist ratio, there has to be a potentially large assist to jump from the current amount of scan work up to the new target scan work for the current heap size. This commit replaces this approach with directly tracking the GC assist balance in terms of allocation credit bytes. Allocating N bytes simply decreases this by N and assisting raises it by the amount of scan work performed divided by the assist ratio (to get back to bytes). This will make it cheap to figure out if a G is in debt, which will let us efficiently check if an assist is necessary *before* performing an allocation and hence keep Gs "in the black". This also fixes assist bursts because the assist ratio is now in terms of remaining work, rather than work from the beginning of the GC cycle. Hence, the plot of scan work versus heap size becomes continuous: we can revise the slope, but this slope always starts from where we are right now, rather than where we were at the beginning of the cycle.
Change-Id: Ia821c5f07f8a433e8da7f195b52adfedd58bdf2c Reviewed-on: https://go-review.googlesource.com/15408 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-09 19:38:52 +00:00
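The bookkeeping described in 89c341c5e9 can be sketched as follows. The type `gState` and the methods `charge`, `credit`, and `inDebt` are hypothetical names invented for this example; only the arithmetic (allocation subtracts bytes, scan work divided by the assist ratio adds bytes) is taken from the commit message. The sketch shows why the debt check becomes a single integer comparison.

```go
package main

import "fmt"

// gState is a hypothetical stand-in for the per-goroutine state.
type gState struct {
	assistBytes int64 // allocation credit in bytes; negative means debt
}

// charge records an allocation of n bytes against the balance.
func (g *gState) charge(n int64) { g.assistBytes -= n }

// credit converts scan work into byte credit. assistRatio is taken to
// be scan work per allocated byte, so dividing by it yields bytes.
func (g *gState) credit(scanWork int64, assistRatio float64) {
	g.assistBytes += int64(float64(scanWork) / assistRatio)
}

// inDebt is the cheap check: a single integer comparison.
func (g *gState) inDebt() bool { return g.assistBytes < 0 }

func main() {
	g := &gState{}
	g.charge(100)
	fmt.Println("in debt after allocating:", g.inDebt())
	g.credit(200, 2.0) // the ratio may be revised between calls
	fmt.Println("in debt after assisting:", g.inDebt())
}
```

Because the ratio is applied only when new scan work is credited, revising it changes the exchange rate for future work without reinterpreting work already done, which matches the message's explanation of why assist bursts go away.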
Austin Clements	9e77c89868	runtime: ensure minimum heap distance via heap goal Currently we ensure a minimum heap distance of 1MB when computing the assist ratio. Rather than enforcing this minimum on the heap distance, it makes more sense to enforce that the heap goal itself is at least 1MB over the live heap size at the beginning of GC. Currently the two approaches are semantically equivalent, but this will let us switch to basing the assist ratio on current heap distance rather than the initial heap distance, since we can't enforce this minimum on the current heap distance (the GC may never finish because the goal posts will always be 1MB away). Change-Id: I0027b1c26a41a0152b01e5b67bdb1140d43ee903 Reviewed-on: https://go-review.googlesource.com/15604 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-09 19:38:39 +00:00
Austin Clements	8e8219deb5	runtime: update gcController.scanWork regularly Currently, gcController.scanWork is updated as lazily as possible since it is only read at the end of the GC cycle. We're about to read it during the GC cycle to improve the assist ratio revisions, so modify gcDrain* to regularly flush to gcController.scanWork in much the same way as we regularly flush to gcController.bgScanCredit. One consequence of this is that it's difficult to keep gcw.scanWork monotonic, so we give up on that and simply return the amount of scan work done by gcDrainN rather than calculating it in the caller. Change-Id: I7b50acdc39602f843eed0b5c6d2dacd7e762b81d Reviewed-on: https://go-review.googlesource.com/15407 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-09 19:38:29 +00:00
Austin Clements	c18b163c15	runtime: control background scan credit flushing with flag Currently callers of gcDrain control whether it flushes scan work credit to gcController.bgScanCredit by passing a value other than -1 for the flush threshold. Shortly we're going to make this always flush scan work to gcController.scanWork and optionally also flush scan work to gcController.bgScanCredit. This will be much easier if the flush threshold is simply a constant (which it is in practice) and callers merely control whether or not the flush includes the background credit. Hence, replace the flush threshold argument with a flag. Change-Id: Ia27db17de8a3f1e462a5d7137d4b5dc72f99a04e Reviewed-on: https://go-review.googlesource.com/15406 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-09 19:38:16 +00:00
Austin Clements	9b3cdaf0a3	runtime: consolidate gcDrain and gcDrainUntilPreempt These functions were nearly identical. Consolidate them by adding a flags argument. In addition to cleaning up this code, this makes further changes that affect both functions easier. Change-Id: I6ec5c947603bbbd3ff4040113b2fbc240e99745f Reviewed-on: https://go-review.googlesource.com/15405 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-09 19:38:03 +00:00
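The consolidation in 9b3cdaf0a3, together with the flag introduced in c18b163c15, follows a common pattern: two near-identical functions become one function that takes a bit-flag argument. A minimal sketch of that pattern is below. The names drainFlags, drainUntilPreempt, drainFlushBgCredit and drain are hypothetical and are not claimed to match the runtime's identifiers.

```go
package main

import "fmt"

// drainFlags is a hypothetical bit-flag type selecting optional behavior.
type drainFlags int

const (
	drainUntilPreempt  drainFlags = 1 << iota // stop once preemption is requested
	drainFlushBgCredit                        // also report credit to a background pool
)

// drain sums the work items it processes. It returns the total scanned and
// the amount credited to the background pool (zero unless the flag is set).
func drain(work []int, flags drainFlags, preempted func() bool) (scanned, bgCredit int) {
	for _, w := range work {
		if flags&drainUntilPreempt != 0 && preempted() {
			break
		}
		scanned += w
	}
	if flags&drainFlushBgCredit != 0 {
		bgCredit = scanned
	}
	return scanned, bgCredit
}

func main() {
	work := []int{1, 2, 3, 4}
	calls := 0
	preemptAfterTwo := func() bool { calls++; return calls > 2 }

	// No flags: drain everything and ignore preemption requests.
	fmt.Println(drain(work, 0, preemptAfterTwo))
	// Both flags: stop at the preemption request and flush background credit.
	fmt.Println(drain(work, drainUntilPreempt|drainFlushBgCredit, preemptAfterTwo))
}
```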
Austin Clements	39ed682206	runtime: explain why continuous assist revising is necessary Change-Id: I950af8d80433b3ae8a1da0aa7a8d2d0b295dd313 Reviewed-on: https://go-review.googlesource.com/15404 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-09 19:37:53 +00:00
Austin Clements	3271250ec4	runtime: fix comment for gcAssistAlloc Change-Id: I312e56e95d8ef8ae036d16444ab1e2df1285845d Reviewed-on: https://go-review.googlesource.com/15403 Reviewed-by: Russ Cox <rsc@golang.org>	2015-10-09 19:37:41 +00:00
Austin Clements	3e57b17dc3	runtime: fix comment for assistRatio The comment for assistRatio claimed it to be the reciprocal of what it actually is. Change-Id: If7f9bb853d75d0097facff3aa6704b224d9108b8 Reviewed-on: https://go-review.googlesource.com/15402 Reviewed-by: Russ Cox <rsc@golang.org>	2015-10-09 19:37:23 +00:00
Nodir Turakulov	3be4d59820	runtime: remove redundant type cast (*T)(unsafe.Pointer(&t)) === &t for t of type T Change-Id: I43c1aa436747dfa0bf4cb0d615da1647633f9536 Reviewed-on: https://go-review.googlesource.com/15656 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-09 18:48:36 +00:00
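The identity stated in 3be4d59820 can be checked with a small program. The type T and the helper samePointer are invented for this illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// T is an arbitrary example type.
type T struct{ x int }

// samePointer reports whether the round trip through unsafe.Pointer yields
// the same pointer as taking the address directly.
func samePointer() bool {
	var t T
	p := (*T)(unsafe.Pointer(&t)) // the redundant form
	q := &t                       // the direct form
	return p == q
}

func main() {
	fmt.Println(samePointer())
}
```

Since &t already has type *T, converting it to unsafe.Pointer and back to *T changes neither the address nor the type, so the cast can be dropped.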
Keith Randall	91059de095	runtime: make aeshash more DOS-proof Improve the aeshash implementation to make it harder to engineer collisions. 1) Scramble the seed before xoring with the input string. This makes it harder to cancel known portions of the seed (like the size) because it mixes the per-table seed into those other parts. 2) Use table-dependent seeds for all stripes when hashing >16 byte strings. For small strings this change uses 4 aesenc ops instead of 3, so it is somewhat slower. The first two can run in parallel, though, so it isn't 33% slower. benchmark old ns/op new ns/op delta BenchmarkHash64-12 10.2 11.2 +9.80% BenchmarkHash16-12 5.71 6.13 +7.36% BenchmarkHash5-12 6.64 7.01 +5.57% BenchmarkHashBytesSpeed-12 30.3 31.9 +5.28% BenchmarkHash65536-12 2785 2882 +3.48% BenchmarkHash1024-12 53.6 55.4 +3.36% BenchmarkHashStringArraySpeed-12 54.9 56.5 +2.91% BenchmarkHashStringSpeed-12 18.7 19.2 +2.67% BenchmarkHashInt32Speed-12 14.8 15.1 +2.03% BenchmarkHashInt64Speed-12 14.5 14.5 +0.00% Change-Id: I59ea124b5cb92b1c7e8584008257347f9049996c Reviewed-on: https://go-review.googlesource.com/14124 Reviewed-by: jcd . <jcd@golang.org> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-08 16:43:03 +00:00
Michael Hudson-Doyle	168a51b3a1	runtime: adjust the arm64 memmove and memclr to operate by word as much as they can Not only is this an obvious optimization: benchmark old MB/s new MB/s speedup BenchmarkMemmove1-4 35.35 29.65 0.84x BenchmarkMemmove2-4 63.78 52.53 0.82x BenchmarkMemmove3-4 89.72 73.96 0.82x BenchmarkMemmove4-4 109.94 95.73 0.87x BenchmarkMemmove5-4 127.60 112.80 0.88x BenchmarkMemmove6-4 143.59 126.67 0.88x BenchmarkMemmove7-4 157.90 138.92 0.88x BenchmarkMemmove8-4 167.18 231.81 1.39x BenchmarkMemmove9-4 175.23 252.07 1.44x BenchmarkMemmove10-4 165.68 261.10 1.58x BenchmarkMemmove11-4 174.43 263.31 1.51x BenchmarkMemmove12-4 180.76 267.56 1.48x BenchmarkMemmove13-4 189.06 284.93 1.51x BenchmarkMemmove14-4 186.31 284.72 1.53x BenchmarkMemmove15-4 195.75 281.62 1.44x BenchmarkMemmove16-4 202.96 439.23 2.16x BenchmarkMemmove32-4 264.77 775.77 2.93x BenchmarkMemmove64-4 306.81 1209.64 3.94x BenchmarkMemmove128-4 357.03 1515.41 4.24x BenchmarkMemmove256-4 380.77 2066.01 5.43x BenchmarkMemmove512-4 385.05 2556.45 6.64x BenchmarkMemmove1024-4 381.23 2804.10 7.36x BenchmarkMemmove2048-4 379.06 2814.83 7.43x BenchmarkMemmove4096-4 387.43 3064.96 7.91x BenchmarkMemmoveUnaligned1-4 28.91 25.40 0.88x BenchmarkMemmoveUnaligned2-4 56.13 47.56 0.85x BenchmarkMemmoveUnaligned3-4 74.32 69.31 0.93x BenchmarkMemmoveUnaligned4-4 97.02 83.58 0.86x BenchmarkMemmoveUnaligned5-4 110.17 103.62 0.94x BenchmarkMemmoveUnaligned6-4 124.95 113.26 0.91x BenchmarkMemmoveUnaligned7-4 142.37 130.82 0.92x BenchmarkMemmoveUnaligned8-4 151.20 205.64 1.36x BenchmarkMemmoveUnaligned9-4 166.97 215.42 1.29x BenchmarkMemmoveUnaligned10-4 148.49 221.22 1.49x BenchmarkMemmoveUnaligned11-4 159.47 239.57 1.50x BenchmarkMemmoveUnaligned12-4 163.52 247.32 1.51x BenchmarkMemmoveUnaligned13-4 167.55 256.54 1.53x BenchmarkMemmoveUnaligned14-4 175.12 251.03 1.43x BenchmarkMemmoveUnaligned15-4 192.10 267.13 1.39x BenchmarkMemmoveUnaligned16-4 190.76 378.87 1.99x BenchmarkMemmoveUnaligned32-4 259.02 
562.98 2.17x BenchmarkMemmoveUnaligned64-4 317.72 842.44 2.65x BenchmarkMemmoveUnaligned128-4 355.43 1274.49 3.59x BenchmarkMemmoveUnaligned256-4 378.17 1815.74 4.80x BenchmarkMemmoveUnaligned512-4 362.15 2180.81 6.02x BenchmarkMemmoveUnaligned1024-4 376.07 2453.58 6.52x BenchmarkMemmoveUnaligned2048-4 381.66 2568.32 6.73x BenchmarkMemmoveUnaligned4096-4 398.51 2669.36 6.70x BenchmarkMemclr5-4 113.83 107.93 0.95x BenchmarkMemclr16-4 223.84 389.63 1.74x BenchmarkMemclr64-4 421.99 1209.58 2.87x BenchmarkMemclr256-4 525.94 2411.58 4.59x BenchmarkMemclr4096-4 581.66 4372.20 7.52x BenchmarkMemclr65536-4 565.84 4747.48 8.39x BenchmarkGoMemclr5-4 194.63 160.31 0.82x BenchmarkGoMemclr16-4 295.30 630.07 2.13x BenchmarkGoMemclr64-4 480.24 1884.03 3.92x BenchmarkGoMemclr256-4 540.23 2926.49 5.42x but it turns out that it's necessary to avoid the GC seeing partially written pointers. It's of course possible to be more sophisticated (using ldp/stp to move 16 bytes at a time in the core loop and unrolling the tail copying loops being the obvious ideas) but I wanted something simple and (reasonably) obviously correct. Fixes #12552 Change-Id: Iaeaf8a812cd06f4747ba2f792de1ded738890735 Reviewed-on: https://go-review.googlesource.com/14813 Reviewed-by: Austin Clements <austin@google.com>	2015-10-08 07:49:35 +00:00
Michael Hudson-Doyle	a5cb76243a	cmd/internal/obj, cmd/link, runtime: lots of TLS cleanup It's particularly nice to get rid of the android special cases in the linker. Change-Id: I516363af7ce8a6b2f196fe49cb8887ac787a6dad Reviewed-on: https://go-review.googlesource.com/14197 Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-08 00:21:30 +00:00
Raul Silvera	27ee719fb3	pprof: improve sampling for heap profiling The current heap sampling introduces some bias that interferes with unsampling, producing unexpected heap profiles. The solution is to use a Poisson process to generate the sampling points, using the formulas described at https://en.wikipedia.org/wiki/Poisson_process This fixes #12620 Change-Id: If2400809ed3c41de504dd6cff06be14e476ff96c Reviewed-on: https://go-review.googlesource.com/14590 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-05 08:15:09 +00:00
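In a Poisson process the gaps between events are exponentially distributed, so one way to generate sampling points as 27ee719fb3 describes is to draw each gap as -ln(U) times the mean, with U uniform on (0, 1]. The sketch below shows that draw. It is not the runtime's implementation: nextSample and meanGap are hypothetical helpers, and the 512 KiB mean is only an example value.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// nextSample returns the distance in bytes to the next sampling point,
// drawn from an exponential distribution with the given mean.
func nextSample(rng *rand.Rand, meanBytes float64) float64 {
	u := 1 - rng.Float64() // in (0, 1], so math.Log never sees zero
	return -math.Log(u) * meanBytes
}

// meanGap averages n draws so the empirical mean can be compared
// against the requested one.
func meanGap(seed int64, n int, meanBytes float64) float64 {
	rng := rand.New(rand.NewSource(seed))
	sum := 0.0
	for i := 0; i < n; i++ {
		sum += nextSample(rng, meanBytes)
	}
	return sum / float64(n)
}

func main() {
	const mean = 512 * 1024 // example sampling rate in bytes
	fmt.Printf("empirical mean gap: %.0f bytes (target %d)\n",
		meanGap(1, 100000, mean), mean)
}
```

Because the exponential distribution is memoryless, the chance of sampling any given byte does not depend on where the previous sample fell, which is the property that makes unsampling unbiased.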
Austin Clements	9f6df6c940	runtime: use 4 byte writes in amd64p32 memmove/memclr Currently, amd64p32's memmove and memclr use 8 byte writes as much as possible and 1 byte writes for the tail of the object. However, if an object ends with a 4 byte pointer at an 8 byte aligned offset, this may copy/zero the pointer field one byte at a time, allowing the garbage collector to observe a partially copied pointer. Fix this by using 4 byte writes instead of 8 byte writes. Updates #12552. Change-Id: I13324fd05756fb25ae57e812e836f0a975b5595c Reviewed-on: https://go-review.googlesource.com/15370 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2015-10-02 22:49:15 +00:00
Austin Clements	44078a3228	runtime: adjust huge page flags only on huge page granularity This fixes an issue where the runtime panics with "out of memory" or "cannot allocate memory" even though there's ample memory by reducing the number of memory mappings created by the memory allocator. Commit `7e1b61c` worked around issue #8832 where Linux's transparent huge page support could dramatically increase the RSS of a Go process by setting the MADV_NOHUGEPAGE flag on any regions of pages released to the OS with MADV_DONTNEED. This had the side effect of also increasing the number of VMAs (memory mappings) in a Go address space because a separate VMA is needed for every region of the virtual address space with different flags. Unfortunately, by default, Linux limits the number of VMAs in an address space to 65530, and a large heap can quickly reach this limit when the runtime starts scavenging memory. This commit dramatically reduces the number of VMAs. It does this primarily by only adjusting the huge page flag at huge page granularity. With this change, on amd64, even a pessimal heap that alternates between MADV_NOHUGEPAGE and MADV_HUGEPAGE must reach 128GB to reach the VMA limit. Because of this rounding to huge page granularity, this change is also careful to leave large used and unused regions huge page-enabled. This change reduces the maximum number of VMAs during the runtime benchmarks with GODEBUG=scavenge=1 from 692 to 49. Fixes #12233. Change-Id: Ic397776d042f20d53783a1cacf122e2e2db00584 Reviewed-on: https://go-review.googlesource.com/15191 Reviewed-by: Keith Randall <khr@golang.org>	2015-10-02 20:20:43 +00:00
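The 128GB figure in 44078a3228 can be reproduced with a short calculation. The sketch assumes a 2 MiB huge page on amd64, which the commit message does not state explicitly. The constant and function names are hypothetical.

```go
package main

import "fmt"

const (
	maxVMAs      = 65530   // default Linux per-address-space limit cited above
	hugePageSize = 2 << 20 // assumed amd64 huge page size: 2 MiB
)

// roundDown and roundUp align an address to huge page granularity, so a
// flag change never splits a mapping inside a huge page.
func roundDown(addr uintptr) uintptr { return addr &^ (hugePageSize - 1) }
func roundUp(addr uintptr) uintptr   { return (addr + hugePageSize - 1) &^ (hugePageSize - 1) }

// heapAtVMALimit is the heap size at which a worst-case heap, alternating
// flags on every huge page, would need maxVMAs separate mappings.
func heapAtVMALimit() uint64 { return uint64(maxVMAs) * hugePageSize }

func main() {
	b := heapAtVMALimit()
	fmt.Printf("%d bytes = %.2f GiB\n", b, float64(b)/(1<<30))
	fmt.Printf("range [%#x, %#x) widens to [%#x, %#x)\n",
		0x201000, 0x5ff000, roundDown(0x201000), roundUp(0x5ff000))
}
```

With flags changing only on 2 MiB boundaries, each VMA covers at least one huge page, so 65530 mappings span just under 128 GiB.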
Austin Clements	9a31d38f65	runtime: remove sweep wait loop in finishsweep_m In general, finishsweep_m must block until any spans that are concurrently being swept have been swept. It accomplishes this by looping over all spans, which, as in the previous commit, takes ~1ms/heap GB. Unfortunately, we do this during the STW sweep termination phase, so multi-gigabyte heaps can push our STW time past 10ms. However, there's no need to do this wait if the world is stopped because, in effect, stopping the world already had to wait for anything that was sweeping (and if it didn't, the wait in finishsweep_m would deadlock). Hence, we can simply skip this loop if the world is stopped, such as during sweep termination. In fact, currently all calls to finishsweep_m are STW, but this hasn't always been the case and may not be the case in the future, so we keep the logic around. For 24GB heaps, this reduces max pause time by 75% relative to tip and by 90% relative to Go 1.5. Notably, all pauses are now well under 10ms. Here are the results for the garbage benchmark: ------------- max pause ------------ Heap Procs after change before change 1.5.1 24GB 12 3.8ms 16ms 37ms 24GB 4 3.7ms 16ms 37ms 4GB 4 3.7ms 3ms 6.9ms In the 4GB/4P case, it seems the "before change" run got lucky: the max went up, but the 99%ile pause time went down from 3ms to 2.04ms. Change-Id: Ica22189559f231d408ef2815019c9dbb5f38bf31 Reviewed-on: https://go-review.googlesource.com/15071 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-02 19:56:01 +00:00
Austin Clements	dac220b0a9	runtime: remove in-use page count loop from STW In order to compute the sweep ratio, the runtime needs to know how many pages belong to spans in state _MSpanInUse. Currently it finds this out by looping over all spans during mark termination. However, this takes ~1ms/heap GB, so multi-gigabyte heaps can quickly push our STW time past 10ms. Replace the loop with an actively maintained count of in-use pages. For multi-gigabyte heaps, this reduces max mark termination pause time by 75%–90% relative to tip and by 85%–95% relative to Go 1.5.1. This shifts the longest pause time for large heaps to the sweep termination phase, so it only slightly decreases max pause time, though it roughly halves mean pause time. Here are the results for the garbage benchmark: ---- max mark termination pause ---- Heap Procs after change before change 1.5.1 24GB 12 1.9ms 18ms 37ms 24GB 4 3.7ms 18ms 37ms 4GB 4 920µs 3.8ms 6.9ms Fixes #11484. Change-Id: Ia2d28bb8a1e4f1c3b8ebf79fb203f12b9bf114ac Reviewed-on: https://go-review.googlesource.com/15070 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-02 19:55:55 +00:00
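The change in dac220b0a9 swaps an O(spans) loop for a counter that is updated whenever a span changes state. A minimal sketch of that trade-off follows. The span and heap types and their methods are invented for illustration, and locking is omitted.

```go
package main

import "fmt"

// span and heap are simplified, hypothetical stand-ins for the
// runtime's span and heap structures.
type span struct {
	npages uintptr
	inUse  bool
}

type heap struct {
	spans      []*span
	pagesInUse uintptr // maintained on every alloc and free
}

// alloc marks a new span in use and updates the running count.
func (h *heap) alloc(npages uintptr) *span {
	s := &span{npages: npages, inUse: true}
	h.spans = append(h.spans, s)
	h.pagesInUse += npages
	return s
}

// free releases a span and updates the running count.
func (h *heap) free(s *span) {
	s.inUse = false
	h.pagesInUse -= s.npages
}

// countByLoop is the old approach: walk every span, which costs time
// proportional to the heap size.
func (h *heap) countByLoop() uintptr {
	var n uintptr
	for _, s := range h.spans {
		if s.inUse {
			n += s.npages
		}
	}
	return n
}

func main() {
	var h heap
	a := h.alloc(3)
	h.alloc(5)
	h.free(a)
	fmt.Println(h.pagesInUse, h.countByLoop())
}
```

Reading the counter is constant time, so the stop-the-world phase no longer pays a cost that grows with heap size.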
Austin Clements	608c1b0d56	runtime: scan objects with finalizers concurrently This reduces pause time by ~25% relative to tip and by ~50% relative to Go 1.5.1. Currently one of the steps of STW mark termination is to loop (in parallel) over all spans to find objects with finalizers in order to mark all objects reachable from these objects and to treat the finalizer special as a root. Unfortunately, even if there are no finalizers at all, this loop takes roughly 1 ms/heap GB/core, so multi-gigabyte heaps can quickly push our STW time past 10ms. Fix this by moving this scan from mark termination to concurrent scan, where it can run in parallel with mutators. The loop itself could also be optimized, but this cost is small compared to concurrent marking. Making this scan concurrent introduces two complications: 1) The scan currently walks the specials list of each span without locking it, which is safe only with the world stopped. We fix this by speculatively checking if a span has any specials (the vast majority won't) and then locking the specials list only if there are specials to check. 2) An object can have a finalizer set after concurrent scan, in which case it won't have been marked appropriately by concurrent scan. If the finalizer is a closure and is only reachable from the special, it could be swept before it is run. Likewise, if the object is not marked yet when the finalizer is set and then becomes unreachable before it is marked, other objects reachable only from it may be swept before the finalizer function is run. We fix this issue by making addfinalizer ensure the same marking invariants as markroot does. For multi-gigabyte heaps, this reduces max pause time by 20%–30% relative to tip (depending on GOMAXPROCS) and by ~50% relative to Go 1.5.1 (where this loop was neither concurrent nor parallel). 
Here are the results for the garbage benchmark: ---------------- max pause ---------------- Heap Procs Concurrent scan STW parallel scan 1.5.1 24GB 12 18ms 23ms 37ms 24GB 4 18ms 25ms 37ms 4GB 4 3.8ms 4.9ms 6.9ms In all cases, 95%ile pause time is similar to the max pause time. This also improves mean STW time by 10%–30%. Fixes #11485. Change-Id: I9359d8c3d120a51d23d924b52bf853a1299b1dfd Reviewed-on: https://go-review.googlesource.com/14982 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-02 19:55:48 +00:00
Austin Clements	fbd2660af3	runtime: introduce gcMode type for GC modes Currently, the GC mode constants are untyped and functions pass them around as ints. Clean this up by introducing a proper type for these constants. Change-Id: Ibc022447bdfa203644921fbb548312d7e2272e8d Reviewed-on: https://go-review.googlesource.com/14981 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-02 19:55:41 +00:00
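The cleanup in fbd2660af3 is an instance of giving a set of related constants their own named type. A sketch of the pattern is below. The constant names, the String method and startGC are illustrative assumptions and may not match the runtime's definitions.

```go
package main

import "fmt"

// gcMode is a named type for GC modes, so functions can no longer
// accept an arbitrary int variable by accident.
type gcMode int

// Illustrative mode names; treat them as assumptions.
const (
	gcBackgroundMode gcMode = iota // concurrent GC
	gcForceMode                    // stop-the-world GC
	gcForceBlockMode               // stop-the-world GC with blocking sweep
)

// String makes modes readable in logs and test failures.
func (m gcMode) String() string {
	switch m {
	case gcBackgroundMode:
		return "background"
	case gcForceMode:
		return "force"
	case gcForceBlockMode:
		return "force-block"
	}
	return "unknown"
}

// startGC is a hypothetical caller that takes the typed mode.
func startGC(mode gcMode) string {
	return "starting GC in " + mode.String() + " mode"
}

func main() {
	fmt.Println(startGC(gcForceMode))
	// var n int = 1
	// startGC(n) // would not compile: cannot use n (type int) as gcMode
}
```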
Austin Clements	1b84bb8c7c	runtime: fix out-of-date comment on gcWork usage Change-Id: I3c21ffa80a5c14911e07238b1f64bec686ed7b72 Reviewed-on: https://go-review.googlesource.com/14980 Reviewed-by: Minux Ma <minux@golang.org>	2015-10-02 19:55:34 +00:00
David Crawshaw	47ccf96a95	runtime: darwin/386 entrypoint for c-archive Change-Id: Ic22597b5e2824cffe9598cb9b506af3426c285fd Reviewed-on: https://go-review.googlesource.com/12412 Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-02 11:45:52 +00:00
Michael Hudson-Doyle	2c911143fd	runtime: adjust the ppc64x memmove and memclr to copy by word as much as it can Issue #12552 can happen on ppc64 too, although much less frequently in my testing. I'm fairly sure this fixes it (2 out of 200 runs of oracle.test failed without this change and 0 of 200 failed with it). It's also a lot faster for large moves/clears: name old speed new speed delta Memmove1-6 157MB/s ± 9% 144MB/s ± 0% -8.20% (p=0.004 n=10+9) Memmove2-6 281MB/s ± 1% 249MB/s ± 1% -11.53% (p=0.000 n=10+10) Memmove3-6 376MB/s ± 1% 328MB/s ± 1% -12.64% (p=0.000 n=10+10) Memmove4-6 475MB/s ± 4% 345MB/s ± 1% -27.28% (p=0.000 n=10+8) Memmove5-6 540MB/s ± 1% 393MB/s ± 0% -27.21% (p=0.000 n=10+10) Memmove6-6 609MB/s ± 0% 423MB/s ± 0% -30.56% (p=0.000 n=9+10) Memmove7-6 659MB/s ± 0% 468MB/s ± 0% -28.99% (p=0.000 n=8+10) Memmove8-6 705MB/s ± 0% 1295MB/s ± 1% +83.73% (p=0.000 n=9+9) Memmove9-6 740MB/s ± 1% 1241MB/s ± 1% +67.61% (p=0.000 n=10+8) Memmove10-6 780MB/s ± 0% 1162MB/s ± 1% +48.95% (p=0.000 n=10+9) Memmove11-6 811MB/s ± 0% 1180MB/s ± 0% +45.58% (p=0.000 n=8+9) Memmove12-6 820MB/s ± 1% 1073MB/s ± 1% +30.83% (p=0.000 n=10+9) Memmove13-6 849MB/s ± 0% 1068MB/s ± 1% +25.87% (p=0.000 n=10+10) Memmove14-6 877MB/s ± 0% 911MB/s ± 0% +3.83% (p=0.000 n=10+10) Memmove15-6 893MB/s ± 0% 922MB/s ± 0% +3.25% (p=0.000 n=10+9) Memmove16-6 897MB/s ± 1% 2418MB/s ± 1% +169.67% (p=0.000 n=10+9) Memmove32-6 908MB/s ± 0% 3927MB/s ± 2% +332.64% (p=0.000 n=10+8) Memmove64-6 1.11GB/s ± 0% 5.59GB/s ± 0% +404.64% (p=0.000 n=9+9) Memmove128-6 1.25GB/s ± 0% 6.71GB/s ± 2% +437.49% (p=0.000 n=9+10) Memmove256-6 1.33GB/s ± 0% 7.25GB/s ± 1% +445.06% (p=0.000 n=10+10) Memmove512-6 1.38GB/s ± 0% 8.87GB/s ± 0% +544.43% (p=0.000 n=10+10) Memmove1024-6 1.40GB/s ± 0% 10.00GB/s ± 0% +613.80% (p=0.000 n=10+10) Memmove2048-6 1.41GB/s ± 0% 10.65GB/s ± 0% +652.95% (p=0.000 n=9+10) Memmove4096-6 1.42GB/s ± 0% 11.01GB/s ± 0% +675.37% (p=0.000 n=8+10) Memclr5-6 269MB/s ± 1% 264MB/s ± 0% -1.80% (p=0.000 
n=10+10) Memclr16-6 600MB/s ± 0% 887MB/s ± 1% +47.83% (p=0.000 n=10+10) Memclr64-6 1.06GB/s ± 0% 2.91GB/s ± 1% +174.58% (p=0.000 n=8+10) Memclr256-6 1.32GB/s ± 0% 6.58GB/s ± 0% +399.86% (p=0.000 n=9+10) Memclr4096-6 1.42GB/s ± 0% 10.90GB/s ± 0% +668.03% (p=0.000 n=8+10) Memclr65536-6 1.43GB/s ± 0% 11.37GB/s ± 0% +697.83% (p=0.000 n=9+8) GoMemclr5-6 359MB/s ± 0% 360MB/s ± 0% +0.46% (p=0.000 n=10+10) GoMemclr16-6 750MB/s ± 0% 1264MB/s ± 1% +68.45% (p=0.000 n=10+10) GoMemclr64-6 1.17GB/s ± 0% 3.78GB/s ± 1% +223.58% (p=0.000 n=10+9) GoMemclr256-6 1.35GB/s ± 0% 7.47GB/s ± 0% +452.44% (p=0.000 n=10+10) Update #12552 Change-Id: I7192e9deb9684a843aed37f58a16a4e29970e893 Reviewed-on: https://go-review.googlesource.com/14840 Reviewed-by: Minux Ma <minux@golang.org>	2015-10-02 07:50:52 +00:00
Mikio Hara	9fb79380f0	runtime: drop sigfwd from signal forwarding unsupported platforms This change splits signal_unix.go into signal_unix.go and signal2_unix.go and removes the fake symbol sigfwd from platforms that do not support signal forwarding, for clarity. Change-Id: I205eab5cf1930fda8a68659b35cfa9f3a0e67ca6 Reviewed-on: https://go-review.googlesource.com/12062 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-02 01:07:44 +00:00
Joel Sing	db70c019d7	runtime/trace: reduce memory usage for trace stress tests on openbsd/arm Reduce allocation to avoid running out of memory on the openbsd/arm builder, until issue/12032 is resolved. Update issue #12032 Change-Id: Ibd513829ffdbd0db6cd86a0a5409934336131156 Reviewed-on: https://go-review.googlesource.com/15242 Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-10-01 18:00:55 +00:00
Joel Sing	1d5251f707	runtime: handle sysReserve failure in mHeap_SysAlloc sysReserve will return nil on failure - correctly handle this case and return nil to the caller. Currently, a failure will result in h.arena_end being set to psize, h.arena_used being set to zero and fun times ensue. On the openbsd/arm builder this has resulted in: runtime: address space conflict: map(0x0) = 0x40946000 fatal error: runtime: address space conflict When it should be reporting out of memory instead. Change-Id: Iba828d5ee48ee1946de75eba409e0cfb04f089d4 Reviewed-on: https://go-review.googlesource.com/15056 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2015-10-01 14:40:02 +00:00
Jeremy Schlatter	59bacb285c	runtime: update comment to match function name Change-Id: I8f22434ade576cc7e3e6d9f357bba12c1296e3d1 Reviewed-on: https://go-review.googlesource.com/15250 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2015-10-01 13:12:50 +00:00
Ian Lance Taylor	0c1f0549b8	runtime, runtime/cgo: support using msan on cgo code The memory sanitizer (msan) is a nice compiler feature that can dynamically check for memory errors in C code. It's not useful for Go code, since Go is memory safe. But it is useful to be able to use the memory sanitizer on C code that is linked into a Go program via cgo. Without this change it does not work, as msan considers memory passed from Go to C as uninitialized. To make this work, change the runtime to call the C mmap function when using cgo. When using msan the mmap call will be intercepted and marked as returning initialized memory. Work around what appears to be an msan bug by calling malloc before we call mmap. Change-Id: I8ab7286d7595ae84782f68a98bef6d3688b946f9 Reviewed-on: https://go-review.googlesource.com/15170 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-09-30 22:17:55 +00:00
Austin Clements	e01be84149	runtime: test that periodic GC works We've broken periodic GC a few times without noticing because there's no test for it, partly because you have to wait two minutes to see if it happens. This exposes control of the periodic GC timeout to runtime tests and adds a test that cranks it down to zero and sleeps for a bit to make sure periodic GCs happen. Change-Id: I3ec44e967e99f4eda752f85c329eebd18b87709e Reviewed-on: https://go-review.googlesource.com/13169 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2015-09-30 19:24:07 +00:00
Shenghou Ma	604fbab3f1	runtime: fix incomplete sentence in comment Fixes #12709. Change-Id: If5a2536458fcd26d6f003dde1bfc02f86b09fa94 Reviewed-on: https://go-review.googlesource.com/14793 Reviewed-by: Andrew Gerrand <adg@golang.org>	2015-09-23 17:05:39 +00:00

1487 Commits