qbit/go

mirror of https://github.com/golang/go synced 2024-11-20 02:24:43 -07:00

Author	SHA1	Message	Date
Austin Clements	dcd9e5bc0f	runtime: make putfull start mark workers Currently we depend on the good graces and timing of the scheduler to get opportunities to start dedicated mark workers. In the worst case, it may take 10ms to get dedicated mark workers going at the beginning of mark 1 and mark 2 or after the amount of available work has dropped and gone back up. Instead of waiting for the regular preemption logic to get around to us, make putfull enlist a random P if we're not already running enough dedicated workers. This should improve performance stability of the garbage collector and is likely to improve the overall performance somewhat. No overall effect on the go1 benchmarks. It speeds up the garbage benchmark by 12%, which more than counters the performance loss from the previous commit. name old time/op new time/op delta XBenchGarbage-12 6.32ms ± 4% 5.58ms ± 2% -11.68% (p=0.000 n=20+16) name old time/op new time/op delta BinaryTree17-12 3.18s ± 5% 3.12s ± 4% -1.83% (p=0.021 n=20+20) Fannkuch11-12 2.50s ± 2% 2.46s ± 2% -1.57% (p=0.000 n=18+19) FmtFprintfEmpty-12 50.8ns ± 3% 50.4ns ± 3% ~ (p=0.184 n=20+20) FmtFprintfString-12 167ns ± 2% 171ns ± 1% +2.46% (p=0.000 n=20+19) FmtFprintfInt-12 161ns ± 2% 163ns ± 2% +1.81% (p=0.000 n=20+20) FmtFprintfIntInt-12 269ns ± 1% 266ns ± 1% -0.81% (p=0.002 n=19+20) FmtFprintfPrefixedInt-12 237ns ± 2% 231ns ± 2% -2.86% (p=0.000 n=20+20) FmtFprintfFloat-12 313ns ± 2% 313ns ± 1% ~ (p=0.681 n=20+20) FmtManyArgs-12 1.05µs ± 2% 1.03µs ± 1% -2.26% (p=0.000 n=20+20) GobDecode-12 8.66ms ± 1% 8.67ms ± 1% ~ (p=0.380 n=19+20) GobEncode-12 6.56ms ± 1% 6.56ms ± 2% ~ (p=0.607 n=19+20) Gzip-12 317ms ± 1% 314ms ± 2% -1.10% (p=0.000 n=20+19) Gunzip-12 42.1ms ± 1% 42.2ms ± 1% +0.27% (p=0.044 n=20+19) HTTPClientServer-12 62.7µs ± 1% 62.0µs ± 1% -1.04% (p=0.000 n=19+18) JSONEncode-12 16.7ms ± 1% 16.8ms ± 2% +0.59% (p=0.021 n=20+20) JSONDecode-12 58.2ms ± 1% 61.4ms ± 2% +5.43% (p=0.000 n=18+19) Mandelbrot200-12 3.84ms ± 1% 3.87ms ± 2% +0.79% 
(p=0.008 n=18+20) GoParse-12 3.86ms ± 2% 3.76ms ± 2% -2.60% (p=0.000 n=20+20) RegexpMatchEasy0_32-12 100ns ± 2% 100ns ± 1% -0.68% (p=0.005 n=18+15) RegexpMatchEasy0_1K-12 332ns ± 1% 342ns ± 1% +3.16% (p=0.000 n=19+19) RegexpMatchEasy1_32-12 82.9ns ± 3% 83.0ns ± 2% ~ (p=0.906 n=19+20) RegexpMatchEasy1_1K-12 487ns ± 1% 494ns ± 1% +1.50% (p=0.000 n=17+20) RegexpMatchMedium_32-12 131ns ± 2% 130ns ± 1% ~ (p=0.686 n=19+20) RegexpMatchMedium_1K-12 39.6µs ± 1% 39.2µs ± 1% -1.09% (p=0.000 n=18+19) RegexpMatchHard_32-12 2.04µs ± 1% 2.04µs ± 2% ~ (p=0.804 n=20+20) RegexpMatchHard_1K-12 61.7µs ± 2% 61.3µs ± 2% ~ (p=0.052 n=18+20) Revcomp-12 529ms ± 2% 533ms ± 1% +0.83% (p=0.003 n=20+19) Template-12 70.7ms ± 2% 71.0ms ± 2% ~ (p=0.065 n=20+19) TimeParse-12 351ns ± 2% 355ns ± 1% +1.25% (p=0.000 n=19+20) TimeFormat-12 362ns ± 2% 373ns ± 1% +2.83% (p=0.000 n=18+20) [Geo mean] 62.2µs 62.3µs +0.13% name old speed new speed delta GobDecode-12 88.6MB/s ± 1% 88.5MB/s ± 1% ~ (p=0.392 n=19+20) GobEncode-12 117MB/s ± 1% 117MB/s ± 1% ~ (p=0.622 n=19+20) Gzip-12 61.1MB/s ± 1% 61.8MB/s ± 2% +1.11% (p=0.000 n=20+19) Gunzip-12 461MB/s ± 1% 460MB/s ± 1% -0.27% (p=0.044 n=20+19) JSONEncode-12 116MB/s ± 1% 115MB/s ± 2% -0.58% (p=0.022 n=20+20) JSONDecode-12 33.3MB/s ± 1% 31.6MB/s ± 2% -5.15% (p=0.000 n=18+19) GoParse-12 15.0MB/s ± 2% 15.4MB/s ± 2% +2.66% (p=0.000 n=20+20) RegexpMatchEasy0_32-12 317MB/s ± 2% 319MB/s ± 2% ~ (p=0.052 n=20+20) RegexpMatchEasy0_1K-12 3.08GB/s ± 1% 2.99GB/s ± 1% -3.07% (p=0.000 n=19+19) RegexpMatchEasy1_32-12 386MB/s ± 3% 386MB/s ± 2% ~ (p=0.939 n=19+20) RegexpMatchEasy1_1K-12 2.10GB/s ± 1% 2.07GB/s ± 1% -1.46% (p=0.000 n=17+20) RegexpMatchMedium_32-12 7.62MB/s ± 2% 7.64MB/s ± 1% ~ (p=0.702 n=19+20) RegexpMatchMedium_1K-12 25.9MB/s ± 1% 26.1MB/s ± 2% +0.99% (p=0.000 n=18+20) RegexpMatchHard_32-12 15.7MB/s ± 1% 15.7MB/s ± 2% ~ (p=0.723 n=20+20) RegexpMatchHard_1K-12 16.6MB/s ± 2% 16.7MB/s ± 2% ~ (p=0.052 n=18+20) Revcomp-12 481MB/s ± 2% 477MB/s ± 1% -0.83% (p=0.003 
n=20+19) Template-12 27.5MB/s ± 2% 27.3MB/s ± 2% ~ (p=0.062 n=20+19) [Geo mean] 99.4MB/s 99.1MB/s -0.35% Change-Id: I914d8cadded5a230509d118164a4c201601afc06 Reviewed-on: https://go-review.googlesource.com/16298 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-11-04 20:15:51 +00:00
Austin Clements	62ba520b23	runtime: eliminate getfull barrier from concurrent mark Currently dedicated mark workers participate in the getfull barrier during concurrent mark. However, the getfull barrier wasn't designed for concurrent work and this causes no end of headaches. In the concurrent setting, participants come and go. This makes mark completion susceptible to live-lock: since dedicated workers are only periodically polling for completion, it's possible for the program to be in some transient worker each time one of the dedicated workers wakes up to check if it can exit the getfull barrier. It also complicates reasoning about the system because dedicated workers participate directly in the getfull barrier, but transient workers must instead use trygetfull because they have exit conditions that aren't captured by getfull (e.g., fractional workers exit when preempted). The complexity of implementing these exit conditions contributed to #11677. Furthermore, the getfull barrier is inefficient because we could be running user code instead of spinning on a P. In effect, we're dedicating 25% of the CPU to marking even if that means we have to spin to make that 25%. It also causes issues on Windows because we can't actually sleep for 100µs (#8687). Fix this by making dedicated workers no longer participate in the getfull barrier. Instead, dedicated workers simply return to the scheduler when they fail to get more work, regardless of what other workers are doing, and the scheduler only starts new dedicated workers if there's work available. Everything that needs to be handled by this barrier is already handled by detection of mark completion. This makes the system much more symmetric because all workers and assists now use trygetfull during concurrent mark. It also loosens the 25% CPU target so that we can give some of that 25% back to user code if there isn't enough work to keep the mark worker busy. 
And it eliminates the problematic 100µs sleep on Windows during concurrent mark (though not during mark termination). The downside of this is that if we hit a bottleneck in the heap graph that then expands back out, the system may shut down dedicated workers and take a while to start them back up. We'll address this in the next commit. Updates #12041 and #8687. No effect on the go1 benchmarks. This slows down the garbage benchmark by 9%, but we'll more than make it up in the next commit. name old time/op new time/op delta XBenchGarbage-12 5.80ms ± 2% 6.32ms ± 4% +9.03% (p=0.000 n=20+20) Change-Id: I65100a9ba005a8b5cf97940798918672ea9dd09b Reviewed-on: https://go-review.googlesource.com/16297 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-11-04 20:15:39 +00:00
Austin Clements	3a765430c1	cmd/compile: add go:nowritebarrierrec annotation This introduces a recursive variant of the go:nowritebarrier annotation that prohibits write barriers not only in the annotated function, but in all functions it calls, recursively. The error message gives the shortest call stack from the annotated function to the function containing the prohibited write barrier, including the names of the functions and the line numbers of the calls. To demonstrate the annotation, we apply it to gcmarkwb_m, the write barrier itself. This is a new annotation rather than a modification of the existing go:nowritebarrier annotation because, for better or worse, there are many go:nowritebarrier functions that do call functions with write barriers. In most of these cases this is benign because the annotation was conservative, but it prohibits simply coopting the existing annotation. Change-Id: I225ca483c8f699e8436373ed96349e80ca2c2479 Reviewed-on: https://go-review.googlesource.com/16554 Reviewed-by: Keith Randall <khr@golang.org>	2015-11-04 14:42:04 +00:00
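The "shortest call stack" reporting described for go:nowritebarrierrec can be pictured as a breadth-first search over a call graph. The following is a minimal sketch under stated assumptions, not the compiler's implementation: the function `shortestChain`, the map-based call graph, and the names `root`, `a`, `b`, `c`, and `barrier` are all hypothetical and exist only for illustration.

```go
package main

import "fmt"

// shortestChain does a breadth-first search over a call graph and returns the
// shortest chain of calls from root to any function marked in hasBarrier.
// It returns nil if no such function is reachable.
// Hypothetical helper; the real compiler check is not shown in the log.
func shortestChain(calls map[string][]string, hasBarrier map[string]bool, root string) []string {
	parent := map[string]string{}
	seen := map[string]bool{root: true}
	queue := []string{root}
	for len(queue) > 0 {
		fn := queue[0]
		queue = queue[1:]
		if hasBarrier[fn] {
			// Walk parent links back to root, then reverse the chain.
			var chain []string
			for f := fn; ; f = parent[f] {
				chain = append(chain, f)
				if f == root {
					break
				}
			}
			for i, j := 0, len(chain)-1; i < j; i, j = i+1, j-1 {
				chain[i], chain[j] = chain[j], chain[i]
			}
			return chain
		}
		for _, callee := range calls[fn] {
			if !seen[callee] {
				seen[callee] = true
				parent[callee] = fn
				queue = append(queue, callee)
			}
		}
	}
	return nil
}

func main() {
	calls := map[string][]string{
		"root": {"a", "b"},
		"a":    {"c"},
		"b":    {"barrier"},
		"c":    {"barrier"},
	}
	fmt.Println(shortestChain(calls, map[string]bool{"barrier": true}, "root"))
}
```

Breadth-first order is what guarantees the reported chain is a shortest one: the two-call path through `b` is found before the three-call path through `a` and `c`.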
Dmitry Vyukov	ee0305e036	runtime: remove dead code runtime.free has long gone. Change-Id: I058f69e6481b8fa008e1951c29724731a8a3d081 Reviewed-on: https://go-review.googlesource.com/16593 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com>	2015-11-03 19:20:21 +00:00
Austin Clements	b6c0934a9b	runtime: cache two workbufs to reduce contention Currently the gcWork abstraction caches a single work buffer. As a result, if a worker is putting and getting pointers right at the boundary of a work buffer, it can flap between work buffers and (potentially significantly) increase contention on the global work buffer lists. This change modifies gcWork to instead cache two work buffers and switch off between them. This introduces one buffer's worth of hysteresis and eliminates the above performance worst case by amortizing the cost of getting or putting a work buffer over at least one buffer's worth of work. In practice, it's difficult to trigger this worst case with reasonably large work buffers. On the garbage benchmark, this reduces the max writes/sec to the global work list from 32K to 25K and the median from 6K to 5K. However, if a workload were to trigger this worst case behavior, it could significantly drive up this contention. This has negligible effects on the go1 benchmarks and slightly speeds up the garbage benchmark. 
name old time/op new time/op delta XBenchGarbage-12 5.90ms ± 3% 5.83ms ± 4% -1.18% (p=0.011 n=18+18) name old time/op new time/op delta BinaryTree17-12 3.22s ± 4% 3.17s ± 3% -1.57% (p=0.009 n=19+20) Fannkuch11-12 2.44s ± 1% 2.53s ± 4% +3.78% (p=0.000 n=18+19) FmtFprintfEmpty-12 50.2ns ± 2% 50.5ns ± 5% ~ (p=0.631 n=19+20) FmtFprintfString-12 167ns ± 1% 166ns ± 1% ~ (p=0.141 n=20+20) FmtFprintfInt-12 162ns ± 1% 159ns ± 1% -1.80% (p=0.000 n=20+20) FmtFprintfIntInt-12 277ns ± 2% 263ns ± 1% -4.78% (p=0.000 n=20+18) FmtFprintfPrefixedInt-12 240ns ± 1% 232ns ± 2% -3.25% (p=0.000 n=20+20) FmtFprintfFloat-12 311ns ± 1% 315ns ± 2% +1.17% (p=0.000 n=20+20) FmtManyArgs-12 1.05µs ± 2% 1.03µs ± 2% -1.72% (p=0.000 n=20+20) GobDecode-12 8.65ms ± 1% 8.71ms ± 2% +0.68% (p=0.001 n=19+20) GobEncode-12 6.51ms ± 1% 6.54ms ± 1% +0.42% (p=0.047 n=20+19) Gzip-12 318ms ± 2% 315ms ± 2% -1.20% (p=0.000 n=19+19) Gunzip-12 42.2ms ± 2% 42.1ms ± 1% ~ (p=0.667 n=20+19) HTTPClientServer-12 62.5µs ± 1% 62.4µs ± 1% ~ (p=0.110 n=20+18) JSONEncode-12 16.8ms ± 1% 16.8ms ± 2% ~ (p=0.569 n=19+20) JSONDecode-12 60.8ms ± 2% 59.8ms ± 1% -1.69% (p=0.000 n=19+19) Mandelbrot200-12 3.87ms ± 1% 3.85ms ± 0% -0.61% (p=0.001 n=20+17) GoParse-12 3.76ms ± 2% 3.76ms ± 1% ~ (p=0.698 n=20+20) RegexpMatchEasy0_32-12 100ns ± 2% 101ns ± 2% ~ (p=0.065 n=19+20) RegexpMatchEasy0_1K-12 342ns ± 2% 333ns ± 1% -2.82% (p=0.000 n=20+19) RegexpMatchEasy1_32-12 83.3ns ± 2% 83.2ns ± 2% ~ (p=0.692 n=20+19) RegexpMatchEasy1_1K-12 498ns ± 2% 490ns ± 1% -1.52% (p=0.000 n=18+20) RegexpMatchMedium_32-12 131ns ± 2% 131ns ± 2% ~ (p=0.464 n=20+18) RegexpMatchMedium_1K-12 39.3µs ± 2% 39.6µs ± 1% +0.77% (p=0.000 n=18+19) RegexpMatchHard_32-12 2.04µs ± 2% 2.06µs ± 1% +0.69% (p=0.009 n=19+20) RegexpMatchHard_1K-12 61.4µs ± 2% 62.1µs ± 1% +1.21% (p=0.000 n=19+20) Revcomp-12 534ms ± 1% 529ms ± 1% -0.97% (p=0.000 n=19+16) Template-12 70.4ms ± 2% 70.0ms ± 1% ~ (p=0.070 n=19+19) TimeParse-12 359ns ± 3% 344ns ± 1% -4.15% (p=0.000 n=19+19) TimeFormat-12 
357ns ± 1% 361ns ± 2% +1.05% (p=0.002 n=20+20) [Geo mean] 62.4µs 62.0µs -0.56% name old speed new speed delta GobDecode-12 88.7MB/s ± 1% 88.1MB/s ± 2% -0.68% (p=0.001 n=19+20) GobEncode-12 118MB/s ± 1% 117MB/s ± 1% -0.42% (p=0.046 n=20+19) Gzip-12 60.9MB/s ± 2% 61.7MB/s ± 2% +1.21% (p=0.000 n=19+19) Gunzip-12 460MB/s ± 2% 461MB/s ± 1% ~ (p=0.661 n=20+19) JSONEncode-12 116MB/s ± 1% 115MB/s ± 2% ~ (p=0.555 n=19+20) JSONDecode-12 31.9MB/s ± 2% 32.5MB/s ± 1% +1.72% (p=0.000 n=19+19) GoParse-12 15.4MB/s ± 2% 15.4MB/s ± 1% ~ (p=0.653 n=20+20) RegexpMatchEasy0_32-12 317MB/s ± 2% 315MB/s ± 2% ~ (p=0.141 n=19+20) RegexpMatchEasy0_1K-12 2.99GB/s ± 2% 3.07GB/s ± 1% +2.86% (p=0.000 n=20+19) RegexpMatchEasy1_32-12 384MB/s ± 2% 385MB/s ± 2% ~ (p=0.672 n=20+19) RegexpMatchEasy1_1K-12 2.06GB/s ± 2% 2.09GB/s ± 1% +1.54% (p=0.000 n=18+20) RegexpMatchMedium_32-12 7.62MB/s ± 2% 7.63MB/s ± 2% ~ (p=0.800 n=20+18) RegexpMatchMedium_1K-12 26.0MB/s ± 1% 25.8MB/s ± 1% -0.77% (p=0.000 n=18+19) RegexpMatchHard_32-12 15.7MB/s ± 2% 15.6MB/s ± 1% -0.69% (p=0.010 n=19+20) RegexpMatchHard_1K-12 16.7MB/s ± 2% 16.5MB/s ± 1% -1.19% (p=0.000 n=19+20) Revcomp-12 476MB/s ± 1% 481MB/s ± 1% +0.97% (p=0.000 n=19+16) Template-12 27.6MB/s ± 2% 27.7MB/s ± 1% ~ (p=0.071 n=19+19) [Geo mean] 99.1MB/s 99.3MB/s +0.27% Change-Id: I68bcbf74ccb716cd5e844a554f67b679135105e6 Reviewed-on: https://go-review.googlesource.com/16042 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-03 19:12:10 +00:00
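The two-buffer hysteresis described in b6c0934a9b can be illustrated with a toy cache. This is a sketch under stated assumptions rather than the runtime's gcWork code: the types `workbuf` and `cache`, the counter `globalOps`, and the tiny capacity `bufCap` are hypothetical names chosen for the example.

```go
package main

import "fmt"

const bufCap = 4 // tiny capacity, chosen only to keep the example easy to follow

type workbuf struct{ objs []int }

// cache holds two buffers. globalOps counts trips to the shared list,
// which stands in for the contended global work buffer lists.
type cache struct {
	w1, w2    *workbuf
	full      []*workbuf
	globalOps int
}

func newCache() *cache {
	return &cache{w1: &workbuf{}, w2: &workbuf{}}
}

func (c *cache) put(x int) {
	if len(c.w1.objs) == bufCap {
		c.w1, c.w2 = c.w2, c.w1 // try the other cached buffer first
		if len(c.w1.objs) == bufCap {
			// Both buffers are full: hand one to the shared list.
			c.full = append(c.full, c.w1)
			c.globalOps++
			c.w1 = &workbuf{}
		}
	}
	c.w1.objs = append(c.w1.objs, x)
}

func (c *cache) get() (int, bool) {
	if len(c.w1.objs) == 0 {
		c.w1, c.w2 = c.w2, c.w1 // try the other cached buffer first
		if len(c.w1.objs) == 0 {
			if len(c.full) == 0 {
				return 0, false
			}
			// Both buffers are empty: take one from the shared list.
			c.w1 = c.full[len(c.full)-1]
			c.full = c.full[:len(c.full)-1]
			c.globalOps++
		}
	}
	n := len(c.w1.objs) - 1
	x := c.w1.objs[n]
	c.w1.objs = c.w1.objs[:n]
	return x, true
}

func main() {
	c := newCache()
	for i := 0; i < bufCap; i++ {
		c.put(i) // fill the first buffer exactly to its boundary
	}
	for i := 0; i < 100; i++ {
		c.put(i) // alternate right at the boundary
		c.get()
	}
	fmt.Println("shared-list operations:", c.globalOps)
}
```

With a single cached buffer, each put at the boundary would have to hand a full buffer to the shared list and each get would have to fetch it back; with two buffers, the alternation in `main` stays entirely inside the cache.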
Dmitry Vyukov	bf606094ee	runtime: fix finalization and profiling of tiny allocations Handling of special records for tiny allocations has two problems: 1. Once we queue a finalizer we mark the object. As a result, any subsequent finalizers for the same object will not be queued during this GC cycle. If we have 16 finalizers set up (the worst case), finalization will take 16 GC cycles. This is what caused the misbehavior of tinyfin.go. The actual flakiness was caused by the fact that fing is asynchronous and doesn't always run before the check. 2. If a tiny block has both finalizer and profile specials, it is possible that we queue the finalizer, keep the object live, and free the profile record. As a result, the heap profile can be skewed. Fix both issues by analyzing all special records for a single object at once. Also, make the tinyfin test stricter and remove its reliance on real time. Also, add a test for problem 2. Currently the heap profile misses about half of live memory. Fixes #13100 Change-Id: I9ae4dc1c44893724138a4565ca5cae29f2e97544 Reviewed-on: https://go-review.googlesource.com/16591 Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Dmitry Vyukov <dvyukov@google.com>	2015-11-03 18:57:18 +00:00
Ilya Tocar	95333aea53	strings: add asm version of Index() for short strings on amd64 Currently we have a special case for 1-byte strings. This extends it to strings shorter than 32 bytes on amd64. Results (broadwell): name old time/op new time/op delta IndexRune-4 57.4ns ± 0% 57.5ns ± 0% +0.10% (p=0.000 n=20+19) IndexRuneFastPath-4 20.4ns ± 0% 20.4ns ± 0% ~ (all samples are equal) Index-4 21.0ns ± 0% 21.8ns ± 0% +3.81% (p=0.000 n=20+20) LastIndex-4 7.07ns ± 1% 6.98ns ± 0% -1.21% (p=0.000 n=20+16) IndexByte-4 18.3ns ± 0% 18.3ns ± 0% ~ (all samples are equal) IndexHard1-4 1.46ms ± 0% 0.39ms ± 0% -73.06% (p=0.000 n=16+16) IndexHard2-4 1.46ms ± 0% 0.30ms ± 0% -79.55% (p=0.000 n=18+18) IndexHard3-4 1.46ms ± 0% 0.66ms ± 0% -54.68% (p=0.000 n=19+19) LastIndexHard1-4 1.46ms ± 0% 1.46ms ± 0% -0.01% (p=0.036 n=18+20) LastIndexHard2-4 1.46ms ± 0% 1.46ms ± 0% ~ (p=0.588 n=19+19) LastIndexHard3-4 1.46ms ± 0% 1.46ms ± 0% ~ (p=0.283 n=17+20) IndexTorture-4 11.1µs ± 0% 11.1µs ± 0% +0.01% (p=0.000 n=18+17) Change-Id: I892781549f558f698be4e41f9f568e3d0611efb5 Reviewed-on: https://go-review.googlesource.com/16430 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>	2015-11-03 16:04:28 +00:00
Austin Clements	1870572180	runtime: enlarge GC work buffer size Currently the GC work buffers are only 256 bytes and hence can record only 24 64-bit pointers. They were reduced from 4K in commits `db7fd1c` and `a15818f` as a way to minimize the amount of work the per-P workbuf caches could "hide" from the mark phase and carry into the mark termination phase. However, this approach wasn't very robust and we later added a "mark 2" phase to address this problem head-on. Because of mark 2, there's now no benefit to having very small work buffers. But there are plenty of downsides: small work buffers increase contention on the work lists, increase the frequency and hence net overhead of acquiring and releasing work buffers, and somewhat increase memory overhead of the GC. This commit expands work buffers back to 4K (504 64-bit pointers). This reduces the rate of writes to work.full in the garbage benchmark from a peak of ~780,000 writes/sec to a peak of ~32,000 writes/sec. This has negligible effect on the go1 benchmarks. It slightly slows down the garbage benchmark. name old time/op new time/op delta XBenchGarbage-12 5.37ms ± 5% 5.60ms ± 2% +4.37% (p=0.000 n=20+20) Change-Id: Ic9cc28e7a125d23d9faf4f5e690fb8aa9bcdfb28 Reviewed-on: https://go-review.googlesource.com/15893 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-03 15:53:38 +00:00
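The two capacity figures in 1870572180 (24 pointers in 256 bytes, 504 pointers in 4K) are consistent with a fixed per-buffer header. The sketch below checks that arithmetic. The 64-byte header is an assumption inferred from those two numbers, not a value stated in the commit message, and `workbufCapacity` is a hypothetical helper.

```go
package main

import "fmt"

// workbufCapacity returns how many pointers fit in a buffer of bufSize bytes
// once headerSize bytes are reserved for bookkeeping.
func workbufCapacity(bufSize, headerSize, ptrSize int) int {
	return (bufSize - headerSize) / ptrSize
}

func main() {
	const headerSize = 64 // assumption: inferred from the figures in the message
	const ptrSize = 8     // 64-bit pointers, as stated in the message
	fmt.Println(workbufCapacity(256, headerSize, ptrSize))  // 24
	fmt.Println(workbufCapacity(4096, headerSize, ptrSize)) // 504
}
```

Under that assumption the header consumes a quarter of a 256-byte buffer but under 2% of a 4K buffer, which is one way to read the "memory overhead" downside the message lists for small buffers.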
Austin Clements	456528304d	runtime: make assists preemptible Currently, assists are non-preemptible, which means a heavily assisting G can block other Gs from running. At the beginning of a GC cycle, it can also delay scang, which will spin until the assist is done. Since scanning is currently done sequentially, this can seriously extend the length of the scan phase. Fix this by making assists preemptible. Since the assist holds work buffers and runs on the system stack, this must be done cooperatively: we make gcDrainN return on preemption, and make the assist return from the system stack and voluntarily Gosched. This is a prerequisite to enlarging the work buffers. Without this change, the delays and spinning in scang increase significantly. This has no effect on the go1 benchmarks. name old time/op new time/op delta XBenchGarbage-12 5.72ms ± 4% 5.37ms ± 5% -6.11% (p=0.000 n=20+20) Change-Id: I829e732a0f23b126da633516a1a9ec1a508fdbf1 Reviewed-on: https://go-review.googlesource.com/15894 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-03 15:53:31 +00:00
Austin Clements	15aa6bbd5a	runtime: replace assist sleep loop with park/ready GC assists must block until the assist can be satisfied (either through stealing credit or doing work) or the GC cycle ends. Currently, this is implemented as a retry loop with a 100 µs delay. This obviously isn't ideal, as it wastes CPU and delays mutator execution. It also has the somewhat peculiar downside that sleeping a G requires allocation, and this requires working around recursive allocation. Replace this timed delay with a proper scheduling queue. When an assist can't be satisfied immediately, it adds the allocating G to a queue and parks it. Any time background scan credit is flushed, it consults this queue, directly satisfies the debt of queued assists, and wakes up satisfied assists before flushing any remaining credit to the background credit pool. No effect on the go1 benchmarks. Slightly speeds up the garbage benchmark. name old time/op new time/op delta XBenchGarbage-12 5.81ms ± 1% 5.72ms ± 4% -1.65% (p=0.011 n=20+20) Updates #12041. Change-Id: I8ee3b6274dd097b12b10a8030796a958a4b0e7b7 Reviewed-on: https://go-review.googlesource.com/15890 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-03 15:53:25 +00:00
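The queue described in 15aa6bbd5a (pay queued assist debts from flushed credit, wake the satisfied ones, pass any remainder to the background pool) can be modelled as a pure function. This is a simplified sketch, not the runtime's code: the `assist` type and `flushCredit` function are hypothetical, FIFO order is assumed, and partially paying the first assist that cannot be fully satisfied is also an assumption the message does not spell out.

```go
package main

import "fmt"

// assist models a parked allocating goroutine with an outstanding debt.
type assist struct {
	id   int
	debt int64
}

// flushCredit pays queued debts in FIFO order from credit. It returns the IDs
// of fully satisfied assists (the ones to wake), the assists still waiting,
// and the credit left over for the background pool.
func flushCredit(queue []assist, credit int64) (woken []int, waiting []assist, leftover int64) {
	for i, a := range queue {
		if credit < a.debt {
			// Assumption: pay this assist partially and stop; it stays parked.
			a.debt -= credit
			waiting = append(waiting, a)
			waiting = append(waiting, queue[i+1:]...)
			return woken, waiting, 0
		}
		credit -= a.debt
		woken = append(woken, a.id)
	}
	return woken, waiting, credit
}

func main() {
	queue := []assist{{1, 10}, {2, 20}, {3, 30}}
	woken, waiting, leftover := flushCredit(queue, 35)
	fmt.Println(woken, waiting, leftover)
}
```

Compared with the 100 µs retry loop the commit replaces, nothing in this model polls: an assist is touched only when credit actually arrives.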
Austin Clements	0ca4488cc1	runtime: change p.runq from []*g to []guintptr This eliminates many write barriers in the scheduler code that are unnecessary and will interfere with upcoming changes where the garbage collector will have to invoke run queue functions in contexts that must not have write barriers. Change-Id: I702d0ac99cfd00ffff406e7362917db6a43e7e55 Reviewed-on: https://go-review.googlesource.com/16556 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2015-11-03 15:53:18 +00:00
Todd Neal	e3e0122ae2	test: use go:noinline consistently Replace various implementations of inlining prevention with "go:noinline" Change-Id: Iac90895c3a62d6f4b7a6c72e11e165d15a0abfa4 Reviewed-on: https://go-review.googlesource.com/16510 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Todd Neal <todd@tneal.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-03 02:01:34 +00:00
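For readers unfamiliar with the directive named in e3e0122ae2, a minimal sketch of its use follows. The function `add` is a hypothetical example, not code from the tests that commit touched. The directive is a comment placed immediately before the function declaration, with no space between `//` and `go:noinline`.

```go
package main

import "fmt"

// add is small enough that the compiler would normally inline it.
// The directive below asks the compiler not to.
//
//go:noinline
func add(a, b int) int {
	return a + b
}

func main() {
	fmt.Println(add(2, 3))
}
```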
Ilya Tocar	0e23ca41d9	bytes: speed up Compare() on amd64 Use AVX2 if available. Results (haswell), below: name old time/op new time/op delta BytesCompare1-6 11.4ns ± 0% 11.4ns ± 0% ~ (all samples are equal) BytesCompare2-6 11.4ns ± 0% 11.4ns ± 0% ~ (all samples are equal) BytesCompare4-6 11.4ns ± 0% 11.4ns ± 0% ~ (all samples are equal) BytesCompare8-6 9.29ns ± 2% 8.76ns ± 0% -5.72% (p=0.000 n=16+17) BytesCompare16-6 9.29ns ± 2% 9.20ns ± 0% -1.02% (p=0.000 n=20+16) BytesCompare32-6 11.4ns ± 1% 11.4ns ± 0% ~ (p=0.191 n=20+20) BytesCompare64-6 14.4ns ± 0% 13.1ns ± 0% -8.68% (p=0.000 n=20+20) BytesCompare128-6 20.2ns ± 0% 18.5ns ± 0% -8.27% (p=0.000 n=16+20) BytesCompare256-6 29.3ns ± 0% 24.5ns ± 0% -16.38% (p=0.000 n=16+16) BytesCompare512-6 46.8ns ± 0% 37.1ns ± 0% -20.78% (p=0.000 n=18+16) BytesCompare1024-6 82.9ns ± 0% 62.3ns ± 0% -24.86% (p=0.000 n=20+14) BytesCompare2048-6 155ns ± 0% 112ns ± 0% -27.74% (p=0.000 n=20+20) CompareBytesEqual-6 10.1ns ± 1% 10.0ns ± 1% ~ (p=0.527 n=20+20) CompareBytesToNil-6 10.0ns ± 2% 9.4ns ± 0% -6.57% (p=0.000 n=20+17) CompareBytesEmpty-6 8.76ns ± 0% 8.76ns ± 0% ~ (all samples are equal) CompareBytesIdentical-6 8.76ns ± 0% 8.76ns ± 0% ~ (all samples are equal) CompareBytesSameLength-6 10.6ns ± 1% 10.6ns ± 1% ~ (p=0.240 n=20+20) CompareBytesDifferentLength-6 10.6ns ± 0% 10.6ns ± 1% ~ (p=1.000 n=20+20) CompareBytesBigUnaligned-6 132µs ± 1% 105µs ± 1% -20.61% (p=0.000 n=20+18) CompareBytesBig-6 125µs ± 1% 105µs ± 1% -16.31% (p=0.000 n=20+20) CompareBytesBigIdentical-6 8.13ns ± 0% 8.13ns ± 0% ~ (all samples are equal) name old speed new speed delta CompareBytesBigUnaligned-6 7.94GB/s ± 1% 10.01GB/s ± 1% +25.96% (p=0.000 n=20+18) CompareBytesBig-6 8.38GB/s ± 1% 10.01GB/s ± 1% +19.48% (p=0.000 n=20+20) CompareBytesBigIdentical-6 129TB/s ± 0% 129TB/s ± 0% +0.01% (p=0.003 n=17+19) Change-Id: I820f31bab4582dd4204b146bb077c0d2f24cd8f5 Reviewed-on: https://go-review.googlesource.com/16434 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> 
Reviewed-by: Klaus Post <klauspost@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2015-11-02 18:39:38 +00:00
Michael Hudson-Doyle	35d71d6727	cmd/go, runtime: define GOBUILDMODE_shared rather than shared when dynamically linking To avoid collisions with what existing code may already be doing. Change-Id: Ice639440aafc0724714c25333d90a49954372230 Reviewed-on: https://go-review.googlesource.com/16503 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-11-01 19:52:33 +00:00
Austin Clements	fbf273250f	runtime: perform mark 2 root re-scanning in GC workers This moves another root scanning task out of the GC coordinator and parallelizes it on the GC workers. This has negligible effect on the go1 benchmarks and the garbage benchmark. name old time/op new time/op delta XBenchGarbage-12 5.24ms ± 1% 5.26ms ± 1% +0.30% (p=0.007 n=18+17) name old time/op new time/op delta BinaryTree17-12 3.20s ± 5% 3.21s ± 5% ~ (p=0.264 n=20+18) Fannkuch11-12 2.46s ± 1% 2.54s ± 2% +3.09% (p=0.000 n=18+20) FmtFprintfEmpty-12 49.9ns ± 4% 50.0ns ± 5% ~ (p=0.356 n=20+20) FmtFprintfString-12 170ns ± 1% 170ns ± 2% ~ (p=0.815 n=19+20) FmtFprintfInt-12 160ns ± 1% 159ns ± 1% -0.63% (p=0.003 n=18+19) FmtFprintfIntInt-12 270ns ± 1% 267ns ± 1% -1.00% (p=0.000 n=19+18) FmtFprintfPrefixedInt-12 238ns ± 1% 232ns ± 1% -2.28% (p=0.000 n=19+19) FmtFprintfFloat-12 310ns ± 2% 313ns ± 2% +0.93% (p=0.000 n=19+19) FmtManyArgs-12 1.06µs ± 1% 1.04µs ± 1% -1.93% (p=0.000 n=20+19) GobDecode-12 8.63ms ± 1% 8.70ms ± 1% +0.81% (p=0.001 n=20+19) GobEncode-12 6.52ms ± 1% 6.56ms ± 1% +0.66% (p=0.000 n=20+19) Gzip-12 318ms ± 1% 319ms ± 1% ~ (p=0.405 n=17+18) Gunzip-12 42.1ms ± 2% 42.0ms ± 1% ~ (p=0.771 n=20+19) HTTPClientServer-12 62.6µs ± 1% 62.9µs ± 1% +0.41% (p=0.038 n=20+20) JSONEncode-12 16.9ms ± 1% 16.9ms ± 1% ~ (p=0.077 n=18+20) JSONDecode-12 60.7ms ± 1% 62.3ms ± 1% +2.73% (p=0.000 n=20+20) Mandelbrot200-12 3.86ms ± 1% 3.85ms ± 1% ~ (p=0.084 n=19+20) GoParse-12 3.75ms ± 2% 3.73ms ± 1% ~ (p=0.107 n=20+19) RegexpMatchEasy0_32-12 100ns ± 2% 101ns ± 2% +0.97% (p=0.001 n=20+19) RegexpMatchEasy0_1K-12 342ns ± 2% 332ns ± 2% -2.86% (p=0.000 n=19+19) RegexpMatchEasy1_32-12 83.2ns ± 2% 82.8ns ± 2% ~ (p=0.108 n=19+20) RegexpMatchEasy1_1K-12 495ns ± 2% 490ns ± 2% -1.04% (p=0.000 n=18+19) RegexpMatchMedium_32-12 130ns ± 2% 131ns ± 2% ~ (p=0.291 n=20+20) RegexpMatchMedium_1K-12 39.3µs ± 1% 39.9µs ± 1% +1.54% (p=0.000 n=18+20) RegexpMatchHard_32-12 2.02µs ± 1% 2.05µs ± 2% +1.19% (p=0.000 n=19+19) 
RegexpMatchHard_1K-12 60.9µs ± 1% 61.5µs ± 1% +0.99% (p=0.000 n=18+18) Revcomp-12 535ms ± 1% 531ms ± 1% -0.82% (p=0.000 n=17+17) Template-12 73.0ms ± 1% 74.1ms ± 1% +1.47% (p=0.000 n=20+20) TimeParse-12 356ns ± 2% 348ns ± 1% -2.30% (p=0.000 n=20+20) TimeFormat-12 347ns ± 1% 353ns ± 1% +1.68% (p=0.000 n=19+20) [Geo mean] 62.3µs 62.4µs +0.12% name old speed new speed delta GobDecode-12 88.9MB/s ± 1% 88.2MB/s ± 1% -0.81% (p=0.001 n=20+19) GobEncode-12 118MB/s ± 1% 117MB/s ± 1% -0.66% (p=0.000 n=20+19) Gzip-12 60.9MB/s ± 1% 60.8MB/s ± 1% ~ (p=0.409 n=17+18) Gunzip-12 461MB/s ± 2% 462MB/s ± 1% ~ (p=0.765 n=20+19) JSONEncode-12 115MB/s ± 1% 115MB/s ± 1% ~ (p=0.078 n=18+20) JSONDecode-12 32.0MB/s ± 1% 31.1MB/s ± 1% -2.65% (p=0.000 n=20+20) GoParse-12 15.5MB/s ± 2% 15.5MB/s ± 1% ~ (p=0.111 n=20+19) RegexpMatchEasy0_32-12 318MB/s ± 2% 314MB/s ± 2% -1.27% (p=0.000 n=20+19) RegexpMatchEasy0_1K-12 2.99GB/s ± 1% 3.08GB/s ± 2% +2.94% (p=0.000 n=19+19) RegexpMatchEasy1_32-12 385MB/s ± 2% 386MB/s ± 2% ~ (p=0.105 n=19+20) RegexpMatchEasy1_1K-12 2.07GB/s ± 1% 2.09GB/s ± 2% +1.06% (p=0.000 n=18+19) RegexpMatchMedium_32-12 7.64MB/s ± 2% 7.61MB/s ± 1% ~ (p=0.179 n=20+20) RegexpMatchMedium_1K-12 26.1MB/s ± 1% 25.7MB/s ± 1% -1.52% (p=0.000 n=18+20) RegexpMatchHard_32-12 15.8MB/s ± 1% 15.6MB/s ± 2% -1.18% (p=0.000 n=19+19) RegexpMatchHard_1K-12 16.8MB/s ± 2% 16.6MB/s ± 1% -0.90% (p=0.000 n=19+18) Revcomp-12 475MB/s ± 1% 479MB/s ± 1% +0.83% (p=0.000 n=17+17) Template-12 26.6MB/s ± 1% 26.2MB/s ± 1% -1.45% (p=0.000 n=20+20) [Geo mean] 99.0MB/s 98.7MB/s -0.32% Change-Id: I6ea44d7a59aaa6851c64695277ab65645ff9d32e Reviewed-on: https://go-review.googlesource.com/16070 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2015-10-30 22:46:39 +00:00
Austin Clements	82d14d77da	runtime: perform concurrent scan in GC workers Currently the concurrent root scan is performed in its entirety by the GC coordinator before entering concurrent mark (which enables GC workers). This scan is done sequentially, which can prolong the scan phase, delay the mark phase, and means that the scan phase does not obey the 25% CPU goal. Furthermore, there's no need to complete the root scan before starting marking (in fact, we already allow GC assists to happen during the scan phase), so this acts as an unnecessary barrier between root scanning and marking. This change shifts the root scan work out of the GC coordinator and into the GC workers. The coordinator simply sets up the scan state and enqueues the right number of root scan jobs. The GC workers then drain the root scan jobs prior to draining heap scan jobs. This parallelizes the root scan process, makes it obey the 25% CPU goal, and effectively eliminates root scanning as an isolated phase, allowing the system to smoothly transition from root scanning to heap marking. This also eliminates a major non-STW responsibility of the GC coordinator, which will make it easier to switch to a decentralized state machine. Finally, it puts us in a good position to perform root scanning in assists as well, which will help satisfy assists at the beginning of the GC cycle. This is mostly straightforward. One tricky aspect is that we have to deal with preemption deadlock: where two non-preemptible goroutines are trying to preempt each other to perform a stack scan. Given the context where this happens, the only instance of this is two background workers trying to scan each other. We avoid this by simply not scanning the stacks of background workers during the concurrent phase; this is safe because we'll scan them during mark termination (and their stacks are very small and should not contain any new pointers). 
This change also switches the root marking during mark termination to use the same gcDrain-based code path as concurrent mark. This shouldn't affect performance because STW root marking was already parallel and tasks switched to heap marking immediately when no more root marking tasks were available. However, it simplifies the code and unifies these code paths. This has negligible effect on the go1 benchmarks. It slightly slows down the garbage benchmark, possibly by making GC run slightly more frequently. name old time/op new time/op delta XBenchGarbage-12 5.10ms ± 1% 5.24ms ± 1% +2.87% (p=0.000 n=18+18) name old time/op new time/op delta BinaryTree17-12 3.25s ± 3% 3.20s ± 5% -1.57% (p=0.013 n=20+20) Fannkuch11-12 2.45s ± 1% 2.46s ± 1% +0.38% (p=0.019 n=20+18) FmtFprintfEmpty-12 49.7ns ± 3% 49.9ns ± 4% ~ (p=0.851 n=19+20) FmtFprintfString-12 170ns ± 2% 170ns ± 1% ~ (p=0.775 n=20+19) FmtFprintfInt-12 161ns ± 1% 160ns ± 1% -0.78% (p=0.000 n=19+18) FmtFprintfIntInt-12 267ns ± 1% 270ns ± 1% +1.04% (p=0.000 n=19+19) FmtFprintfPrefixedInt-12 238ns ± 2% 238ns ± 1% ~ (p=0.133 n=18+19) FmtFprintfFloat-12 311ns ± 1% 310ns ± 2% -0.35% (p=0.023 n=20+19) FmtManyArgs-12 1.08µs ± 1% 1.06µs ± 1% -2.31% (p=0.000 n=20+20) GobDecode-12 8.65ms ± 1% 8.63ms ± 1% ~ (p=0.377 n=18+20) GobEncode-12 6.49ms ± 1% 6.52ms ± 1% +0.37% (p=0.015 n=20+20) Gzip-12 319ms ± 3% 318ms ± 1% ~ (p=0.975 n=19+17) Gunzip-12 41.9ms ± 1% 42.1ms ± 2% +0.65% (p=0.004 n=19+20) HTTPClientServer-12 61.7µs ± 1% 62.6µs ± 1% +1.40% (p=0.000 n=18+20) JSONEncode-12 16.8ms ± 1% 16.9ms ± 1% ~ (p=0.239 n=20+18) JSONDecode-12 58.4ms ± 1% 60.7ms ± 1% +3.85% (p=0.000 n=19+20) Mandelbrot200-12 3.86ms ± 0% 3.86ms ± 1% ~ (p=0.092 n=18+19) GoParse-12 3.75ms ± 2% 3.75ms ± 2% ~ (p=0.708 n=19+20) RegexpMatchEasy0_32-12 100ns ± 1% 100ns ± 2% +0.60% (p=0.010 n=17+20) RegexpMatchEasy0_1K-12 341ns ± 1% 342ns ± 2% ~ (p=0.203 n=20+19) RegexpMatchEasy1_32-12 82.5ns ± 2% 83.2ns ± 2% +0.83% (p=0.007 n=19+19) RegexpMatchEasy1_1K-12 495ns ± 1% 
495ns ± 2% ~ (p=0.970 n=19+18) RegexpMatchMedium_32-12 130ns ± 2% 130ns ± 2% +0.59% (p=0.039 n=19+20) RegexpMatchMedium_1K-12 39.2µs ± 1% 39.3µs ± 1% ~ (p=0.214 n=18+18) RegexpMatchHard_32-12 2.03µs ± 2% 2.02µs ± 1% ~ (p=0.166 n=18+19) RegexpMatchHard_1K-12 61.0µs ± 1% 60.9µs ± 1% ~ (p=0.169 n=20+18) Revcomp-12 533ms ± 1% 535ms ± 1% ~ (p=0.071 n=19+17) Template-12 68.1ms ± 2% 73.0ms ± 1% +7.26% (p=0.000 n=19+20) TimeParse-12 355ns ± 2% 356ns ± 2% ~ (p=0.530 n=19+20) TimeFormat-12 357ns ± 2% 347ns ± 1% -2.59% (p=0.000 n=20+19) [Geo mean] 62.1µs 62.3µs +0.31% name old speed new speed delta GobDecode-12 88.7MB/s ± 1% 88.9MB/s ± 1% ~ (p=0.377 n=18+20) GobEncode-12 118MB/s ± 1% 118MB/s ± 1% -0.37% (p=0.015 n=20+20) Gzip-12 60.9MB/s ± 3% 60.9MB/s ± 1% ~ (p=0.944 n=19+17) Gunzip-12 464MB/s ± 1% 461MB/s ± 2% -0.64% (p=0.004 n=19+20) JSONEncode-12 115MB/s ± 1% 115MB/s ± 1% ~ (p=0.236 n=20+18) JSONDecode-12 33.2MB/s ± 1% 32.0MB/s ± 1% -3.71% (p=0.000 n=19+20) GoParse-12 15.5MB/s ± 2% 15.5MB/s ± 2% ~ (p=0.702 n=19+20) RegexpMatchEasy0_32-12 320MB/s ± 1% 318MB/s ± 2% ~ (p=0.094 n=18+20) RegexpMatchEasy0_1K-12 3.00GB/s ± 1% 2.99GB/s ± 1% ~ (p=0.194 n=20+19) RegexpMatchEasy1_32-12 388MB/s ± 2% 385MB/s ± 2% -0.83% (p=0.008 n=19+19) RegexpMatchEasy1_1K-12 2.07GB/s ± 1% 2.07GB/s ± 1% ~ (p=0.964 n=19+18) RegexpMatchMedium_32-12 7.68MB/s ± 1% 7.64MB/s ± 2% -0.57% (p=0.020 n=19+20) RegexpMatchMedium_1K-12 26.1MB/s ± 1% 26.1MB/s ± 1% ~ (p=0.211 n=18+18) RegexpMatchHard_32-12 15.8MB/s ± 1% 15.8MB/s ± 1% ~ (p=0.180 n=18+19) RegexpMatchHard_1K-12 16.8MB/s ± 1% 16.8MB/s ± 2% ~ (p=0.236 n=20+19) Revcomp-12 477MB/s ± 1% 475MB/s ± 1% ~ (p=0.071 n=19+17) Template-12 28.5MB/s ± 2% 26.6MB/s ± 1% -6.77% (p=0.000 n=19+20) [Geo mean] 100MB/s 99.0MB/s -0.82% Change-Id: I875bf6ceb306d1ee2f470cabf88aa6ede27c47a0 Reviewed-on: https://go-review.googlesource.com/16059 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	
2015-10-30 22:46:31 +00:00
Austin Clements	4cca1cc05e	runtime: consolidate "out of GC work" checks We already have gcMarkWorkAvailable, but the check for GC mark work is open-coded in several places. Generalize gcMarkWorkAvailable slightly and replace these open-coded checks with calls to gcMarkWorkAvailable. In addition to cleaning up the code, this puts us in a better position to make this check slightly more complicated. Change-Id: I1b29883300ecd82a1bf6be193e9b4ee96582a860 Reviewed-on: https://go-review.googlesource.com/16058 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-30 22:46:22 +00:00
Russ Cox	bf1de1b141	runtime: introduce GOTRACEBACK=single, now the default Abandon (but still support) the old numbering system. GOTRACEBACK=none is old 0 GOTRACEBACK=single is the new behavior GOTRACEBACK=all is old 1 GOTRACEBACK=system is old 2 GOTRACEBACK=crash is unchanged See doc comment change in runtime1.go for details. Filed #13107 to decide whether to change default back to GOTRACEBACK=all for Go 1.6 release. If you run into programs where printing only the current goroutine omits needed information, please add details in a comment on that issue. Fixes #12366. Change-Id: I82ca8b99b5d86dceb3f7102d38d2659d45dbe0db Reviewed-on: https://go-review.googlesource.com/16512 Reviewed-by: Austin Clements <austin@google.com>	2015-10-30 18:43:44 +00:00
Michael Hudson-Doyle	c9b8cab16c	cmd/internal/obj, cmd/link, runtime: handle TLS more like a platform linker on ppc64 On ppc64x, the thread pointer, held in R13, points 0x7000 bytes past where thread-local storage begins (presumably to maximize the amount of storage that can be accessed with a 16-bit signed displacement). The relocations used to indicate thread-local storage to the platform linker account for this, so to be able to support external linking we need to change things so the linker applies this offset instead of the runtime assembly. Change-Id: I2556c249ab2d802cae62c44b2b4c5b44787d7059 Reviewed-on: https://go-review.googlesource.com/14233 Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2015-10-29 22:24:29 +00:00
Michael Hudson-Doyle	8537ff8a39	runtime/cgo: export _cgo_reginit on ppc64x This is needed to make external linking work. Change-Id: I4cf7edb4ea318849cab92a697952f8745eed40c4 Reviewed-on: https://go-review.googlesource.com/14237 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-29 00:38:43 +00:00
Ian Lance Taylor	f6fd086d5e	runtime: add missing word in comment Change-Id: Iffe27445e35ec071cf0920a05c81b8a97a3ed712 Reviewed-on: https://go-review.googlesource.com/16431 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-28 23:09:44 +00:00
David Crawshaw	73ff7cb1ed	runtime: c-shared entrypoint for linux/arm64 Change-Id: I7dab124842f5209097a8d5a802fcbdde650654fa Reviewed-on: https://go-review.googlesource.com/16395 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-28 21:21:33 +00:00
Hyang-Ah Hana Kim	dfc8649854	runtime, cmd: TLS setup for android/amd64. Android linker does not handle TLS for us. We set up the TLS slot for g, as darwin/386,amd64 handle instead. This is disgusting and fragile. We will eventually fix this ugly hack by taking advantage of the recent TLS IE model implementation. (Instead of referencing an GOT entry, make the code sequence look into the TLS variable that holds the offset.) The TLS slot for g in android/amd64 assumes a fixed offset from %fs. See runtime/cgo/gcc_android_amd64.c for details. For golang/go#10743 Change-Id: I1a3fc207946c665515f79026a56ea19134ede2dd Reviewed-on: https://go-review.googlesource.com/15991 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-10-28 20:54:28 +00:00
Michael Hudson-Doyle	72180c3b82	cmd/internal/obj, cmd/link, runtime: native-ish support for tls on arm64 Fixes #10560 Change-Id: Iedffd9c236c4fbb386c3afc52c5a1457f96ef122 Reviewed-on: https://go-review.googlesource.com/13991 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-10-28 19:51:05 +00:00
David du Colombier	31430bda09	runtime: don't use FP when calling nextSample in the Plan 9 sighandler In the Go signal handler on Plan 9, when a signal with the _SigThrow flag is received, we call startpanic before printing the stack trace. The startpanic function calls systemstack which calls startpanic_m. In the startpanic_m function, we call allocmcache to allocate _g_.m.mcache. The problem is that allocmcache calls nextSample, which does a floating point operation to return a sampling point for heap profiling. However, Plan 9 doesn't support floating point in the signal handler. This change adds a new function nextSampleNoFP, only called when in the Plan 9 signal handler, which is similar to nextSample, but avoids floating point. Change-Id: Iaa30437aa0f7c8c84d40afbab7567ad3bd5ea2de Reviewed-on: https://go-review.googlesource.com/16307 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-28 05:45:24 +00:00
Michael Hudson-Doyle	bc3f14fd2a	runtime: invoke vsyscall helper via TCB when dynamic linking on linux/386 The dynamic linker on linux/386 stores the address of the vsyscall helper at a fixed offset from the %gs register on linux/386 for easy access from PIC code. Change-Id: I635305cfecceef2289985d62e676e16810ed6b94 Reviewed-on: https://go-review.googlesource.com/16346 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-28 01:36:25 +00:00
Matthew Dempsky	4ff231bca1	runtime: eliminate some unnecessary uintptr conversions arena_{start,used,end} are already uintptr, so no need to convert them to uintptr, much less to convert them to unsafe.Pointer and then to uintptr. No binary change to pkg/linux_amd64/runtime.a. Change-Id: Ia4232ed2a724c44fde7eba403c5fe8e6dccaa879 Reviewed-on: https://go-review.googlesource.com/16339 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>	2015-10-27 02:53:04 +00:00
David du Colombier	d093bf489b	runtime: handle abort note on Plan 9 Implement an abort note on Plan 9, as an equivalent of the SIGABRT signal on other operating systems. Updates #11975. Change-Id: I010c9b10f2fbd2471aacd1d073368d975a2f0592 Reviewed-on: https://go-review.googlesource.com/16300 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: David du Colombier <0intro@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-26 22:12:30 +00:00
Matthew Dempsky	d18167fefe	runtime: fix tiny allocator When a new tiny block is allocated because we're allocating an object that won't fit into the current block, mallocgc saves the new block if it has more space leftover than the old block. However, the logic for this was subtly broken in golang.org/cl/2814, resulting in never saving (or consequently reusing) a tiny block. Change-Id: Ib5f6769451fb82877ddeefe75dfe79ed4a04fd40 Reviewed-on: https://go-review.googlesource.com/16330 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-26 21:14:15 +00:00
Austin Clements	d3df04cd8c	runtime: partition data and BSS root marking Currently data and BSS root marking are each a single markroot job. This makes them difficult to load balance, which can draw out mark termination time if they are large. Fix this by splitting both in to 256K chunks. While we're putting in the infrastructure for dynamic roots, we also replace the fixed sharding of the span roots with sharding in to fixed sizes. In addition to helping balance root marking, this also paves the way to parallelizing concurrent scan and to letting assists help with root marking. Updates #10345. This fixes the data and BSS aspects of that bug; it does not partition scanning of large heap objects. This has negligible effect on either the go1 benchmarks or the garbage benchmark: name old time/op new time/op delta XBenchGarbage-12 4.90ms ± 1% 4.91ms ± 2% ~ (p=0.058 n=17+16) name old time/op new time/op delta BinaryTree17-12 3.11s ± 4% 3.12s ± 4% ~ (p=0.512 n=20+20) Fannkuch11-12 2.53s ± 2% 2.47s ± 2% -2.28% (p=0.000 n=20+18) FmtFprintfEmpty-12 49.1ns ± 1% 50.0ns ± 4% +1.68% (p=0.008 n=18+20) FmtFprintfString-12 170ns ± 0% 172ns ± 1% +1.05% (p=0.000 n=14+19) FmtFprintfInt-12 174ns ± 1% 162ns ± 1% -6.81% (p=0.000 n=18+17) FmtFprintfIntInt-12 284ns ± 1% 277ns ± 1% -2.42% (p=0.000 n=20+19) FmtFprintfPrefixedInt-12 252ns ± 1% 244ns ± 1% -2.84% (p=0.000 n=18+20) FmtFprintfFloat-12 317ns ± 0% 311ns ± 0% -1.95% (p=0.000 n=19+18) FmtManyArgs-12 1.08µs ± 1% 1.11µs ± 1% +3.43% (p=0.000 n=18+19) GobDecode-12 8.56ms ± 1% 8.61ms ± 1% +0.50% (p=0.020 n=20+20) GobEncode-12 6.58ms ± 1% 6.57ms ± 1% ~ (p=0.792 n=20+19) Gzip-12 317ms ± 3% 317ms ± 2% ~ (p=0.840 n=19+19) Gunzip-12 41.6ms ± 0% 41.6ms ± 0% +0.07% (p=0.027 n=18+15) HTTPClientServer-12 62.2µs ± 1% 62.3µs ± 1% ~ (p=0.283 n=19+20) JSONEncode-12 16.5ms ± 2% 16.5ms ± 1% ~ (p=0.857 n=20+19) JSONDecode-12 58.5ms ± 1% 61.3ms ± 1% +4.67% (p=0.000 n=18+17) Mandelbrot200-12 3.84ms ± 0% 3.84ms ± 0% ~ (p=0.259 n=17+17) GoParse-12 3.70ms ± 
2% 3.74ms ± 2% +0.96% (p=0.009 n=19+20) RegexpMatchEasy0_32-12 100ns ± 1% 100ns ± 0% +0.31% (p=0.040 n=19+15) RegexpMatchEasy0_1K-12 340ns ± 1% 340ns ± 1% ~ (p=0.411 n=17+19) RegexpMatchEasy1_32-12 82.7ns ± 2% 82.3ns ± 1% ~ (p=0.456 n=20+19) RegexpMatchEasy1_1K-12 498ns ± 2% 495ns ± 0% ~ (p=0.108 n=19+17) RegexpMatchMedium_32-12 130ns ± 1% 130ns ± 2% ~ (p=0.405 n=18+19) RegexpMatchMedium_1K-12 39.4µs ± 2% 39.1µs ± 1% -0.64% (p=0.002 n=20+19) RegexpMatchHard_32-12 2.03µs ± 2% 2.02µs ± 0% ~ (p=0.561 n=20+17) RegexpMatchHard_1K-12 61.1µs ± 2% 60.8µs ± 1% ~ (p=0.615 n=19+18) Revcomp-12 532ms ± 2% 531ms ± 1% ~ (p=0.470 n=19+19) Template-12 68.5ms ± 1% 69.1ms ± 1% +0.87% (p=0.000 n=17+17) TimeParse-12 344ns ± 2% 344ns ± 1% +0.25% (p=0.032 n=19+18) TimeFormat-12 347ns ± 1% 362ns ± 1% +4.27% (p=0.000 n=17+19) [Geo mean] 62.3µs 62.3µs -0.04% name old speed new speed delta GobDecode-12 89.6MB/s ± 1% 89.2MB/s ± 1% -0.50% (p=0.019 n=20+20) GobEncode-12 117MB/s ± 1% 117MB/s ± 1% ~ (p=0.797 n=20+19) Gzip-12 61.3MB/s ± 3% 61.2MB/s ± 2% ~ (p=0.834 n=19+19) Gunzip-12 467MB/s ± 0% 466MB/s ± 0% -0.07% (p=0.027 n=18+15) JSONEncode-12 117MB/s ± 2% 117MB/s ± 1% ~ (p=0.851 n=20+19) JSONDecode-12 33.2MB/s ± 1% 31.7MB/s ± 1% -4.47% (p=0.000 n=18+17) GoParse-12 15.6MB/s ± 2% 15.5MB/s ± 2% -0.95% (p=0.008 n=19+20) RegexpMatchEasy0_32-12 321MB/s ± 2% 320MB/s ± 1% -0.57% (p=0.002 n=17+17) RegexpMatchEasy0_1K-12 3.01GB/s ± 1% 3.01GB/s ± 1% ~ (p=0.132 n=17+18) RegexpMatchEasy1_32-12 387MB/s ± 2% 389MB/s ± 1% ~ (p=0.423 n=20+19) RegexpMatchEasy1_1K-12 2.05GB/s ± 2% 2.06GB/s ± 0% ~ (p=0.129 n=19+17) RegexpMatchMedium_32-12 7.64MB/s ± 1% 7.66MB/s ± 1% ~ (p=0.258 n=18+19) RegexpMatchMedium_1K-12 26.0MB/s ± 2% 26.2MB/s ± 1% +0.64% (p=0.002 n=20+19) RegexpMatchHard_32-12 15.7MB/s ± 2% 15.8MB/s ± 1% ~ (p=0.510 n=20+17) RegexpMatchHard_1K-12 16.8MB/s ± 2% 16.8MB/s ± 1% ~ (p=0.603 n=19+18) Revcomp-12 477MB/s ± 2% 479MB/s ± 1% ~ (p=0.470 n=19+19) Template-12 28.3MB/s ± 1% 28.1MB/s ± 1% -0.85% (p=0.000 
n=17+17) [Geo mean] 100MB/s 100MB/s -0.26% Change-Id: Ib0bfe0145675ce88c5a8791752f7486ac98805b4 Reviewed-on: https://go-review.googlesource.com/16043 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-26 15:42:44 +00:00
David Crawshaw	21f35b33c2	runtime: use a 64kb system stack on arm I went looking for an arm system whose stacks are by default smaller than 64KB. In fact the smallest common linux target I could find was Android, which like iOS uses 1MB stacks. Fixes #11873 Change-Id: Ieeb66ad095b3da18d47ba21360ea75152a4107c6 Reviewed-on: https://go-review.googlesource.com/14602 Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com> Reviewed-by: Minux Ma <minux@golang.org>	2015-10-26 15:10:34 +00:00
Caleb Spare	fb7178e7cc	runtime: copy sqrt normalization bugfix from math This copies the change from CL 16158 (applied as `22d4c8bf13`). Updates #13013 Change-Id: Id7d02e63d92806f06a4e064a91b2fb6574fe385f Reviewed-on: https://go-review.googlesource.com/16291 Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-23 23:43:47 +00:00
Matthew Dempsky	8ee0fd8623	runtime: replace is{plan9,solaris,windows} with GOOS tests Change-Id: I27589395f547c5837dc7536a0ab5bc7cc23a4ff6 Reviewed-on: https://go-review.googlesource.com/10872 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-23 18:11:17 +00:00
Alex Brainman	6410e67a1e	runtime: account for cpu affinity in windows NumCPU Fixes #11671 Change-Id: Ide1f8d92637dad2a2faed391329f9b6001789b76 Reviewed-on: https://go-review.googlesource.com/14742 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Alex Brainman <alex.brainman@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-23 07:54:42 +00:00
Austin Clements	beedb1ec33	runtime: add pcvalue cache to improve stack scan speed The cost of scanning large stacks is currently dominated by the time spent looking up and decoding the pcvalue table. However, large stacks are usually large not because they contain calls to many different functions, but because they contain many calls to the same, small set of recursive functions. Hence, walking large stacks tends to make the same pcvalue queries many times. Based on this observation, this commit adds a small, very simple, and fast cache in front of pcvalue lookup. We thread this cache down from operations that make many pcvalue calls, such as gentraceback, stack scanning, and stack adjusting. This simple cache works well because it has minimal overhead when it's not effective. I also tried a hashed direct-map cache, CLOCK-based replacement, round-robin replacement, and round-robin with lookups disabled until there had been at least 16 probes, but none of these approaches had obvious wins over the random replacement policy in this commit. 
This nearly doubles the overall performance of the deep stack test program from issue #10898: name old time/op new time/op delta Issue10898 16.5s ±12% 9.2s ±12% -44.37% (p=0.008 n=5+5) It's a very slight win on the garbage benchmark: name old time/op new time/op delta XBenchGarbage-12 4.92ms ± 1% 4.89ms ± 1% -0.75% (p=0.000 n=18+19) It's a wash (but doesn't harm performance) on the go1 benchmarks, which don't have particularly deep stacks: name old time/op new time/op delta BinaryTree17-12 3.11s ± 2% 3.20s ± 3% +2.83% (p=0.000 n=17+20) Fannkuch11-12 2.51s ± 1% 2.51s ± 1% -0.22% (p=0.034 n=19+18) FmtFprintfEmpty-12 50.8ns ± 3% 50.6ns ± 2% ~ (p=0.793 n=20+20) FmtFprintfString-12 174ns ± 0% 174ns ± 1% +0.17% (p=0.048 n=15+20) FmtFprintfInt-12 177ns ± 0% 165ns ± 1% -6.99% (p=0.000 n=17+19) FmtFprintfIntInt-12 283ns ± 1% 284ns ± 0% +0.22% (p=0.000 n=18+15) FmtFprintfPrefixedInt-12 243ns ± 1% 244ns ± 1% +0.40% (p=0.000 n=20+19) FmtFprintfFloat-12 318ns ± 0% 319ns ± 0% +0.27% (p=0.001 n=19+20) FmtManyArgs-12 1.12µs ± 0% 1.14µs ± 0% +1.74% (p=0.000 n=19+20) GobDecode-12 8.69ms ± 0% 8.73ms ± 1% +0.46% (p=0.000 n=18+18) GobEncode-12 6.64ms ± 1% 6.61ms ± 1% -0.46% (p=0.000 n=20+20) Gzip-12 323ms ± 2% 319ms ± 1% -1.11% (p=0.000 n=20+20) Gunzip-12 42.8ms ± 0% 42.9ms ± 0% ~ (p=0.158 n=18+20) HTTPClientServer-12 63.3µs ± 1% 63.1µs ± 1% -0.35% (p=0.011 n=20+20) JSONEncode-12 16.9ms ± 1% 17.3ms ± 1% +2.84% (p=0.000 n=19+20) JSONDecode-12 59.7ms ± 0% 58.5ms ± 0% -2.05% (p=0.000 n=19+17) Mandelbrot200-12 3.92ms ± 0% 3.91ms ± 0% -0.16% (p=0.003 n=19+19) GoParse-12 3.79ms ± 2% 3.75ms ± 2% -0.91% (p=0.005 n=20+20) RegexpMatchEasy0_32-12 102ns ± 1% 101ns ± 1% -0.80% (p=0.001 n=14+20) RegexpMatchEasy0_1K-12 337ns ± 1% 346ns ± 1% +2.90% (p=0.000 n=20+19) RegexpMatchEasy1_32-12 84.4ns ± 2% 84.3ns ± 2% ~ (p=0.743 n=20+20) RegexpMatchEasy1_1K-12 502ns ± 1% 505ns ± 0% +0.64% (p=0.000 n=20+20) RegexpMatchMedium_32-12 133ns ± 1% 132ns ± 1% -0.85% (p=0.000 n=20+19) RegexpMatchMedium_1K-12 40.1µs 
± 1% 39.8µs ± 1% -0.77% (p=0.000 n=18+18) RegexpMatchHard_32-12 2.08µs ± 1% 2.07µs ± 1% -0.55% (p=0.001 n=18+19) RegexpMatchHard_1K-12 62.4µs ± 1% 62.0µs ± 1% -0.74% (p=0.000 n=19+19) Revcomp-12 545ms ± 2% 545ms ± 3% ~ (p=0.771 n=19+20) Template-12 73.7ms ± 1% 72.0ms ± 0% -2.33% (p=0.000 n=20+18) TimeParse-12 358ns ± 1% 351ns ± 1% -2.07% (p=0.000 n=20+20) TimeFormat-12 369ns ± 1% 356ns ± 0% -3.53% (p=0.000 n=20+18) [Geo mean] 63.5µs 63.2µs -0.41% name old speed new speed delta GobDecode-12 88.3MB/s ± 0% 87.9MB/s ± 0% -0.43% (p=0.000 n=18+17) GobEncode-12 116MB/s ± 1% 116MB/s ± 1% +0.47% (p=0.000 n=20+20) Gzip-12 60.2MB/s ± 2% 60.8MB/s ± 1% +1.13% (p=0.000 n=20+20) Gunzip-12 453MB/s ± 0% 453MB/s ± 0% ~ (p=0.160 n=18+20) JSONEncode-12 115MB/s ± 1% 112MB/s ± 1% -2.76% (p=0.000 n=19+20) JSONDecode-12 32.5MB/s ± 0% 33.2MB/s ± 0% +2.09% (p=0.000 n=19+17) GoParse-12 15.3MB/s ± 2% 15.4MB/s ± 2% +0.92% (p=0.004 n=20+20) RegexpMatchEasy0_32-12 311MB/s ± 1% 314MB/s ± 1% +0.78% (p=0.000 n=15+19) RegexpMatchEasy0_1K-12 3.04GB/s ± 1% 2.95GB/s ± 1% -2.90% (p=0.000 n=19+19) RegexpMatchEasy1_32-12 379MB/s ± 2% 380MB/s ± 2% ~ (p=0.779 n=20+20) RegexpMatchEasy1_1K-12 2.04GB/s ± 1% 2.02GB/s ± 0% -0.62% (p=0.000 n=20+20) RegexpMatchMedium_32-12 7.46MB/s ± 1% 7.53MB/s ± 1% +0.86% (p=0.000 n=20+19) RegexpMatchMedium_1K-12 25.5MB/s ± 1% 25.7MB/s ± 1% +0.78% (p=0.000 n=18+18) RegexpMatchHard_32-12 15.4MB/s ± 1% 15.5MB/s ± 1% +0.62% (p=0.000 n=19+19) RegexpMatchHard_1K-12 16.4MB/s ± 1% 16.5MB/s ± 1% +0.82% (p=0.000 n=20+19) Revcomp-12 466MB/s ± 2% 466MB/s ± 3% ~ (p=0.765 n=19+20) Template-12 26.3MB/s ± 1% 27.0MB/s ± 0% +2.38% (p=0.000 n=20+18) [Geo mean] 97.8MB/s 98.0MB/s +0.23% Change-Id: I281044ae0b24990ba46487cacbc1069493274bc4 Reviewed-on: https://go-review.googlesource.com/13614 Reviewed-by: Keith Randall <khr@golang.org>	2015-10-22 17:48:13 +00:00
Matthew Dempsky	1652a2c316	runtime: add mSpanList type to represent lists of mspans This CL introduces a new mSpanList type to replace the empty mspan variables that were previously used as list heads. To be type safe, the previous circular linked list data structure is now a tail queue instead. One complication of this is mSpanList_Remove needs to know the list a span is being removed from, but this appears to be computable in all circumstances. As a temporary sanity check, mSpanList_Insert and mSpanList_InsertBack record the list that an mspan has been inserted into so that mSpanList_Remove can verify that the correct list was specified. Whereas mspan is 112 bytes on amd64, mSpanList is only 16 bytes. This shrinks the size of mheap from 50216 bytes to 12584 bytes. Change-Id: I8146364753dbc3b4ab120afbb9c7b8740653c216 Reviewed-on: https://go-review.googlesource.com/15906 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Austin Clements <austin@google.com>	2015-10-22 17:12:06 +00:00
Aaron Jacobs	151f4ec95d	runtime: remove unused printpc and printbyte functions Change-Id: I40e338f6b445ca72055fc9bac0f09f0dca904e3a Reviewed-on: https://go-review.googlesource.com/16191 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-22 15:02:44 +00:00
Matthew Dempsky	5a68eb9f25	runtime: prune some dead variables Change-Id: I7a1c3079b433c4e30d72fb7d59f9594e0d5efe47 Reviewed-on: https://go-review.googlesource.com/16178 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Andrew Gerrand <adg@golang.org>	2015-10-22 03:56:19 +00:00
Matthew Dempsky	29330c118d	runtime: change fixalloc's chunk field to unsafe.Pointer It's never used as a *byte anyway, so might as well just make it an unsafe.Pointer instead. Change-Id: I68ee418781ab2fc574eeac0498f2515b5561b7a8 Reviewed-on: https://go-review.googlesource.com/16175 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-22 01:14:23 +00:00
Shenghou Ma	1948aef6e3	runtime: fix typos Change-Id: Iffc25fc80452baf090bf8ef15ab798cfaa120b8e Reviewed-on: https://go-review.googlesource.com/16154 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-22 00:40:48 +00:00
Matthew Dempsky	58e3ae2fae	runtime: split plan9 and solaris's m fields into new embedded mOS type Reduces the size of m by ~8% on linux/amd64 (1040 bytes -> 960 bytes). There are also windows-specific fields, but they're currently referenced in OS-independent source files (but only when GOOS=="windows"). Change-Id: I13e1471ff585ccced1271f74209f8ed6df14c202 Reviewed-on: https://go-review.googlesource.com/16173 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-22 00:04:52 +00:00
Matthew Dempsky	7df8ba136c	runtime: replace unsafe pointer arithmetic with array indexing Change-Id: I313819abebd4cda4a6c30fd0fd6f44cb1d09161f Reviewed-on: https://go-review.googlesource.com/16167 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-21 23:22:20 +00:00
Matthew Dempsky	84afa1be76	runtime: make iface/eface handling more type safe Change compiler-invoked interface functions to directly take iface/eface parameters instead of fInterface/interface{} to avoid needing to always convert. For the handful of functions that legitimately need to take an interface{} parameter, add efaceOf to type-safely convert interface{} to eface. Change-Id: I8928761a12fd3c771394f36adf93d3006a9fcf39 Reviewed-on: https://go-review.googlesource.com/16166 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-21 23:08:22 +00:00
Ian Lance Taylor	73f329f472	runtime, syscall: add calls to msan functions Add explicit memory sanitizer instrumentation to the runtime and syscall packages. The compiler does not instrument the runtime package. It does instrument the syscall package, but we need to add a couple of cases that it can't see. Change-Id: I2d66073f713fe67e33a6720460d2bb8f72f31394 Reviewed-on: https://go-review.googlesource.com/16164 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-10-21 19:17:46 +00:00
Matthew Dempsky	c279250946	runtime: change functype's in and out fields to []_type Allows removing a few gratuitous unsafe.Pointer conversions and parallels the type of reflect.funcType's in and out fields ([]rtype). Change-Id: Ie5ca230a94407301a854dfd8782a3180d5054bc4 Reviewed-on: https://go-review.googlesource.com/16163 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-21 18:37:45 +00:00
Ian Lance Taylor	5174df9087	runtime, runtime/msan: add msan runtime support These are the runtime support functions for letting Go code interoperate with the C/C++ memory sanitizer. Calls to msanread/msanwrite are now inserted by the compiler with the -msan option. Calls to msanmalloc/msanfree will be from other runtime functions in a subsequent CL. Change-Id: I64fb061b38cc6519153face242eccd291c07d1f2 Reviewed-on: https://go-review.googlesource.com/16162 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-21 17:50:39 +00:00
Austin Clements	a42f668654	runtime: eliminate unused _GCstw phase Change-Id: Ie94cd17e1975fdaaa418fa6a7b2d3b164fedc135 Reviewed-on: https://go-review.googlesource.com/16057 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-21 16:26:34 +00:00
Austin Clements	28f458ce5b	runtime: eliminate unnecessary ragged barrier The ragged barrier after entering the concurrent mark phase is vestigial. This used to be the point where we enabled write barriers, so it was necessary to synchronize all Ps to ensure write barriers were enabled before any marking occurred. However, we've long since switched to enabling write barriers during the concurrent scan phase, so the start-the-world at the beginning of the concurrent scan phase ensures that all Ps have enabled the write barrier. Hence, we can eliminate the old "install write barrier" phase. Fixes #11971. Change-Id: I8cdcb84b5525cef19927d51ea11ba0a4db991ea8 Reviewed-on: https://go-review.googlesource.com/16044 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-10-21 16:26:25 +00:00
Matthew Dempsky	d4a7ea1b71	runtime: add stringStructOf helper function Instead of open-coding conversions from string to unsafe.Pointer then to stringStruct, add a helper function to add some type safety. Bonus: This caught two *string values being converted to stringStruct in heapdump.go. While here, get rid of the redundant _string type, but add in a stringStructDWARF type used for generating DWARF debug info. Change-Id: I8882f8cca66ac45190270f82019a5d85db023bd2 Reviewed-on: https://go-review.googlesource.com/16131 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-10-20 23:13:27 +00:00
Aaron Jacobs	ef986fa3fc	runtime: change odd 'print1_write' file names The '1' part is left over from the C conversion, but no longer makes sense given that print1.go no longer exists. Change-Id: Iec171251370d740f234afdbd6fb1a4009fde6696 Reviewed-on: https://go-review.googlesource.com/16036 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-10-20 23:03:06 +00:00
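Commit beedb1ec33 above describes a small cache placed in front of pcvalue lookup, with a random replacement policy. The following is only a minimal sketch of that general idea, not the runtime's code: the names `cacheEntry`, `smallCache`, `slowLookup`, and `lookup`, the 16-entry size, and the use of `math/rand` are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
)

// cacheEntry remembers the result of one lookup. The field names are
// illustrative and are not taken from the runtime.
type cacheEntry struct {
	key   uintptr
	val   int32
	valid bool
}

// smallCache is a fixed-size, linearly scanned cache with random replacement.
// The size of 16 is an assumption for this sketch.
type smallCache struct {
	entries [16]cacheEntry
}

// slowCalls counts calls to slowLookup so the effect of the cache is visible.
var slowCalls int

// slowLookup stands in for the expensive table decode (hypothetical).
func slowLookup(key uintptr) int32 {
	slowCalls++
	return int32(key % 97)
}

// lookup consults the cache first and falls back to slowLookup on a miss.
// A nil cache is allowed and simply disables caching.
func lookup(c *smallCache, key uintptr) int32 {
	if c != nil {
		for i := range c.entries {
			e := &c.entries[i]
			if e.valid && e.key == key {
				return e.val
			}
		}
	}
	v := slowLookup(key)
	if c != nil {
		// Random replacement: overwrite an arbitrary slot.
		c.entries[rand.Intn(len(c.entries))] = cacheEntry{key: key, val: v, valid: true}
	}
	return v
}

func main() {
	var c smallCache
	// A deep recursive stack repeats the same few program counters.
	pcs := []uintptr{0x1000, 0x1040, 0x1080}
	for i := 0; i < 1000; i++ {
		lookup(&c, pcs[i%len(pcs)])
	}
	fmt.Println("queries: 1000, slow lookups:", slowCalls)
}
```

Because replacement is random, a new entry can evict one that is still in use, so the number of slow lookups varies from run to run. When the working set is a handful of program counters, it should stay far below the number of queries.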

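Commit d3df04cd8c above splits data and BSS root marking into 256K chunks so the work can be load balanced. The arithmetic behind fixed-size sharding can be sketched roughly as follows. The helper names `numChunks` and `chunkBounds` and the 600K segment size are hypothetical; only the 256K chunk size comes from the commit message.

```go
package main

import "fmt"

// chunkBytes is the shard size named in the commit message (256K).
const chunkBytes = 256 << 10

// numChunks returns how many fixed-size jobs cover a region of the given size,
// rounding up so a trailing partial chunk gets its own job.
func numChunks(size uintptr) int {
	return int((size + chunkBytes - 1) / chunkBytes)
}

// chunkBounds returns the half-open byte range [lo, hi) scanned by job i.
// The last job is clamped to the end of the region.
func chunkBounds(size uintptr, i int) (lo, hi uintptr) {
	lo = uintptr(i) * chunkBytes
	hi = lo + chunkBytes
	if hi > size {
		hi = size
	}
	return lo, hi
}

func main() {
	const dataSize = 600 << 10 // hypothetical 600K data segment
	for i := 0; i < numChunks(dataSize); i++ {
		lo, hi := chunkBounds(dataSize, i)
		fmt.Printf("job %d scans [%d, %d)\n", i, lo, hi)
	}
}
```

Each job is independent, so idle workers can pick up whichever chunk is next instead of waiting on a single large job.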

1440 Commits