qbit/go - go - Tape:neT

qbit/go

mirror of https://github.com/golang/go synced 2024-11-20 04:24:51 -07:00

Author	SHA1	Message	Date
David Crawshaw	c165988360	cmd/compile, etc: use nameOff in uncommonType linux/amd64 PIE: cmd/go: -62KB (0.5%) jujud: -550KB (0.7%) For #6853. Change-Id: Ieb67982abce5832e24b997506f0ae7108f747108 Reviewed-on: https://go-review.googlesource.com/22371 Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-22 13:51:29 +00:00
David Crawshaw	1492e7db05	cmd/compile, etc: use nameOff for rtype string linux/amd64: cmd/go: -8KB (basically nothing) linux/amd64 PIE: cmd/go: -191KB (1.6%) jujud: -1.5MB (1.9%) Updates #6853 Fixes #15064 Change-Id: I0adbb95685e28be92e8548741df0e11daa0a9b5f Reviewed-on: https://go-review.googlesource.com/21777 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-22 10:08:05 +00:00
Austin Clements	c8bd293e56	runtime: eliminate floating garbage estimate Currently when we compute the trigger for the next GC, we do it based on an estimate of the reachable heap size at the start of the GC cycle, which is itself based on an estimate of the floating garbage. This was introduced by `4655aad` to fix a bad feedback loop that allowed the heap to grow to many times the true reachable size. However, this estimate gets easily confused by rapidly allocating applications, and, worse it's different than the heap size the trigger controller uses to compute the trigger itself. This results in the trigger controller often thinking that GC finished before it started. Since this would be a pretty great outcome from it's perspective, it sets the trigger for the next cycle as close to the next goal as possible (which is limited to 95% of the goal). Furthermore, the bad feedback loop this estimate originally fixed seems not to happen any more, suggesting it was fixed more correctly by some other change in the mean time. Finally, with the change to allocate black, it shouldn't even be theoretically possible for this bad feedback loop to occur. Hence, eliminate the floating garbage estimate and simply consider the reachable heap to be the marked heap. This harms overall throughput slightly for allocation-heavy benchmarks, but significantly improves mutator availability. Fixes #12204. This brings the average trigger in this benchmark from 0.95 (the cap) to 0.7 and the active GC utilization from ~90% to ~45%. Updates #14951. This makes the trigger controller much better behaved, so it pulls the trigger lower if assists are consuming a lot of CPU like it's supposed to, increasing mutator availability. name old time/op new time/op delta XBenchGarbage-12 2.21ms ± 1% 2.28ms ± 3% +3.29% (p=0.000 n=17+17) Some of this slow down we paid for in earlier commits. Relative to the start of the series to switch to allocate-black (the parent of "count black allocations toward scan work"), the garbage benchmark is 2.62% slower. name old time/op new time/op delta BinaryTree17-12 2.53s ± 3% 2.53s ± 3% ~ (p=0.708 n=20+19) Fannkuch11-12 2.08s ± 0% 2.08s ± 0% -0.22% (p=0.002 n=19+18) FmtFprintfEmpty-12 45.3ns ± 2% 45.2ns ± 3% ~ (p=0.505 n=20+20) FmtFprintfString-12 129ns ± 0% 131ns ± 2% +1.80% (p=0.000 n=16+19) FmtFprintfInt-12 121ns ± 2% 121ns ± 2% ~ (p=0.768 n=19+19) FmtFprintfIntInt-12 186ns ± 1% 188ns ± 3% +0.99% (p=0.000 n=19+19) FmtFprintfPrefixedInt-12 188ns ± 1% 188ns ± 1% ~ (p=0.947 n=18+16) FmtFprintfFloat-12 254ns ± 1% 255ns ± 1% +0.30% (p=0.002 n=19+17) FmtManyArgs-12 763ns ± 0% 770ns ± 0% +0.92% (p=0.000 n=18+18) GobDecode-12 7.00ms ± 1% 7.04ms ± 1% +0.61% (p=0.049 n=20+20) GobEncode-12 5.88ms ± 1% 5.88ms ± 0% ~ (p=0.641 n=18+19) Gzip-12 214ms ± 1% 215ms ± 1% +0.43% (p=0.002 n=18+19) Gunzip-12 37.6ms ± 0% 37.6ms ± 0% +0.11% (p=0.015 n=17+18) HTTPClientServer-12 76.9µs ± 2% 78.1µs ± 2% +1.44% (p=0.000 n=20+18) JSONEncode-12 15.2ms ± 2% 15.1ms ± 1% ~ (p=0.271 n=19+18) JSONDecode-12 53.1ms ± 1% 53.3ms ± 0% +0.49% (p=0.000 n=18+19) Mandelbrot200-12 4.04ms ± 1% 4.03ms ± 0% -0.33% (p=0.005 n=18+18) GoParse-12 3.29ms ± 1% 3.28ms ± 1% ~ (p=0.146 n=16+17) RegexpMatchEasy0_32-12 69.9ns ± 3% 69.5ns ± 1% ~ (p=0.785 n=20+19) RegexpMatchEasy0_1K-12 237ns ± 0% 237ns ± 0% ~ (p=1.000 n=18+18) RegexpMatchEasy1_32-12 69.5ns ± 1% 69.2ns ± 1% -0.44% (p=0.020 n=16+19) RegexpMatchEasy1_1K-12 372ns ± 1% 371ns ± 2% ~ (p=0.086 n=20+19) RegexpMatchMedium_32-12 108ns ± 3% 107ns ± 1% -1.00% (p=0.004 n=19+14) RegexpMatchMedium_1K-12 34.2µs ± 4% 34.0µs ± 2% ~ (p=0.380 n=19+20) RegexpMatchHard_32-12 1.77µs ± 4% 1.76µs ± 3% ~ (p=0.558 n=18+20) RegexpMatchHard_1K-12 53.4µs ± 4% 52.8µs ± 2% -1.10% (p=0.020 n=18+20) Revcomp-12 359ms ± 4% 377ms ± 0% +5.19% (p=0.000 n=20+18) Template-12 63.7ms ± 2% 62.9ms ± 2% -1.27% (p=0.005 n=18+20) TimeParse-12 316ns ± 2% 313ns ± 1% ~ (p=0.059 n=20+16) TimeFormat-12 329ns ± 0% 331ns ± 0% +0.39% (p=0.000 n=16+18) [Geo mean] 51.6µs 51.7µs +0.18% Change-Id: I1dce4640c8205d41717943b021039fffea863c57 Reviewed-on: https://go-review.googlesource.com/21324 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-21 20:07:25 +00:00
Austin Clements	6002e01e34	runtime: allocate black during GC Currently we allocate white for most of concurrent marking. This is based on the classical argument that it produces less floating garbage, since allocations during GC may not get linked into the heap and allocating white lets us reclaim these. However, it's not clear how often this actually happens, especially since our write barrier shades any pointer as soon as it's installed in the heap regardless of the color of the slot. On the other hand, allocating black has several advantages that seem to significantly outweigh this downside. 1) It naturally bounds the total scan work to the live heap size at the start of a GC cycle. Allocating white does not, and thus depends entirely on assists to prevent the heap from growing faster than it can be scanned. 2) It reduces the total amount of scan work per GC cycle by the size of newly allocated objects that are linked into the heap graph, since objects allocated black never need to be scanned. 3) It reduces total write barrier work since more objects will already be black when they are linked into the heap graph. This gives a slight overall improvement in benchmarks. name old time/op new time/op delta XBenchGarbage-12 2.24ms ± 0% 2.21ms ± 1% -1.32% (p=0.000 n=18+17) name old time/op new time/op delta BinaryTree17-12 2.60s ± 3% 2.53s ± 3% -2.56% (p=0.000 n=20+20) Fannkuch11-12 2.08s ± 1% 2.08s ± 0% ~ (p=0.452 n=19+19) FmtFprintfEmpty-12 45.1ns ± 2% 45.3ns ± 2% ~ (p=0.367 n=19+20) FmtFprintfString-12 131ns ± 3% 129ns ± 0% -1.60% (p=0.000 n=20+16) FmtFprintfInt-12 122ns ± 0% 121ns ± 2% -0.86% (p=0.000 n=16+19) FmtFprintfIntInt-12 187ns ± 1% 186ns ± 1% ~ (p=0.514 n=18+19) FmtFprintfPrefixedInt-12 189ns ± 0% 188ns ± 1% -0.54% (p=0.000 n=16+18) FmtFprintfFloat-12 256ns ± 0% 254ns ± 1% -0.43% (p=0.000 n=17+19) FmtManyArgs-12 769ns ± 0% 763ns ± 0% -0.72% (p=0.000 n=18+18) GobDecode-12 7.08ms ± 2% 7.00ms ± 1% -1.22% (p=0.000 n=20+20) GobEncode-12 5.88ms ± 0% 5.88ms ± 1% ~ (p=0.406 n=18+18) Gzip-12 214ms ± 0% 214ms ± 1% ~ (p=0.103 n=17+18) Gunzip-12 37.6ms ± 0% 37.6ms ± 0% ~ (p=0.563 n=17+17) HTTPClientServer-12 77.2µs ± 3% 76.9µs ± 2% ~ (p=0.606 n=20+20) JSONEncode-12 15.1ms ± 1% 15.2ms ± 2% ~ (p=0.138 n=19+19) JSONDecode-12 53.3ms ± 1% 53.1ms ± 1% -0.33% (p=0.000 n=19+18) Mandelbrot200-12 4.04ms ± 1% 4.04ms ± 1% ~ (p=0.075 n=19+18) GoParse-12 3.30ms ± 1% 3.29ms ± 1% -0.57% (p=0.000 n=18+16) RegexpMatchEasy0_32-12 69.5ns ± 1% 69.9ns ± 3% ~ (p=0.822 n=18+20) RegexpMatchEasy0_1K-12 237ns ± 1% 237ns ± 0% ~ (p=0.398 n=19+18) RegexpMatchEasy1_32-12 69.8ns ± 2% 69.5ns ± 1% ~ (p=0.090 n=20+16) RegexpMatchEasy1_1K-12 371ns ± 1% 372ns ± 1% ~ (p=0.178 n=19+20) RegexpMatchMedium_32-12 108ns ± 2% 108ns ± 3% ~ (p=0.124 n=20+19) RegexpMatchMedium_1K-12 33.9µs ± 2% 34.2µs ± 4% ~ (p=0.309 n=20+19) RegexpMatchHard_32-12 1.75µs ± 2% 1.77µs ± 4% +1.28% (p=0.018 n=19+18) RegexpMatchHard_1K-12 52.7µs ± 1% 53.4µs ± 4% +1.23% (p=0.013 n=15+18) Revcomp-12 354ms ± 1% 359ms ± 4% +1.27% (p=0.043 n=20+20) Template-12 63.6ms ± 2% 63.7ms ± 2% ~ (p=0.654 n=20+18) TimeParse-12 313ns ± 1% 316ns ± 2% +0.80% (p=0.014 n=17+20) TimeFormat-12 332ns ± 0% 329ns ± 0% -0.66% (p=0.000 n=16+16) [Geo mean] 51.7µs 51.6µs -0.09% Change-Id: I2214a6a0e4f544699ea166073249a8efdf080dc0 Reviewed-on: https://go-review.googlesource.com/21323 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-21 20:07:22 +00:00
Austin Clements	64a26b79ac	runtime: simplify/optimize allocate-black a bit Currently allocating black switches to the system stack (which is probably a historical accident) and atomically updates the global bytes marked stat. Since we're about to depend on this much more, optimize it a bit by putting it back on the regular stack and updating the per-P bytes marked stat, which gets lazily folded into the global bytes marked stat. Change-Id: Ibbe16e5382d3fd2256e4381f88af342bf7020b04 Reviewed-on: https://go-review.googlesource.com/22170 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-21 20:07:20 +00:00
Austin Clements	479501c14c	runtime: count black allocations toward scan work Currently we count black allocations toward the scannable heap size, but not toward the scan work we've done so far. This is clearly inconsistent (we have, in effect, scanned these allocations and since they're already black, we're not going to scan them again). Worse, it means we don't count black allocations toward the scannable heap size as of the next GC because this is based on the amount of scan work we did in this cycle. Fix this by counting black allocations as scan work. Currently the GC spends very little time in allocate-black mode, so this probably hasn't been a problem, but this will become important when we switch to always allocating black. Change-Id: If6ff693b070c385b65b6ecbbbbf76283a0f9d990 Reviewed-on: https://go-review.googlesource.com/22119 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-21 20:07:17 +00:00
Martin Möhrmann	7e460e70d9	runtime: use type int to specify size for newarray Consistently use type int for the size argument of runtime.newarray, runtime.reflect_unsafe_NewArray and reflect.unsafe_NewArray. Change-Id: Ic77bf2dde216c92ca8c49462f8eedc0385b6314e Reviewed-on: https://go-review.googlesource.com/22311 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Martin Möhrmann <martisch@uos.de> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-21 04:15:14 +00:00
Keith Randall	60fd32a47f	cmd/compile: change the way we handle large map values mapaccess{1,2} returns a pointer to the value. When the key is not in the map, it returns a pointer to zeroed memory. Currently, for large map values we have a complicated scheme which dynamically allocates zeroed memory for this purpose. It is ugly code and requires an atomic.Load in a bunch of places we'd rather not have it. Switch to a scheme where callsites of mapaccess{1,2} which expect large return values pass in a pointer to zeroed memory that mapaccess can return if the key is not found. This avoids the atomic.Load on all map accesses with a few extra instructions only for the large value acccesses, plus a bit of bss space. There was a time (1.4 & 1.5?) where we did something like this but all the tricks to make the right size zero value were done by the linker. That scheme broke in the presence of dyamic linking. The scheme in this CL works even when dynamic linking. Fixes #12337 Change-Id: Ic2d0319944af33bbb59785938d9ab80958d1b4b1 Reviewed-on: https://go-review.googlesource.com/22221 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>	2016-04-20 21:15:31 +00:00
Keith Randall	001e8e8070	runtime: simplify mallocgc flag argument mallocgc can calculate noscan itself. The only remaining flag argument is needzero, so we just make that a boolean arg. Fixes #15379 Change-Id: I839a70790b2a0c9dbcee2600052bfbd6c8148e20 Reviewed-on: https://go-review.googlesource.com/22290 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-20 14:02:22 +00:00
Keith Randall	bfe0cbdc50	cmd/compile,runtime: pass elem type to {make,grow}slice No point in passing the slice type to these functions. All they need is the element type. One less indirection, maybe a few less []T type descriptors in the binary. Change-Id: Ib0b83b5f14ca21d995ecc199ce8ac00c4eb375e6 Reviewed-on: https://go-review.googlesource.com/22275 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2016-04-20 00:31:16 +00:00
Josh Bleecher Snyder	0150f15a92	runtime: call mallocgc directly from makeslice and growslice The extra checks provided by newarray are redundant in these cases. This shrinks by one frame the call stack expected by the pprof test. name old time/op new time/op delta MakeSlice-8 34.3ns ± 2% 30.5ns ± 3% -11.03% (p=0.000 n=24+22) GrowSlicePtr-8 134ns ± 2% 129ns ± 3% -3.25% (p=0.000 n=25+24) Change-Id: Icd828655906b921c732701fd9d61da3fa217b0af Reviewed-on: https://go-review.googlesource.com/22276 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2016-04-20 00:05:36 +00:00
Julia Hansbrough	58012ea785	runtime: updated SIGSYS to cause a panic + stacktrace On GNU/Linux, SIGSYS is specified to cause the process to terminate without a core dump. In https://codereview.appspot.com/3749041 , it appears that Golang accidentally introduced incorrect behavior for this signal, which caused Golang processes to keep running after receiving SIGSYS. This change reverts it to the old/correct behavior. Updates #15204 Change-Id: I3aa48a9499c1bc36fa5d3f40c088fdd7599e0db5 Reviewed-on: https://go-review.googlesource.com/22202 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-19 22:48:31 +00:00
Keith Randall	998c8e034c	cmd/compile: convT2{I,E} don't handle direct interfaces We now inline type to interface conversions when the type is pointer-shaped. No need to keep code to handle that in convT2{I,E}. Change-Id: I3a6668259556077cbb2986a9e8fe42a625d506c9 Reviewed-on: https://go-review.googlesource.com/22249 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michel Lespinasse <walken@google.com>	2016-04-19 22:27:08 +00:00
Josh Bleecher Snyder	a4dd6ea152	runtime: add maxSliceCap This avoids expensive division calculations for many common slice element sizes. name old time/op new time/op delta MakeSlice-8 51.9ns ± 3% 35.1ns ± 2% -32.41% (p=0.000 n=10+10) GrowSliceBytes-8 44.1ns ± 2% 44.1ns ± 1% ~ (p=0.984 n=10+10) GrowSliceInts-8 60.9ns ± 3% 60.9ns ± 3% ~ (p=0.698 n=10+10) GrowSlicePtr-8 131ns ± 1% 120ns ± 2% -8.41% (p=0.000 n=8+10) GrowSliceStruct24Bytes-8 111ns ± 2% 103ns ± 3% -7.23% (p=0.000 n=8+8) Change-Id: I2630eb3d73c814db030cad16e620ea7fecbbd312 Reviewed-on: https://go-review.googlesource.com/22223 Reviewed-by: Keith Randall <khr@golang.org>	2016-04-19 21:38:52 +00:00
Josh Bleecher Snyder	411a0adc9b	runtime: add benchmarks for in-place append Change-Id: I2b43cc976d2efbf8b41170be536fdd10364b65e5 Reviewed-on: https://go-review.googlesource.com/22190 Reviewed-by: Keith Randall <khr@golang.org>	2016-04-18 19:08:39 +00:00
David Crawshaw	95df0c6ab9	cmd/compile, etc: use name offset in method tables Introduce and start using nameOff for two encoded names. This pair of changes is best done together because the linker's method decoder expects the method layouts to match. Precursor to converting all existing name and *string fields to nameOff. linux/amd64: cmd/go: -45KB (0.5%) jujud: -389KB (0.6%) linux/amd64 PIE: cmd/go: -170KB (1.4%) jujud: -1.5MB (1.8%) For #6853. Change-Id: Ia044423f010fb987ce070b94c46a16fc78666ff6 Reviewed-on: https://go-review.googlesource.com/21396 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-18 09:12:41 +00:00
Austin Clements	2cdcb6f829	runtime: scavenge memory on physical page-aligned boundaries Currently the scavenger marks memory unused in multiples of the allocator page size (8K). This is safe as long as the true physical page size is 4K (or 8K), as it is on many platforms. However, on ARM64, PPC64x, and MIPS64, the physical page size is larger than 8K, so if we attempt to mark memory unused, the kernel will round the boundaries of the region out to all pages covered by the requested region, and we'll release a larger region of memory than intended. As a result, the scavenger is currently disabled on these platforms. Fix this by first rounding the region to be marked unused in to multiples of the physical page size, so that when we ask the kernel to mark it unused, it releases exactly the requested region. Fixes #9993. Change-Id: I96d5fdc2f77f9d69abadcea29bcfe55e68288cb1 Reviewed-on: https://go-review.googlesource.com/22066 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-04-16 21:42:43 +00:00
Austin Clements	1151473077	runtime: check that sysUnused is always physical-page aligned If sysUnused is passed an address or length that is not aligned to the physical page boundary, the kernel will unmap more memory than the caller wanted. Add a check for this. For #9993. Change-Id: I68ff03032e7b65cf0a853fe706ce21dc7f2aaaf8 Reviewed-on: https://go-review.googlesource.com/22065 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Dave Cheney <dave@cheney.net> Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>	2016-04-16 21:42:40 +00:00
Austin Clements	8ce844e88e	runtime: check kernel physical page size during init The runtime hard-codes an assumed physical page size. If this is smaller than the kernel's page size or not a multiple of it, sysUnused may incorrectly release more memory to the system than intended. Add a runtime startup check that the runtime's assumed physical page is compatible with the kernel's physical page size. For #9993. Change-Id: Ida9d07f93c00ca9a95dd55fc59bf0d8a607f6728 Reviewed-on: https://go-review.googlesource.com/22064 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-04-16 21:42:37 +00:00
Austin Clements	d6b177d1eb	runtime: remove empty 386 archauxv archauxv no longer does anything on 386, so remove it. Change-Id: I94545238e40fa6a6832a7c3b40aedfc6c1f6a97b Reviewed-on: https://go-review.googlesource.com/22063 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-16 21:42:34 +00:00
Austin Clements	90addd3d41	runtime: common handling of _AT_RANDOM auxv The Linux kernel provides 16 bytes of random data via the auxv vector at startup. Currently we consume this separately on 386, amd64, arm, and arm64. Now that we have a common auxv parser, handle _AT_RANDOM in the common path. Change-Id: Ib69549a1d37e2d07a351cf0f44007bcd24f0d20d Reviewed-on: https://go-review.googlesource.com/22062 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-16 21:42:31 +00:00
Austin Clements	c955bb2040	runtime: common auxv parser Currently several different Linux architectures have separate copies of the auxv parser. Bring these all together into a single copy of the parser that calls out to a per-arch handler for each tag/value pair. This is in preparation for handling common auxv tags in one place. For #9993. Change-Id: Iceebc3afad6b4133b70fca7003561ae370445c10 Reviewed-on: https://go-review.googlesource.com/22061 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>	2016-04-16 21:42:27 +00:00
Mikio Hara	6f59ccb052	runtime: don't always unblock all signals on dragonfly, freebsd and openbsd https://golang.org/cl/10173 intrduced msigsave, ensureSigM and _SigUnblock but didn't enable the new signal save/restore mechanism for SIG{HUP,INT,QUIT,ABRT,TERM} on DragonFly BSD, FreeBSD and OpenBSD. At present, it looks like they have the implementation. This change enables the new mechanism on DragonFly BSD, FreeBSD and OpenBSD the same as Darwin, NetBSD. Change-Id: Ifb4b4743b3b4f50bfcdc7cf1fe1b59c377fa2a41 Reviewed-on: https://go-review.googlesource.com/18657 Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-15 21:20:45 +00:00
Austin Clements	7c7081f514	sync/atomic: don't atomically write pointers twice sync/atomic.StorePointer (which is implemented in runtime/atomic_pointer.go) writes the pointer twice (through two completely different code paths, no less). Fix it to only write once. Change-Id: Id3b2aef9aa9081c2cf096833e001b93d3dd1f5da Reviewed-on: https://go-review.googlesource.com/21999 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-04-14 21:13:26 +00:00
Austin Clements	8f6c35de2f	runtime: make sync_atomic_SwapPointer signature match sync/atomic SwapPointer is declared as func SwapPointer(addr unsafe.Pointer, new unsafe.Pointer) (old unsafe.Pointer) in sync/atomic, but defined in the runtime (where it's actually implemented) as func sync_atomic_SwapPointer(ptr unsafe.Pointer, new unsafe.Pointer) unsafe.Pointer Make ptr a unsafe.Pointer in the runtime definition to match the type in sync/atomic. Change-Id: I99bab651b995001bbe54f9e790fdef2417ef0e9e Reviewed-on: https://go-review.googlesource.com/21998 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>	2016-04-14 21:13:23 +00:00
Keith Randall	98b6febcef	runtime/internal/sys: better fallback algorithms for intrinsics Use deBruijn sequences to count low-order zeros. Reorg bswap to not use &^, it takes another instruction on x86. Change-Id: I4a5ed9fd16ee6a279d88c067e8a2ba11de821156 Reviewed-on: https://go-review.googlesource.com/22084 Reviewed-by: David Chase <drchase@google.com>	2016-04-14 21:09:03 +00:00
Jeremy Jackins	02b8e6978a	runtime: find a home for orphaned comments These comments were left behind after runtime.h was converted from C to Go. I examined the original code and tried to move these to the places that the most sense. Change-Id: I8769d60234c0113d682f9de3bd8d6c34c450c188 Reviewed-on: https://go-review.googlesource.com/21969 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-14 18:34:09 +00:00
David Crawshaw	f120936dff	cmd/compile, etc: use name for type pkgPath By replacing the string used to represent pkgPath with a reflect.name everywhere, the embedded string for package paths inside the reflect.name can be replaced by an offset, nameOff. This reduces the number of pointers in the type information. This also moves all reflect.name types into the same section, making it possible to use nameOff more widely in later CLs. No significant binary size change for normal binaries, but: linux/amd64 PIE: cmd/go: -440KB (3.7%) jujud: -2.6MB (3.2%) For #6853. Change-Id: I3890b132a784a1090b1b72b32febfe0bea77eaee Reviewed-on: https://go-review.googlesource.com/21395 Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-13 20:48:26 +00:00
Brad Fitzpatrick	73e2ad2022	runtime: rename os1_darwin.go to os_darwin.go Change-Id: If0e0bc5a85101db1e70faaab168fc2d12024eb93 Reviewed-on: https://go-review.googlesource.com/22005 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-13 20:37:12 +00:00
Brad Fitzpatrick	d9712aa82a	runtime: merge the darwin os*.go files together Merge them together into os1_darwin.go. A future CL will rename it. Change-Id: Ia4380d3296ebd5ce210908ce3582ff184566f692 Reviewed-on: https://go-review.googlesource.com/22004 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-13 20:35:09 +00:00
Austin Clements	4721ea6abc	runtime/internal/atomic: rename Storep1 to StorepNoWB Make it clear that the point of this function stores a pointer without a write barrier. sed -i -e 's/Storep1/StorepNoWB/' $(git grep -l Storep1) Updates #15270. Change-Id: Ifad7e17815e51a738070655fe3b178afdadaecf6 Reviewed-on: https://go-review.googlesource.com/21994 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>	2016-04-13 19:17:25 +00:00
Austin Clements	d8e8fc292a	runtime/internal/atomic: remove write barrier from Storep1 on s390x atomic.Storep1 is not supposed to invoke a write barrier (that's what atomicstorep is for), but currently does on s390x. This causes a panic in runtime.mapzero when it tries to use atomic.Storep1 to store what's actually a scalar. Fix this by eliminating the write barrier from atomic.Storep1 on s390x. Also add some documentation to atomicstorep to explain the difference between these. Fixes #15270. Change-Id: I291846732d82f090a218df3ef6351180aff54e81 Reviewed-on: https://go-review.googlesource.com/21993 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Michael Munday <munday@ca.ibm.com>	2016-04-13 16:06:51 +00:00
Lynn Boger	c4807d4cc7	runtime: improve memmove performance ppc64,ppc64le This change improves the performance of memmove on ppc64 & ppc64le mainly for moves >=32 bytes. In addition, the test to detect backward moves was enhanced to avoid backward moves if source and dest were in different types of storage, since backward moves might not always be efficient. Fixes #14507 The following shows some of the improvements from the test in the runtime package: BenchmarkMemmove32 4229.56 4717.13 1.12x BenchmarkMemmove64 6156.03 7810.42 1.27x BenchmarkMemmove128 7521.69 12468.54 1.66x BenchmarkMemmove256 6729.90 18260.33 2.71x BenchmarkMemmove512 8521.59 18033.81 2.12x BenchmarkMemmove1024 9760.92 25762.61 2.64x BenchmarkMemmove2048 10241.00 29584.94 2.89x BenchmarkMemmove4096 10399.37 31882.31 3.07x BenchmarkMemmoveUnalignedDst16 1943.69 2258.33 1.16x BenchmarkMemmoveUnalignedDst32 3885.08 3965.81 1.02x BenchmarkMemmoveUnalignedDst64 5121.63 6965.54 1.36x BenchmarkMemmoveUnalignedDst128 7212.34 11372.68 1.58x BenchmarkMemmoveUnalignedDst256 6564.52 16913.59 2.58x BenchmarkMemmoveUnalignedDst512 8364.35 17782.57 2.13x BenchmarkMemmoveUnalignedDst1024 9539.87 24914.72 2.61x BenchmarkMemmoveUnalignedDst2048 9199.23 21235.11 2.31x BenchmarkMemmoveUnalignedDst4096 10077.39 25231.99 2.50x BenchmarkMemmoveUnalignedSrc32 3249.83 3742.52 1.15x BenchmarkMemmoveUnalignedSrc64 5562.35 6627.96 1.19x BenchmarkMemmoveUnalignedSrc128 6023.98 10200.84 1.69x BenchmarkMemmoveUnalignedSrc256 6921.83 15258.43 2.20x BenchmarkMemmoveUnalignedSrc512 8593.13 16541.97 1.93x BenchmarkMemmoveUnalignedSrc1024 9730.95 22927.84 2.36x BenchmarkMemmoveUnalignedSrc2048 9793.28 21537.73 2.20x BenchmarkMemmoveUnalignedSrc4096 10132.96 26295.06 2.60x Change-Id: I73af59970d4c97c728deabb9708b31ec7e01bdf2 Reviewed-on: https://go-review.googlesource.com/21990 Reviewed-by: Bill O'Farrell <billotosyr@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-13 15:27:59 +00:00
David Crawshaw	7d469179e6	cmd/compile, etc: store method tables as offsets This CL introduces the typeOff type and a lookup method of the same name that can turn a typeOff offset into an rtype. In a typical Go binary (built with buildmode=exe, pie, c-archive, or c-shared), there is one moduledata and all typeOff values are offsets relative to firstmoduledata.types. This makes computing the pointer cheap in typical programs. With buildmode=shared (and one day, buildmode=plugin) there are multiple modules whose relative offset is determined at runtime. We identify a type in the general case by the pair of the original rtype that references it and its typeOff value. We determine the module from the original pointer, and then use the typeOff from there to compute the final rtype. To ensure there is only one rtype representing each type, the runtime initializes a typemap for each module, using any identical type from an earlier module when resolving that offset. This means that types computed from an offset match the type mapped by the pointer dynamic relocations. A series of followup CLs will replace other rtype values with typeOff (and name/string with nameOff). For types created at runtime by reflect, type offsets are treated as global IDs and reference into a reflect offset map kept by the runtime. darwin/amd64: cmd/go: -57KB (0.6%) jujud: -557KB (0.8%) linux/amd64 PIE: cmd/go: -361KB (3.0%) jujud: -3.5MB (4.2%) For #6853. Change-Id: Icf096fd884a0a0cb9f280f46f7a26c70a9006c96 Reviewed-on: https://go-review.googlesource.com/21285 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-13 13:03:11 +00:00
Matthew Dempsky	6af4e996e2	runtime: simplify setPanicOnFault slightly No need to acquire the M just to change G's paniconfault flag, and the original C implementation of SetPanicOnFault did not. The M acquisition logic is an artifact of golang.org/cl/131010044, which was started before golang.org/cl/123640043 (which introduced the current "getg" function) was submitted. Change-Id: I6d1939008660210be46904395cf5f5bbc2c8f754 Reviewed-on: https://go-review.googlesource.com/21935 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-13 06:14:06 +00:00
Keith Randall	260b7daf0a	cmd/compile: fix arg to getcallerpc getcallerpc's arg needs to point to the first argument slot. I believe this bug was introduced by Michel's itab changes (specifically https://go-review.googlesource.com/c/20902). Fixes #15145 Change-Id: Ifb2e17f3658e2136c7950dfc789b4d5706320683 Reviewed-on: https://go-review.googlesource.com/21931 Reviewed-by: Michel Lespinasse <walken@google.com>	2016-04-13 00:24:38 +00:00
David Crawshaw	f028b9f9e2	cmd/link, etc: store typelinks as offsets This is the first in a series of CLs to replace the use of pointers in binary read-only data with offsets. In standard Go binaries these CLs have a small effect, shrinking 8-byte pointers to 4-bytes. In position-independent code, it also saves the dynamic relocation for the pointer. This has a significant effect on the binary size when building as PIE, c-archive, or c-shared. darwin/amd64: cmd/go: -12KB (0.1%) jujud: -82KB (0.1%) linux/amd64 PIE: cmd/go: -86KB (0.7%) jujud: -569KB (0.7%) For #6853. Change-Id: Iad5625bbeba58dabfd4d334dbee3fcbfe04b2dcf Reviewed-on: https://go-review.googlesource.com/21284 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-12 20:32:41 +00:00
Michael Munday	78ecd61f62	runtime/cgo: add s390x support Change-Id: I64ada9fe34c3cfc4bd514ec5d8c8f4d4c99074fb Reviewed-on: https://go-review.googlesource.com/20950 Reviewed-by: Bill O'Farrell <billotosyr@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-12 15:42:36 +00:00
Michael Munday	7cbe7b1e86	runtime/internal/atomic: add s390x atomic operations Load and store instructions are atomic on the s390x. Change-Id: I0031ed2fba43f33863bca114d0fdec2e7d1ce807 Reviewed-on: https://go-review.googlesource.com/20938 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-12 15:40:48 +00:00
Dmitry Vyukov	3fafe2e888	internal/trace: support parsing of 1.5 traces 1. Parse out version from trace header. 2. Restore handling of 1.5 traces. 3. Restore optional symbolization of traces. 4. Add some canned 1.5 traces for regression testing (http benchmark trace, runtime/trace stress traces, plus one with broken timestamps). Change-Id: Idb18a001d03ded8e13c2730eeeb37c5836e31256 Reviewed-on: https://go-review.googlesource.com/21803 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-04-11 17:56:44 +00:00
Dominik Honnef	683917a721	all: use bytes.Equal, bytes.Contains and strings.Contains, again The previous cleanup was done with a buggy tool, missing some potential rewrites. Change-Id: I333467036e355f999a6a493e8de87e084f374e26 Reviewed-on: https://go-review.googlesource.com/21378 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-11 15:16:54 +00:00
Dave Cheney	720c4c016c	runtime: merge lfstack_amd64.go into lfstack_64bit.go Merge the amd64 lfstack implementation into the general 64 bit implementation. Change-Id: Id9ed61b90d2e3bc3b0246294c03eb2c92803b6ca Reviewed-on: https://go-review.googlesource.com/21707 Run-TryBot: Dave Cheney <dave@cheney.net> Reviewed-by: Minux Ma <minux@golang.org>	2016-04-11 06:18:52 +00:00
Jeremy Jackins	ba09d06e16	runtime: remove remaining references to TheChar After mdempsky's recent changes, these are the only references to "TheChar" left in the Go tree. Without the context, and without knowing the history, this is confusing. Also rename sys.TheGoos and sys.TheGoarch to sys.GOOS and sys.GOARCH. Also change the heap dump format to include sys.GOARCH rather than TheChar, which is no longer a concept. Updates #15169 (changes heapdump format) Change-Id: I3e99eeeae00ed55d7d01e6ed503d958c6e931dca Reviewed-on: https://go-review.googlesource.com/21647 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2016-04-11 04:32:07 +00:00
Martin Möhrmann	ad7448fe98	runtime: speed up makeslice by avoiding divisions Only compute the number of maximum allowed elements per slice once. name old time/op new time/op delta MakeSlice-2 55.5ns ± 1% 45.6ns ± 2% -17.88% (p=0.000 n=99+100) Change-Id: I951feffda5d11910a75e55d7e978d306d14da2c5 Reviewed-on: https://go-review.googlesource.com/21801 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-04-10 23:39:46 +00:00
Josh Bleecher Snyder	6b33b0e98e	cmd/compile: avoid a spill in append fast path Instead of spilling newlen, recalculate it. This removes a spill from the fast path, at the cost of a cheap recalculation on the (rare) growth path. This uses 8 bytes less of stack space. It generates two more bytes of code, but that is due to suboptimal register allocation; see far below. Runtime append microbenchmarks are all over the map, presumably due to incidental code movement. Sample code: func s(b []byte) []byte { b = append(b, 1, 2, 3) return b } Before: "".s t=1 size=160 args=0x30 locals=0x48 0x0000 00000 (append.go:8) TEXT "".s(SB), $72-48 0x0000 00000 (append.go:8) MOVQ (TLS), CX 0x0009 00009 (append.go:8) CMPQ SP, 16(CX) 0x000d 00013 (append.go:8) JLS 149 0x0013 00019 (append.go:8) SUBQ $72, SP 0x0017 00023 (append.go:8) FUNCDATA $0, gclocals·6432f8c6a0d23fa7bee6c5d96f21a92a(SB) 0x0017 00023 (append.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 0x0017 00023 (append.go:9) MOVQ "".b+88(FP), CX 0x001c 00028 (append.go:9) LEAQ 3(CX), DX 0x0020 00032 (append.go:9) MOVQ DX, "".autotmp_0+64(SP) 0x0025 00037 (append.go:9) MOVQ "".b+96(FP), BX 0x002a 00042 (append.go:9) CMPQ DX, BX 0x002d 00045 (append.go:9) JGT $0, 86 0x002f 00047 (append.go:8) MOVQ "".b+80(FP), AX 0x0034 00052 (append.go:9) MOVB $1, (AX)(CX1) 0x0038 00056 (append.go:9) MOVB $2, 1(AX)(CX1) 0x003d 00061 (append.go:9) MOVB $3, 2(AX)(CX1) 0x0042 00066 (append.go:10) MOVQ AX, "".~r1+104(FP) 0x0047 00071 (append.go:10) MOVQ DX, "".~r1+112(FP) 0x004c 00076 (append.go:10) MOVQ BX, "".~r1+120(FP) 0x0051 00081 (append.go:10) ADDQ $72, SP 0x0055 00085 (append.go:10) RET 0x0056 00086 (append.go:9) LEAQ type.[]uint8(SB), AX 0x005d 00093 (append.go:9) MOVQ AX, (SP) 0x0061 00097 (append.go:9) MOVQ "".b+80(FP), BP 0x0066 00102 (append.go:9) MOVQ BP, 8(SP) 0x006b 00107 (append.go:9) MOVQ CX, 16(SP) 0x0070 00112 (append.go:9) MOVQ BX, 24(SP) 0x0075 00117 (append.go:9) MOVQ DX, 32(SP) 0x007a 00122 (append.go:9) PCDATA $0, $0 0x007a 00122 (append.go:9) CALL runtime.growslice(SB) 0x007f 00127 (append.go:9) MOVQ 40(SP), AX 0x0084 00132 (append.go:9) MOVQ 56(SP), BX 0x0089 00137 (append.go:8) MOVQ "".b+88(FP), CX 0x008e 00142 (append.go:9) MOVQ "".autotmp_0+64(SP), DX 0x0093 00147 (append.go:9) JMP 52 0x0095 00149 (append.go:9) NOP 0x0095 00149 (append.go:8) CALL runtime.morestack_noctxt(SB) 0x009a 00154 (append.go:8) JMP 0 After: "".s t=1 size=176 args=0x30 locals=0x40 0x0000 00000 (append.go:8) TEXT "".s(SB), $64-48 0x0000 00000 (append.go:8) MOVQ (TLS), CX 0x0009 00009 (append.go:8) CMPQ SP, 16(CX) 0x000d 00013 (append.go:8) JLS 151 0x0013 00019 (append.go:8) SUBQ $64, SP 0x0017 00023 (append.go:8) FUNCDATA $0, gclocals·6432f8c6a0d23fa7bee6c5d96f21a92a(SB) 0x0017 00023 (append.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 0x0017 00023 (append.go:9) MOVQ "".b+80(FP), CX 0x001c 00028 (append.go:9) LEAQ 3(CX), DX 0x0020 00032 (append.go:9) MOVQ "".b+88(FP), BX 0x0025 00037 (append.go:9) CMPQ DX, BX 0x0028 00040 (append.go:9) JGT $0, 81 0x002a 00042 (append.go:8) MOVQ "".b+72(FP), AX 0x002f 00047 (append.go:9) MOVB $1, (AX)(CX1) 0x0033 00051 (append.go:9) MOVB $2, 1(AX)(CX1) 0x0038 00056 (append.go:9) MOVB $3, 2(AX)(CX1) 0x003d 00061 (append.go:10) MOVQ AX, "".~r1+96(FP) 0x0042 00066 (append.go:10) MOVQ DX, "".~r1+104(FP) 0x0047 00071 (append.go:10) MOVQ BX, "".~r1+112(FP) 0x004c 00076 (append.go:10) ADDQ $64, SP 0x0050 00080 (append.go:10) RET 0x0051 00081 (append.go:9) LEAQ type.[]uint8(SB), AX 0x0058 00088 (append.go:9) MOVQ AX, (SP) 0x005c 00092 (append.go:9) MOVQ "".b+72(FP), BP 0x0061 00097 (append.go:9) MOVQ BP, 8(SP) 0x0066 00102 (append.go:9) MOVQ CX, 16(SP) 0x006b 00107 (append.go:9) MOVQ BX, 24(SP) 0x0070 00112 (append.go:9) MOVQ DX, 32(SP) 0x0075 00117 (append.go:9) PCDATA $0, $0 0x0075 00117 (append.go:9) CALL runtime.growslice(SB) 0x007a 00122 (append.go:9) MOVQ 40(SP), AX 0x007f 00127 (append.go:9) MOVQ 48(SP), CX 0x0084 00132 (append.go:9) MOVQ 56(SP), BX 0x0089 00137 (append.go:9) ADDQ $3, CX 0x008d 00141 (append.go:9) MOVQ CX, DX 0x0090 00144 (append.go:8) MOVQ "".b+80(FP), CX 0x0095 00149 (append.go:9) JMP 47 0x0097 00151 (append.go:9) NOP 0x0097 00151 (append.go:8) CALL runtime.morestack_noctxt(SB) 0x009c 00156 (append.go:8) JMP 0 Observe that in the following sequence, we should use DX directly instead of using CX as a temporary register, which would make the new code a strict improvement on the old: 0x007f 00127 (append.go:9) MOVQ 48(SP), CX 0x0084 00132 (append.go:9) MOVQ 56(SP), BX 0x0089 00137 (append.go:9) ADDQ $3, CX 0x008d 00141 (append.go:9) MOVQ CX, DX 0x0090 00144 (append.go:8) MOVQ "".b+80(FP), CX Change-Id: I4ee50b18fa53865901d2d7f86c2cbb54c6fa6924 Reviewed-on: https://go-review.googlesource.com/21812 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2016-04-10 20:02:04 +00:00
Josh Bleecher Snyder	974c201f74	runtime: avoid unnecessary map iteration write barrier Update #14921 Change-Id: I5c5816d0193757bf7465b1e09c27ca06897df4bf Reviewed-on: https://go-review.googlesource.com/21814 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2016-04-10 20:01:47 +00:00
Emmanuel Odeke	e4f1d9cf2e	runtime: make execution error panic values implement the Error interface Make execution panics implement Error as mandated by https://golang.org/ref/spec#Run_time_panics, instead of panics with strings. Fixes #14965 Change-Id: I7827f898b9b9c08af541db922cc24fa0800ff18a Reviewed-on: https://go-review.googlesource.com/21214 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-10 01:16:30 +00:00
Dmitry Vyukov	0435e88a11	runtime: revert "do not call timeBeginPeriod on windows" This reverts commit `ab4c9298b8`. Sysmon critically depends on system timer resolution for retaking of Ps blocked in system calls. See #14790 for an example of a program where execution time goes from 2ms to 30ms if timeBeginPeriod(1) is not used. We can remove timeBeginPeriod(1) when we support UMS (#7876). Update #14790 Change-Id: I362b56154359b2c52d47f9f2468fe012b481cf6d Reviewed-on: https://go-review.googlesource.com/20834 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>	2016-04-09 16:11:41 +00:00
Dmitry Vyukov	0fb7b4cccd	runtime: emit file:line info into traces This makes traces self-contained and simplifies trace workflow in modern cloud environments where it is simpler to reach a service via HTTP than to obtain the binary. Change-Id: I6ff3ca694dc698270f1e29da37d5efaf4e843a0d Reviewed-on: https://go-review.googlesource.com/21732 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>	2016-04-08 20:52:30 +00:00
Michael Munday	e6f36f0cd5	runtime: add s390x support (new files and lfstack_64bit.go modifications) Change-Id: I51c0a332e3cbdab348564e5dcd27583e75e4b881 Reviewed-on: https://go-review.googlesource.com/20946 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-07 18:56:54 +00:00
Dave Cheney	9cc9e95b28	Revert "runtime: merge lfstack{Pack,Unpack} into one file" This broke solaris, which apparently does use the upper 17 bits of the address space. This reverts commit `3b02c5b1b6`. Change-Id: Iedfe54abd0384960845468205f20191a97751c0b Reviewed-on: https://go-review.googlesource.com/21652 Reviewed-by: Dave Cheney <dave@cheney.net>	2016-04-07 14:05:24 +00:00
Richard Miller	121c434f7a	runtime/pprof: make TestBlockProfile less timing dependent The test for profiling of channel blocking is timing dependent, and in particular the blockSelectRecvAsync case can fail on a slow builder (plan9_arm) when many tests are run in parallel. The child goroutine sleeps for a fixed period so the parent can be observed to block in a select call reading from the child; but if the OS process running the parent goroutine is delayed long enough, the child may wake again before the parent has reached the blocking point. By repeating the test three times, the likelihood of a blocking event is increased. Fixes #15096 Change-Id: I2ddb9576a83408d06b51ded682bf8e71e53ce59e Reviewed-on: https://go-review.googlesource.com/21604 Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-07 09:57:06 +00:00
Dave Cheney	3b02c5b1b6	runtime: merge lfstack{Pack,Unpack} into one file Merge the remaining lfstack{Pack,Unpack} implemetations into one file. unsafe.Sizeof(uintptr(0)) == 4 is a constant comparison so this branch folds away at compile time. Dmitry confirmed that the upper 17 bits of an address will be zero for a user mode pointer, so there is no need to sign extend on amd64 during unpack, so we can reuse the same implementation as all othe 64 bit archs. Change-Id: I99f589416d8b181ccde5364c9c2e78e4a5efc7f1 Reviewed-on: https://go-review.googlesource.com/21597 Run-TryBot: Dave Cheney <dave@cheney.net> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Minux Ma <minux@golang.org>	2016-04-07 09:17:22 +00:00
Michael Hudson-Doyle	31cf1c1779	runtime: clamp OS-reported number of processors to _MaxGomaxprocs So that all Go processes do not die on startup on a system with >256 CPUs. I tested this by hacking osinit to set ncpu to 1000. Updates #15131 Change-Id: I52e061a0de97be41d684dd8b748fa9087d6f1aef Reviewed-on: https://go-review.googlesource.com/21599 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-07 00:11:25 +00:00
Dave Cheney	0c81248bf4	runtime: remove unused return value from lfstackUnpack None of the two places that call lfstackUnpack use the second argument. This simplifies a followup CL that merges the lfstack{Pack,Unpack} implementations. Change-Id: I3c93f6259da99e113d94f8c8027584da79c1ac2c Reviewed-on: https://go-review.googlesource.com/21595 Run-TryBot: Dave Cheney <dave@cheney.net> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-06 21:04:43 +00:00
Brad Fitzpatrick	2cefd12a1b	net, runtime: skip flaky tests on OpenBSD Flaky tests are a distraction and cover up real problems. File bugs instead and mark them as flaky. This moves the net/http flaky test flagging mechanism to internal/testenv. Updates #15156 Updates #15157 Updates #15158 Change-Id: I0e561cd2a09c0dec369cd4ed93bc5a2b40233dfe Reviewed-on: https://go-review.googlesource.com/21614 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-06 19:28:24 +00:00
Dave Cheney	5c7ae10f66	runtime: merge 64bit lfstack impls Merge all the 64bit lfstack impls into one file, adjust build tags to match. Merge all the comments on the various lfstack implementations for posterity. lfstack_amd64.go can probably be merged, but it is slightly different so that will happen in a followup. Change-Id: I5362d5e127daa81c9cb9d4fa8a0cc5c5e5c2707c Reviewed-on: https://go-review.googlesource.com/21591 Run-TryBot: Dave Cheney <dave@cheney.net> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Minux Ma <minux@golang.org>	2016-04-06 06:47:31 +00:00
Brad Fitzpatrick	8455f3a3d5	os: consolidate os{1,2}_*.go files Change-Id: I463ca59f486b2842f67f151a55f530ee10663830 Reviewed-on: https://go-review.googlesource.com/21568 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Dave Cheney <dave@cheney.net> Reviewed-by: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-06 05:04:47 +00:00
Brad Fitzpatrick	fd2bb1e30a	runtime: rename os1_windows.go to os_windows.go Change-Id: I11172f3d0e28f17b812e67a4db9cfe513b8e1974 Reviewed-on: https://go-review.googlesource.com/21565 Reviewed-by: Minux Ma <minux@golang.org>	2016-04-06 04:26:01 +00:00
Brad Fitzpatrick	e095f53e9b	runtime: merge os{,2}_windows.go into os1_windows.go. A future CL will rename os1_windows.go to os_windows.go. Change-Id: I223e76002dd1e9c9d1798fb0beac02c7d3bf4812 Reviewed-on: https://go-review.googlesource.com/21564 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Minux Ma <minux@golang.org>	2016-04-06 04:25:44 +00:00
Michael Munday	0f08dd2183	runtime: add s390x support (modified files only) Change-Id: Ib79ad4a890994ad64edb1feb79bd242d26b5b08a Reviewed-on: https://go-review.googlesource.com/20945 Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-06 04:25:06 +00:00
Shenghou Ma	a2eded3421	runtime: get randomness from AT_RANDOM AUXV on linux/arm64 Fixes #15147. Change-Id: Ibfe46c747dea987787a51eb0c95ccd8c5f24f366 Reviewed-on: https://go-review.googlesource.com/21580 Run-TryBot: Minux Ma <minux@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-06 04:23:37 +00:00
Brad Fitzpatrick	34c58065e5	runtime: rename os1_linux.go to os_linux.go Change-Id: I938f61763c3256a876d62aeb54ef8c25cc4fc90e Reviewed-on: https://go-review.googlesource.com/21567 Reviewed-by: Minux Ma <minux@golang.org>	2016-04-06 03:55:38 +00:00
Brad Fitzpatrick	5103fbfdb2	runtime: merge os_linux.go into os1_linux.go Change-Id: I791c47014fe69e8529c7b2f0b9a554e47902d46c Reviewed-on: https://go-review.googlesource.com/21566 Reviewed-by: Minux Ma <minux@golang.org>	2016-04-06 03:54:47 +00:00
Brad Fitzpatrick	8556c76f88	runtime: minor Windows cleanup Change-Id: I9a8081ef1109469e9577c642156aa635188d8954 Reviewed-on: https://go-review.googlesource.com/21538 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>	2016-04-06 02:23:29 +00:00
David Chase	d8c815d8b5	cmd/compile: note escape of parts of closured-capture vars Missed a case for closure calls (OCALLFUNC && indirect) in esc.go:esccall. Cleanup to runtime code for windows to more thoroughly hide a technical escape. Also made code pickier about failing to late non-optional kernel32.dll. Fixes #14409. Change-Id: Ie75486a2c8626c4583224e02e4872c2875f7bca5 Reviewed-on: https://go-review.googlesource.com/20102 Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-05 18:10:09 +00:00
Dmitry Vyukov	475d113b53	runtime: don't burn CPU unnecessarily Two GC-related functions, scang and casgstatus, wait in an active spin loop. Active spinning is never a good idea in user-space. Once we wait several times more than the expected wait time, something unexpected is happenning (e.g. the thread we are waiting for is descheduled or handling a page fault) and we need to yield to OS scheduler. Moreover, the expected wait time is very high for these functions: scang wait time can be tens of milliseconds, casgstatus can be hundreds of microseconds. It does not make sense to spin even for that time. go install -a std profile on a 4-core machine shows that 11% of time is spent in the active spin in scang: 6.12% compile compile [.] runtime.scang 3.27% compile compile [.] runtime.readgstatus 1.72% compile compile [.] runtime/internal/atomic.Load The active spin also increases tail latency in the case of the slightest oversubscription: GC goroutines spend whole quantum in the loop instead of executing user code. Here is scang wait time histogram during go install -a std: 13707.0000 - 1815442.7667 [ 118]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎... 1815442.7667 - 3617178.5333 [ 9]: ∎∎∎∎∎∎∎∎∎ 3617178.5333 - 5418914.3000 [ 11]: ∎∎∎∎∎∎∎∎∎∎∎ 5418914.3000 - 7220650.0667 [ 5]: ∎∎∎∎∎ 7220650.0667 - 9022385.8333 [ 12]: ∎∎∎∎∎∎∎∎∎∎∎∎ 9022385.8333 - 10824121.6000 [ 13]: ∎∎∎∎∎∎∎∎∎∎∎∎∎ 10824121.6000 - 12625857.3667 [ 15]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 12625857.3667 - 14427593.1333 [ 18]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 14427593.1333 - 16229328.9000 [ 18]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 16229328.9000 - 18031064.6667 [ 32]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 18031064.6667 - 19832800.4333 [ 28]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 19832800.4333 - 21634536.2000 [ 6]: ∎∎∎∎∎∎ 21634536.2000 - 23436271.9667 [ 15]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 23436271.9667 - 25238007.7333 [ 11]: ∎∎∎∎∎∎∎∎∎∎∎ 25238007.7333 - 27039743.5000 [ 27]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 27039743.5000 - 28841479.2667 [ 20]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 28841479.2667 - 30643215.0333 [ 10]: ∎∎∎∎∎∎∎∎∎∎ 30643215.0333 - 32444950.8000 [ 7]: ∎∎∎∎∎∎∎ 32444950.8000 - 34246686.5667 [ 4]: ∎∎∎∎ 34246686.5667 - 36048422.3333 [ 4]: ∎∎∎∎ 36048422.3333 - 37850158.1000 [ 1]: ∎ 37850158.1000 - 39651893.8667 [ 5]: ∎∎∎∎∎ 39651893.8667 - 41453629.6333 [ 2]: ∎∎ 41453629.6333 - 43255365.4000 [ 2]: ∎∎ 43255365.4000 - 45057101.1667 [ 2]: ∎∎ 45057101.1667 - 46858836.9333 [ 1]: ∎ 46858836.9333 - 48660572.7000 [ 2]: ∎∎ 48660572.7000 - 50462308.4667 [ 3]: ∎∎∎ 50462308.4667 - 52264044.2333 [ 2]: ∎∎ 52264044.2333 - 54065780.0000 [ 2]: ∎∎ and the zoomed-in first part: 13707.0000 - 19916.7667 [ 2]: ∎∎ 19916.7667 - 26126.5333 [ 2]: ∎∎ 26126.5333 - 32336.3000 [ 9]: ∎∎∎∎∎∎∎∎∎ 32336.3000 - 38546.0667 [ 8]: ∎∎∎∎∎∎∎∎ 38546.0667 - 44755.8333 [ 12]: ∎∎∎∎∎∎∎∎∎∎∎∎ 44755.8333 - 50965.6000 [ 10]: ∎∎∎∎∎∎∎∎∎∎ 50965.6000 - 57175.3667 [ 5]: ∎∎∎∎∎ 57175.3667 - 63385.1333 [ 6]: ∎∎∎∎∎∎ 63385.1333 - 69594.9000 [ 5]: ∎∎∎∎∎ 69594.9000 - 75804.6667 [ 6]: ∎∎∎∎∎∎ 75804.6667 - 82014.4333 [ 6]: ∎∎∎∎∎∎ 82014.4333 - 88224.2000 [ 4]: ∎∎∎∎ 88224.2000 - 94433.9667 [ 1]: ∎ 94433.9667 - 100643.7333 [ 1]: ∎ 100643.7333 - 106853.5000 [ 2]: ∎∎ 106853.5000 - 113063.2667 [ 0]: 113063.2667 - 119273.0333 [ 2]: ∎∎ 119273.0333 - 125482.8000 [ 2]: ∎∎ 125482.8000 - 131692.5667 [ 1]: ∎ 131692.5667 - 137902.3333 [ 1]: ∎ 137902.3333 - 144112.1000 [ 0]: 144112.1000 - 150321.8667 [ 2]: ∎∎ 150321.8667 - 156531.6333 [ 1]: ∎ 156531.6333 - 162741.4000 [ 1]: ∎ 162741.4000 - 168951.1667 [ 0]: 168951.1667 - 175160.9333 [ 0]: 175160.9333 - 181370.7000 [ 1]: ∎ 181370.7000 - 187580.4667 [ 1]: ∎ 187580.4667 - 193790.2333 [ 2]: ∎∎ 193790.2333 - 200000.0000 [ 0]: Here is casgstatus wait time histogram: 631.0000 - 5276.6333 [ 3]: ∎∎∎ 5276.6333 - 9922.2667 [ 5]: ∎∎∎∎∎ 9922.2667 - 14567.9000 [ 2]: ∎∎ 14567.9000 - 19213.5333 [ 6]: ∎∎∎∎∎∎ 19213.5333 - 23859.1667 [ 5]: ∎∎∎∎∎ 23859.1667 - 28504.8000 [ 6]: ∎∎∎∎∎∎ 28504.8000 - 33150.4333 [ 6]: ∎∎∎∎∎∎ 33150.4333 - 37796.0667 [ 2]: ∎∎ 37796.0667 - 42441.7000 [ 1]: ∎ 42441.7000 - 47087.3333 [ 3]: ∎∎∎ 47087.3333 - 51732.9667 [ 0]: 51732.9667 - 56378.6000 [ 1]: ∎ 56378.6000 - 61024.2333 [ 0]: 61024.2333 - 65669.8667 [ 0]: 65669.8667 - 70315.5000 [ 0]: 70315.5000 - 74961.1333 [ 1]: ∎ 74961.1333 - 79606.7667 [ 0]: 79606.7667 - 84252.4000 [ 0]: 84252.4000 - 88898.0333 [ 0]: 88898.0333 - 93543.6667 [ 0]: 93543.6667 - 98189.3000 [ 0]: 98189.3000 - 102834.9333 [ 0]: 102834.9333 - 107480.5667 [ 1]: ∎ 107480.5667 - 112126.2000 [ 0]: 112126.2000 - 116771.8333 [ 0]: 116771.8333 - 121417.4667 [ 0]: 121417.4667 - 126063.1000 [ 0]: 126063.1000 - 130708.7333 [ 0]: 130708.7333 - 135354.3667 [ 0]: 135354.3667 - 140000.0000 [ 1]: ∎ Ideally we eliminate the waiting by switching to async state machine for GC, but for now just yield to OS scheduler after a reasonable wait time. To choose yielding parameters I've measured golang.org/x/benchmarks/http tail latencies with different yield delays and oversubscription levels. With no oversubscription (to the degree possible): scang yield delay = 1, casgstatus yield delay = 1 Latency-50 1.41ms ±15% 1.41ms ± 5% ~ (p=0.611 n=13+12) Latency-95 5.21ms ± 2% 5.15ms ± 2% -1.15% (p=0.012 n=13+13) Latency-99 7.16ms ± 2% 7.05ms ± 2% -1.54% (p=0.002 n=13+13) Latency-999 10.7ms ± 9% 10.2ms ±10% -5.46% (p=0.004 n=12+13) scang yield delay = 5000, casgstatus yield delay = 3000 Latency-50 1.41ms ±15% 1.41ms ± 8% ~ (p=0.511 n=13+13) Latency-95 5.21ms ± 2% 5.14ms ± 2% -1.23% (p=0.006 n=13+13) Latency-99 7.16ms ± 2% 7.02ms ± 2% -1.94% (p=0.000 n=13+13) Latency-999 10.7ms ± 9% 10.1ms ± 8% -6.14% (p=0.000 n=12+13) scang yield delay = 10000, casgstatus yield delay = 5000 Latency-50 1.41ms ±15% 1.45ms ± 6% ~ (p=0.724 n=13+13) Latency-95 5.21ms ± 2% 5.18ms ± 1% ~ (p=0.287 n=13+13) Latency-99 7.16ms ± 2% 7.05ms ± 2% -1.64% (p=0.002 n=13+13) Latency-999 10.7ms ± 9% 10.0ms ± 5% -6.72% (p=0.000 n=12+13) scang yield delay = 30000, casgstatus yield delay = 10000 Latency-50 1.41ms ±15% 1.51ms ± 7% +6.57% (p=0.002 n=13+13) Latency-95 5.21ms ± 2% 5.21ms ± 2% ~ (p=0.960 n=13+13) Latency-99 7.16ms ± 2% 7.06ms ± 2% -1.50% (p=0.012 n=13+13) Latency-999 10.7ms ± 9% 10.0ms ± 6% -6.49% (p=0.000 n=12+13) scang yield delay = 100000, casgstatus yield delay = 50000 Latency-50 1.41ms ±15% 1.53ms ± 6% +8.48% (p=0.000 n=13+12) Latency-95 5.21ms ± 2% 5.23ms ± 2% ~ (p=0.287 n=13+13) Latency-99 7.16ms ± 2% 7.08ms ± 2% -1.21% (p=0.004 n=13+13) Latency-999 10.7ms ± 9% 9.9ms ± 3% -7.99% (p=0.000 n=12+12) scang yield delay = 200000, casgstatus yield delay = 100000 Latency-50 1.41ms ±15% 1.47ms ± 5% ~ (p=0.072 n=13+13) Latency-95 5.21ms ± 2% 5.17ms ± 2% ~ (p=0.091 n=13+13) Latency-99 7.16ms ± 2% 7.02ms ± 2% -1.99% (p=0.000 n=13+13) Latency-999 10.7ms ± 9% 9.9ms ± 5% -7.86% (p=0.000 n=12+13) With slight oversubscription (another instance of http benchmark was running in background with reduced GOMAXPROCS): scang yield delay = 1, casgstatus yield delay = 1 Latency-50 840µs ± 3% 804µs ± 3% -4.37% (p=0.000 n=15+18) Latency-95 6.52ms ± 4% 6.03ms ± 4% -7.51% (p=0.000 n=18+18) Latency-99 10.8ms ± 7% 10.0ms ± 4% -7.33% (p=0.000 n=18+14) Latency-999 18.0ms ± 9% 16.8ms ± 7% -6.84% (p=0.000 n=18+18) scang yield delay = 5000, casgstatus yield delay = 3000 Latency-50 840µs ± 3% 809µs ± 3% -3.71% (p=0.000 n=15+17) Latency-95 6.52ms ± 4% 6.11ms ± 4% -6.29% (p=0.000 n=18+18) Latency-99 10.8ms ± 7% 9.9ms ± 6% -7.55% (p=0.000 n=18+18) Latency-999 18.0ms ± 9% 16.5ms ±11% -8.49% (p=0.000 n=18+18) scang yield delay = 10000, casgstatus yield delay = 5000 Latency-50 840µs ± 3% 823µs ± 5% -2.06% (p=0.002 n=15+18) Latency-95 6.52ms ± 4% 6.32ms ± 3% -3.05% (p=0.000 n=18+18) Latency-99 10.8ms ± 7% 10.2ms ± 4% -5.22% (p=0.000 n=18+18) Latency-999 18.0ms ± 9% 16.7ms ±10% -7.09% (p=0.000 n=18+18) scang yield delay = 30000, casgstatus yield delay = 10000 Latency-50 840µs ± 3% 836µs ± 5% ~ (p=0.442 n=15+18) Latency-95 6.52ms ± 4% 6.39ms ± 3% -2.00% (p=0.000 n=18+18) Latency-99 10.8ms ± 7% 10.2ms ± 6% -5.15% (p=0.000 n=18+17) Latency-999 18.0ms ± 9% 16.6ms ± 8% -7.48% (p=0.000 n=18+18) scang yield delay = 100000, casgstatus yield delay = 50000 Latency-50 840µs ± 3% 836µs ± 6% ~ (p=0.401 n=15+18) Latency-95 6.52ms ± 4% 6.40ms ± 4% -1.79% (p=0.010 n=18+18) Latency-99 10.8ms ± 7% 10.2ms ± 5% -4.95% (p=0.000 n=18+18) Latency-999 18.0ms ± 9% 16.5ms ±14% -8.17% (p=0.000 n=18+18) scang yield delay = 200000, casgstatus yield delay = 100000 Latency-50 840µs ± 3% 828µs ± 2% -1.49% (p=0.001 n=15+17) Latency-95 6.52ms ± 4% 6.38ms ± 4% -2.04% (p=0.001 n=18+18) Latency-99 10.8ms ± 7% 10.2ms ± 4% -4.77% (p=0.000 n=18+18) Latency-999 18.0ms ± 9% 16.9ms ± 9% -6.23% (p=0.000 n=18+18) With significant oversubscription (background http benchmark was running with full GOMAXPROCS): scang yield delay = 1, casgstatus yield delay = 1 Latency-50 1.32ms ±12% 1.30ms ±13% ~ (p=0.454 n=14+14) Latency-95 16.3ms ±10% 15.3ms ± 7% -6.29% (p=0.001 n=14+14) Latency-99 29.4ms ±10% 27.9ms ± 5% -5.04% (p=0.001 n=14+12) Latency-999 49.9ms ±19% 45.9ms ± 5% -8.00% (p=0.008 n=14+13) scang yield delay = 5000, casgstatus yield delay = 3000 Latency-50 1.32ms ±12% 1.29ms ± 9% ~ (p=0.227 n=14+14) Latency-95 16.3ms ±10% 15.4ms ± 5% -5.27% (p=0.002 n=14+14) Latency-99 29.4ms ±10% 27.9ms ± 6% -5.16% (p=0.001 n=14+14) Latency-999 49.9ms ±19% 46.8ms ± 8% -6.21% (p=0.050 n=14+14) scang yield delay = 10000, casgstatus yield delay = 5000 Latency-50 1.32ms ±12% 1.35ms ± 9% ~ (p=0.401 n=14+14) Latency-95 16.3ms ±10% 15.0ms ± 4% -7.67% (p=0.000 n=14+14) Latency-99 29.4ms ±10% 27.4ms ± 5% -6.98% (p=0.000 n=14+14) Latency-999 49.9ms ±19% 44.7ms ± 5% -10.56% (p=0.000 n=14+11) scang yield delay = 30000, casgstatus yield delay = 10000 Latency-50 1.32ms ±12% 1.36ms ±10% ~ (p=0.246 n=14+14) Latency-95 16.3ms ±10% 14.9ms ± 5% -8.31% (p=0.000 n=14+14) Latency-99 29.4ms ±10% 27.4ms ± 7% -6.70% (p=0.000 n=14+14) Latency-999 49.9ms ±19% 44.9ms ±15% -10.13% (p=0.003 n=14+14) scang yield delay = 100000, casgstatus yield delay = 50000 Latency-50 1.32ms ±12% 1.41ms ± 9% +6.37% (p=0.008 n=14+13) Latency-95 16.3ms ±10% 15.1ms ± 8% -7.45% (p=0.000 n=14+14) Latency-99 29.4ms ±10% 27.5ms ±12% -6.67% (p=0.002 n=14+14) Latency-999 49.9ms ±19% 45.9ms ±16% -8.06% (p=0.019 n=14+14) scang yield delay = 200000, casgstatus yield delay = 100000 Latency-50 1.32ms ±12% 1.42ms ±10% +7.21% (p=0.003 n=14+14) Latency-95 16.3ms ±10% 15.0ms ± 7% -7.59% (p=0.000 n=14+14) Latency-99 29.4ms ±10% 27.3ms ± 8% -7.20% (p=0.000 n=14+14) Latency-999 49.9ms ±19% 44.8ms ± 8% -10.21% (p=0.001 n=14+13) All numbers are on 8 cores and with GOGC=10 (http benchmark has tiny heap, few goroutines and low allocation rate, so by default GC barely affects tail latency). 10us/5us yield delays seem to provide a reasonable compromise and give 5-10% tail latency reduction. That's what used in this change. go install -a std results on 4 core machine: name old time/op new time/op delta Time 8.39s ± 2% 7.94s ± 2% -5.34% (p=0.000 n=47+49) UserTime 24.6s ± 2% 22.9s ± 2% -6.76% (p=0.000 n=49+49) SysTime 1.77s ± 9% 1.89s ±11% +7.00% (p=0.000 n=49+49) CpuLoad 315ns ± 2% 313ns ± 1% -0.59% (p=0.000 n=49+48) # %CPU MaxRSS 97.1ms ± 4% 97.5ms ± 9% ~ (p=0.838 n=46+49) # bytes Update #14396 Update #14189 Change-Id: I3f4109bf8f7fd79b39c466576690a778232055a2 Reviewed-on: https://go-review.googlesource.com/21503 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-04-05 15:52:03 +00:00
Dmitry Vyukov	3b246fa863	runtime: sleep less when we can do work Usleep(100) in runqgrab negatively affects latency and throughput of parallel application. We are sleeping instead of doing useful work. This is effect is particularly visible on windows where minimal sleep duration is 1-15ms. Reduce sleep from 100us to 3us and use osyield on windows. Sync chan send/recv takes ~50ns, so 3us gives us ~50x overshoot. benchmark old ns/op new ns/op delta BenchmarkChanSync-12 216 217 +0.46% BenchmarkChanSyncWork-12 27213 25816 -5.13% CPU consumption goes up from 106% to 108% in the first case, and from 107% to 125% in the second case. Test case from #14790 on windows: BenchmarkDefaultResolution-8 4583372 29720 -99.35% Benchmark1ms-8 992056 30701 -96.91% 99-th latency percentile for HTTP request serving is improved by up to 15% (see http://golang.org/cl/20835 for details). The following benchmarks are from the change that originally added this sleep (see https://golang.org/s/go15gomaxprocs): name old time/op new time/op delta Chain 22.6µs ± 2% 22.7µs ± 6% ~ (p=0.905 n=9+10) ChainBuf 22.4µs ± 3% 22.5µs ± 4% ~ (p=0.780 n=9+10) Chain-2 23.5µs ± 4% 24.9µs ± 1% +5.66% (p=0.000 n=10+9) ChainBuf-2 23.7µs ± 1% 24.4µs ± 1% +3.31% (p=0.000 n=9+10) Chain-4 24.2µs ± 2% 25.1µs ± 3% +3.70% (p=0.000 n=9+10) ChainBuf-4 24.4µs ± 5% 25.0µs ± 2% +2.37% (p=0.023 n=10+10) Powser 2.37s ± 1% 2.37s ± 1% ~ (p=0.423 n=8+9) Powser-2 2.48s ± 2% 2.57s ± 2% +3.74% (p=0.000 n=10+9) Powser-4 2.66s ± 1% 2.75s ± 1% +3.40% (p=0.000 n=10+10) Sieve 13.3s ± 2% 13.3s ± 2% ~ (p=1.000 n=10+9) Sieve-2 7.00s ± 2% 7.44s ±16% ~ (p=0.408 n=8+10) Sieve-4 4.13s ±21% 3.85s ±22% ~ (p=0.113 n=9+9) Fixes #14790 Change-Id: Ie7c6a1c4f9c8eb2f5d65ab127a3845386d6f8b5d Reviewed-on: https://go-review.googlesource.com/20835 Reviewed-by: Austin Clements <austin@google.com>	2016-04-05 15:32:06 +00:00
Alex Brainman	ffeae198d0	runtime: leave directory before removing it in TestDLLPreloadMitigation Fixes #15120 Change-Id: I1d9a192ac163826bad8b46e8c0b0b9e218e69570 Reviewed-on: https://go-review.googlesource.com/21520 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-05 04:32:49 +00:00
Alex Brainman	fcac88098b	runtime: remove race out of BenchmarkChanToSyscallPing1ms Fixes #15119 Change-Id: I31445bf282a5e2a160ff4e66c5a592b989a5798f Reviewed-on: https://go-review.googlesource.com/21448 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-05 00:52:40 +00:00
Austin Clements	61f56e925e	runtime: fix pagesInUse accounting When we grow the heap, we create a temporary "in use" span for the memory acquired from the OS and then free that span to link it into the heap. Hence, we (1) increase pagesInUse when we make the temporary span so that (2) freeing the span will correctly decrease it. However, currently step (1) increases pagesInUse by the number of pages requested from the heap, while step (2) decreases it by the number of pages requested from the OS (the size of the temporary span). These aren't necessarily the same, since we round up the number of pages we request from the OS, so steps 1 and 2 don't necessarily cancel out like they're supposed to. Over time, this can add up and cause pagesInUse to underflow and wrap around to 2^64. The garbage collector computes the sweep ratio from this, so if this happens, the sweep ratio becomes effectively infinite, causing the first allocation on each P in a sweep cycle to sweep the entire heap. This makes sweeping effectively STW. Fix this by increasing pagesInUse in step 1 by the number of pages requested from the OS, so that the two steps correctly cancel out. We add a test that checks that the running total matches the actual state of the heap. Fixes #15022. For 1.6.x. Change-Id: Iefd9d6abe37d0d447cbdbdf9941662e4f18eeffc Reviewed-on: https://go-review.googlesource.com/21280 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2016-04-04 15:33:26 +00:00
Alex Brainman	1f5b1b2b66	runtime: change osyield to use Windows SwitchToThread It appears that windows osyield is just 15ms sleep on my computer (see benchmarks below). Replace NtWaitForSingleObject in osyield with SwitchToThread (as suggested by Dmitry). Also add issue #14790 related benchmarks, so we can track perfomance changes in CL 20834 and CL 20835 and beyond. Update #14790 benchmark old ns/op new ns/op delta BenchmarkChanToSyscallPing1ms 1953200 1953000 -0.01% BenchmarkChanToSyscallPing15ms 31562904 31248400 -1.00% BenchmarkSyscallToSyscallPing1ms 5247 4202 -19.92% BenchmarkSyscallToSyscallPing15ms 5260 4374 -16.84% BenchmarkChanToChanPing1ms 474 494 +4.22% BenchmarkChanToChanPing15ms 468 489 +4.49% BenchmarkOsYield1ms 980018 75.5 -99.99% BenchmarkOsYield15ms 15625200 75.8 -100.00% Change-Id: I1b4cc7caca784e2548ee3c846ca07ef152ebedce Reviewed-on: https://go-review.googlesource.com/21294 Run-TryBot: Alex Brainman <alex.brainman@gmail.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-04 10:05:05 +00:00
Christopher Nelson	ed8f0e5c33	cmd/go: fix -buildmode=c-archive should work on windows Add supporting code for runtime initialization, including both 32- and 64-bit x86 architectures. Add .ctors section on Windows to PE .o files, and INITENTRY to .ctors section to plug in to the GCC C/C++ startup initialization mechanism. This allows the Go runtime to initialize itself. Add .text section symbol for .ctor relocations. Note: This is unlikely to be useful for MSVC-based toolchains. Fixes #13494 Change-Id: I4286a96f70e5f5228acae88eef46e2bed95813f3 Reviewed-on: https://go-review.googlesource.com/18057 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>	2016-04-04 03:38:25 +00:00
Eric Engestrom	7a8caf7d43	all: fix spelling mistakes Signed-off-by: Eric Engestrom <eric@engestrom.ch> Change-Id: I91873aaebf79bdf1c00d38aacc1a1fb8d79656a7 Reviewed-on: https://go-review.googlesource.com/21433 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-03 17:03:15 +00:00
Brad Fitzpatrick	683448a304	runtime, syscall: only search for Windows DLLs in the System32 directory Make sure that for any DLL that Go uses itself, we only look for the DLL in the Windows System32 directory, guarding against DLL preloading attacks. (Unless the Windows version is ancient and LoadLibraryEx is unavailable, in which case the user probably has bigger security problems anyway.) This does not change the behavior of syscall.LoadLibrary or NewLazyDLL if the DLL name is something unused by Go itself. This change also intentionally does not add any new API surface. Instead, x/sys is updated with a LoadLibraryEx function and LazyDLL.Flags in: https://golang.org/cl/21388 Updates #14959 Change-Id: I8d29200559cc19edf8dcf41dbdd39a389cd6aeb9 Reviewed-on: https://go-review.googlesource.com/21140 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-01 22:55:36 +00:00
Ian Lance Taylor	59fc42b230	runtime: allocate mp.cgocallers earlier Fixes #15061. Change-Id: I71f69f398d1c5f3a884bbd044786f1a5600d0fae Reviewed-on: https://go-review.googlesource.com/21398 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-01 22:23:13 +00:00
Ian Lance Taylor	58394fd7d5	runtime/cgo: only build _cgo_callers if x_cgo_callers is defined Fixes a problem when using the external linker on Solaris. The Solaris external linker still doesn't work due to issue #14957. The problem is, for example, with `go test cmd/objdump`: objdump_test.go:71: go build fmthello.go: exit status 2 # command-line-arguments /var/gcc/iant/go/pkg/tool/solaris_amd64/link: running gcc failed: exit status 1 Undefined first referenced symbol in file x_cgo_callers /tmp/go-link-355600608/go.o ld: fatal: symbol referencing errors collect2: error: ld returned 1 exit status Change-Id: I54917cfd5c288ee77ea25c439489bd2c9124fe73 Reviewed-on: https://go-review.googlesource.com/21392 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org>	2016-04-01 16:13:45 +00:00
Ian Lance Taylor	ea306ae625	runtime: support symbolic backtrace of C code in a cgo crash The new function runtime.SetCgoTraceback may be used to register stack traceback and symbolizer functions, written in C, to do a stack traceback from cgo code. There is a sample implementation of runtime.SetCgoSymbolizer at github.com/ianlancetaylor/cgosymbolizer. Just importing that package is sufficient to get symbolic C backtraces. Currently only supported on linux/amd64. Change-Id: If96ee2eb41c6c7379d407b9561b87557bfe47341 Reviewed-on: https://go-review.googlesource.com/17761 Reviewed-by: Austin Clements <austin@google.com>	2016-04-01 04:13:44 +00:00
Matthew Dempsky	e6066711a0	cmd/compile, runtime: fix pedantic int->string conversions Previously, cmd/compile rejected constant int->string conversions if the integer value did not fit into an "int" value. Also, runtime incorrectly truncated 64-bit values to 32-bit before checking if they're a valid Unicode code point. According to the Go spec, both of these cases should instead yield "\uFFFD". Fixes #15039. Change-Id: I3c8a3ad9a0780c0a8dc1911386a523800fec9764 Reviewed-on: https://go-review.googlesource.com/21344 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-03-31 10:28:23 +00:00
Keith Randall	4b209dbf0b	runtime: don't use REP;MOVSB if CPUID doesn't say it is fast Only use REP;MOVSB if: 1) The CPUID flag says it is fast, and 2) The pointers are unaligned Otherwise, use REP;MOVSQ. Update #14630 Change-Id: I946b28b87880c08e5eed1ce2945016466c89db66 Reviewed-on: https://go-review.googlesource.com/21300 Reviewed-by: Nigel Tao <nigeltao@golang.org>	2016-03-31 02:54:10 +00:00
Austin Clements	17f6e5396b	runtime: print sweep ratio if gcpacertrace>0 Change-Id: I5217bf4b75e110ca2946e1abecac6310ed84dad5 Reviewed-on: https://go-review.googlesource.com/21205 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-03-30 02:27:58 +00:00
Michel Lespinasse	859b63cc09	cmd/compile: optimize remaining convT2I calls See #14874 Updates #6853 This change adds a compiler optimization for non pointer shaped convT2I. Since itab symbols are now emitted by the compiler, the itab address can be passed directly to convT2I instead of passing the iface type and a cache pointer argument. Compilebench results for the 5-commits series ending here: name old time/op new time/op delta Template 336ms ± 4% 344ms ± 4% +2.61% (p=0.027 n=9+8) Unicode 165ms ± 6% 173ms ± 7% +5.11% (p=0.014 n=9+9) GoTypes 1.09s ± 1% 1.06s ± 2% -3.29% (p=0.000 n=9+9) Compiler 5.09s ±10% 4.75s ±10% -6.64% (p=0.011 n=10+10) MakeBash 31.1s ± 5% 30.3s ± 3% ~ (p=0.089 n=10+10) name old text-bytes new text-bytes delta HelloSize 558k ± 0% 558k ± 0% +0.02% (p=0.000 n=10+10) CmdGoSize 6.24M ± 0% 6.11M ± 0% -2.11% (p=0.000 n=10+10) name old data-bytes new data-bytes delta HelloSize 3.66k ± 0% 3.74k ± 0% +2.41% (p=0.000 n=10+10) CmdGoSize 134k ± 0% 162k ± 0% +20.76% (p=0.000 n=10+10) name old bss-bytes new bss-bytes delta HelloSize 126k ± 0% 126k ± 0% ~ (all samples are equal) CmdGoSize 149k ± 0% 146k ± 0% -2.17% (p=0.000 n=10+10) name old exe-bytes new exe-bytes delta HelloSize 924k ± 0% 924k ± 0% +0.05% (p=0.000 n=10+10) CmdGoSize 9.77M ± 0% 9.62M ± 0% -1.47% (p=0.000 n=10+10) Change-Id: Ib230ddc04988824035c32287ae544a965fedd344 Reviewed-on: https://go-review.googlesource.com/20902 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org> Run-TryBot: Michel Lespinasse <walken@google.com>	2016-03-29 02:21:50 +00:00
Michel Lespinasse	79688ca58f	cmd/link: collect itablinks as a slice in moduledata See #14874 This change tells the linker to collect all the itablink symbols and collect them so that moduledata can have a slice of all compiler generated itabs. The logic is shamelessly adapted from what is done with typelink symbols. Change-Id: Ie93b59acf0fcba908a876d506afbf796f222dbac Reviewed-on: https://go-review.googlesource.com/20889 Reviewed-by: Keith Randall <khr@golang.org>	2016-03-29 02:18:56 +00:00
Michel Lespinasse	f00bbd5f81	cmd/compile: emit itabs and itablinks See #14874 This change tells the compiler to emit itab and itablink symbols in situations where they could be useful; however the compiled code does not actually make use of the new symbols yet. Change-Id: I0db3e6ec0cb1f3b7cebd4c60229e4a48372fe586 Reviewed-on: https://go-review.googlesource.com/20888 Reviewed-by: David Crawshaw <crawshaw@golang.org> Run-TryBot: Michel Lespinasse <walken@google.com>	2016-03-29 02:18:03 +00:00
Michel Lespinasse	7043d2bb5e	runtime: insert itabs into hash table during init See #14874 This change makes the runtime register all compiler generated itabs (as obtained from the moduledata) during init. Change-Id: I9969a0985b99b8bda820a631f7fe4c78f1174cdf Reviewed-on: https://go-review.googlesource.com/20900 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Michel Lespinasse <walken@google.com>	2016-03-29 02:14:49 +00:00
Matthew Dempsky	deb83d0639	cmd/compile: remove unused write barrier helpers These have been unused since CL 10316. Passes toolstash -cmp. Change-Id: Icc19f3fcc7275fbee1c665f704e10a110ecce2a5 Reviewed-on: https://go-review.googlesource.com/21242 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-03-29 00:16:55 +00:00
Shinji Tanaka	0f86d1edfb	runtime: use set_thread_area instead of modify_ldt on linux/386 linux/386 depends on modify_ldt system call, but recent Linux kernels can disable this system call. Any Go programs built as linux/386 crash with the message 'Trace/breakpoint trap'. The kernel config CONFIG_MODIFY_LDT_SYSCALL, which control enable/disable modify_ldt, is disabled on Amazon Linux 2016.03. This fixes this problem by using set_thread_area instead of modify_ldt on linux/386. Fixes #14795. Change-Id: I0cc5139e40e9e5591945164156a77b6bdff2c7f1 Reviewed-on: https://go-review.googlesource.com/21190 Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Minux Ma <minux@golang.org>	2016-03-28 16:56:38 +00:00
David Chase	8eec2bbfbc	cmd/compile: added some intrinsics to SSA back end One intrinsic was needed to help get the very best performance out of a future GC; as long as that one was being added, I also added Bswap since that is sometimes a handy thing to have. I had intended to fill out the bit-scan intrinsic family, but the mismatch between the "scan forward" instruction and "count leading zeroes" was large enough to cause me to leave it out -- it poses a dilemma that I'd rather dodge right now. These intrinsics are not exposed for general use. That's a separate issue requiring an API proposal change ( https://github.com/golang/proposal ) All intrinsics are tested, both that they are substituted on the appropriate architecture, and that they produce the expected result. Change-Id: I5848037cfd97de4f75bdc33bdd89bba00af4a8ee Reviewed-on: https://go-review.googlesource.com/20564 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-28 16:29:59 +00:00
Matthew Dempsky	995fb0319e	cmd/compile: fix stringtoslicebytetmp optimization Fixes #14973. Change-Id: Iea68c9deca9429bde465c9ae05639209fe0ccf72 Reviewed-on: https://go-review.googlesource.com/21175 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-03-27 19:12:37 +00:00
Shenghou Ma	a7f7a9cca7	runtime, runtime/cgo: save callee-saved FP registers on arm64 For #14876. Change-Id: I0992859264cbaf9c9b691fad53345bbb01b4cf3b Reviewed-on: https://go-review.googlesource.com/21085 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-25 23:04:44 +00:00
Shenghou Ma	71916437fb	runtime/cgo: save callee-saved xmm registers on windows/amd64 For #14876. Change-Id: I33947f74e8058437a784862f1f064974afc99250 Reviewed-on: https://go-review.googlesource.com/21084 Reviewed-by: Alex Brainman <alex.brainman@gmail.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-03-25 23:04:33 +00:00
Joe Sylve	562e38c0af	runtime: fix signal handling on Solaris This fixes the problems with signal handling that were inadvertently introduced in https://go-review.googlesource.com/21006. Fixes #14899 Change-Id: Ia746914dcb3146a52413d32c57b089af763f0810 Reviewed-on: https://go-review.googlesource.com/21145 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-03-25 21:38:47 +00:00
Marvin Stenger	6b0688f742	runtime: speed up growslice by avoiding divisions 2 This is a follow-up of https://go-review.googlesource.com/#/c/20653/ Special case computation for slices with elements of byte size or pointer size. name old time/op new time/op delta GrowSliceBytes-4 86.2ns ± 3% 75.4ns ± 2% -12.50% (p=0.000 n=20+20) GrowSliceInts-4 161ns ± 3% 136ns ± 3% -15.59% (p=0.000 n=19+19) GrowSlicePtr-4 239ns ± 2% 233ns ± 2% -2.52% (p=0.000 n=20+20) GrowSliceStruct24Bytes-4 258ns ± 3% 256ns ± 3% ~ (p=0.134 n=20+20) Change-Id: Ice5fa648058fe9d7fa89dee97ca359966f671128 Reviewed-on: https://go-review.googlesource.com/21101 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-25 16:47:56 +00:00
Richard Miller	967b9940b4	runtime: avoid fork/exit race in plan9 There's a race between runtime.goexitsall killing all OS processes of a go program in order to exit, and runtime.newosproc forking a new one. If the new process has been created but not yet stored its pid in m.procid, it will not be killed by goexitsall and deadlock results. This CL prevents the race by making the newly forked process check whether the program is exiting. It also prevents a potential "shoot-out" if multiple goroutines call Exit at the same time, which could possibly lead to two processes killing each other and leaving the rest deadlocked. Change-Id: I3170b4a62d2461f6b029b3d6aad70373714ed53e Reviewed-on: https://go-review.googlesource.com/21135 Run-TryBot: David du Colombier <0intro@gmail.com> Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David du Colombier <0intro@gmail.com>	2016-03-25 15:04:45 +00:00
Dmitry Vyukov	ea0386f85f	runtime: improve randomized stealing logic During random stealing we steal 4*GOMAXPROCS times from random procs. One would expect that most of the time we check all procs this way, but due to low quality PRNG we actually miss procs with frightening probability. Below are modelling experiment results for 1e6 tries: GOMAXPROCS = 2 : missed 1 procs 7944 times GOMAXPROCS = 3 : missed 1 procs 101620 times GOMAXPROCS = 3 : missed 2 procs 3571 times GOMAXPROCS = 4 : missed 1 procs 63916 times GOMAXPROCS = 4 : missed 2 procs 61 times GOMAXPROCS = 4 : missed 3 procs 16 times GOMAXPROCS = 5 : missed 1 procs 133136 times GOMAXPROCS = 5 : missed 2 procs 1025 times GOMAXPROCS = 5 : missed 3 procs 101 times GOMAXPROCS = 5 : missed 4 procs 15 times GOMAXPROCS = 8 : missed 1 procs 151765 times GOMAXPROCS = 8 : missed 2 procs 5057 times GOMAXPROCS = 8 : missed 3 procs 1726 times GOMAXPROCS = 8 : missed 4 procs 68 times GOMAXPROCS = 12 : missed 1 procs 199081 times GOMAXPROCS = 12 : missed 2 procs 27489 times GOMAXPROCS = 12 : missed 3 procs 3113 times GOMAXPROCS = 12 : missed 4 procs 233 times GOMAXPROCS = 12 : missed 5 procs 9 times GOMAXPROCS = 16 : missed 1 procs 237477 times GOMAXPROCS = 16 : missed 2 procs 30037 times GOMAXPROCS = 16 : missed 3 procs 9466 times GOMAXPROCS = 16 : missed 4 procs 1334 times GOMAXPROCS = 16 : missed 5 procs 192 times GOMAXPROCS = 16 : missed 6 procs 5 times GOMAXPROCS = 16 : missed 7 procs 1 times GOMAXPROCS = 16 : missed 8 procs 1 times A missed proc won't lead to underutilization because we check all procs again after dropping P. But it can lead to an unpleasant situation when we miss a proc, drop P, check all procs, discover work, acquire P, miss the proc again, repeat. Improve stealing logic to cover all procs. Also don't enter spinning mode and try to steal when there is nobody around. Change-Id: Ibb6b122cc7fb836991bad7d0639b77c807aab4c2 Reviewed-on: https://go-review.googlesource.com/20836 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>	2016-03-25 11:00:48 +00:00
David Crawshaw	24ce64d1a9	cmd/compile, runtime: new static name encoding Create a byte encoding designed for static Go names. It is intended to be a compact representation of a name and optional tag data that can be turned into a Go string without allocating, and describes whether or not it is exported without unicode table. The encoding is described in reflect/type.go: // The first byte is a bit field containing: // // 1<<0 the name is exported // 1<<1 tag data follows the name // 1<<2 pkgPath string follow the name and tag // // The next two bytes are the data length: // // l := uint16(data[1])<<8 \| uint16(data[2]) // // Bytes [3:3+l] are the string data. // // If tag data follows then bytes 3+l and 3+l+1 are the tag length, // with the data following. // // If the import path follows, then ptrSize bytes at the end of // the data form a string. The import path is only set for concrete // methods that are defined in a different package than their type. Shrinks binary sizes: cmd/go: 164KB (1.6%) jujud: 1.0MB (1.5%) For #6853. Change-Id: I46b6591015b17936a443c9efb5009de8dfe8b609 Reviewed-on: https://go-review.googlesource.com/20968 Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-03-25 00:13:49 +00:00
Joe Sylve	df2b2eb63d	runtime: improve last ditch signal forwarding for Unix libraries The current runtime attempts to forward signals generated by non-Go code to the original signal handler. If it can't call the original handler directly, it currently attempts to re-raise the signal after resetting the handler. In this case, the original context is lost. This fix prevents that problem by simply returning from the go signal handler after resetting the original handler. It only does this when the original handler is the system default handler, which in all cases is known to not recover. The signal is not reset, so it is retriggered and the original handler takes over with the proper context. Fixes #14899 Change-Id: Ib1c19dfa4b50d9732d7a453de3784c8141e1cbb3 Reviewed-on: https://go-review.googlesource.com/21006 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2016-03-24 19:34:17 +00:00
Marvin Stenger	44532f1a9d	runtime: fix inconsistency in slice.go Fixes #14938. Additionally some simplifications along the way. Change-Id: I2c5fb7e32dcc6fab68fff36a49cb72e715756abe Reviewed-on: https://go-review.googlesource.com/21046 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>	2016-03-24 18:17:28 +00:00
Elias Naur	be10c51500	runtime/cgo: block signals to the iOS mach exception handler For darwin/arm{,64} a non-Go thread is created to convert EXC_BAD_ACCESS to panics. However, the Go signal handler refuse to handle signals that would otherwise be ignored if they arrive at non-Go threads. Block all (posix) signals to that thread, making sure that no unexpected signals arrive to it. At least one test, TestStop in os/signal, depends on signals not arriving on any non-Go threads. For #14318 Change-Id: I901467fb53bdadb0d03b0f1a537116c7f4754423 Reviewed-on: https://go-review.googlesource.com/21047 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2016-03-24 14:12:12 +00:00
Lynn Boger	baec148767	bytes: Equal perf improvements on ppc64le/ppc64 The existing implementation for Equal and similar functions in the bytes package operate on one byte at at time. This performs poorly on ppc64/ppc64le especially when the byte buffers are large. This change improves those functions by loading and comparing double words where possible. The common code has been moved to a function that can be shared by the other functions in this file which perform the same type of comparison. Further optimizations are done for the case where >= 32 bytes are being compared. The new function memeqbody is used by memeq_varlen, Equal, and eqstring. When running the bytes test with -test.bench=Equal benchmark old MB/s new MB/s speedup BenchmarkEqual1 164.83 129.49 0.79x BenchmarkEqual6 563.51 445.47 0.79x BenchmarkEqual9 656.15 1099.00 1.67x BenchmarkEqual15 591.93 1024.30 1.73x BenchmarkEqual16 613.25 1914.12 3.12x BenchmarkEqual20 682.37 1687.04 2.47x BenchmarkEqual32 807.96 3843.29 4.76x BenchmarkEqual4K 1076.25 23280.51 21.63x BenchmarkEqual4M 1079.30 13120.14 12.16x BenchmarkEqual64M 1073.28 10876.92 10.13x It was determined that the degradation in the smaller byte tests were due to unfavorable code alignment of the single byte loop. Fixes #14368 Change-Id: I0dd87382c28887c70f4fbe80877a8ba03c31d7cd Reviewed-on: https://go-review.googlesource.com/20249 Reviewed-by: Minux Ma <minux@golang.org>	2016-03-23 14:21:15 +00:00

1 2 3 4 5 ...

1948 Commits