qbit/go - go - Tape:neT

qbit/go

mirror of https://github.com/golang/go synced 2024-11-19 18:24:39 -07:00

Author	SHA1	Message	Date
Austin Clements	6c4a8d195b	runtime: don't hold global gcBitsArenas lock over allocation Currently, newArena holds the gcBitsArenas lock across allocating memory from the OS for a new gcBits arena. This is a global lock and allocating physical memory can be expensive, so this has the potential to cause high lock contention, especially since every single span sweep operation calls newArena (via newMarkBits). Improve the situation by temporarily dropping the lock across allocation. This means the caller now has to revalidate its assumptions after the lock is dropped, so this also factors out that code path and reinvokes it after the lock is acquired. Change-Id: I1113200a954ab4aad16b5071512583cfac744bdc Reviewed-on: https://go-review.googlesource.com/34594 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-06 18:40:23 +00:00
Eitan Adler	789c5255a4	all: remove the the duplicate words Change-Id: I6343c162e27e2e492547c96f1fc504909b1c03c0 Reviewed-on: https://go-review.googlesource.com/37793 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-03-06 04:39:12 +00:00
Josh Bleecher Snyder	d4451362c0	runtime: add slicebytetostring benchmark Change-Id: I666d2c6ea8d0b54a71260809d1a2573b122865b2 Reviewed-on: https://go-review.googlesource.com/37790 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-03-05 05:14:08 +00:00
Austin Clements	4a7cf960c3	runtime: make ReadMemStats STW for < 25µs Currently ReadMemStats stops the world for ~1.7 ms/GB of heap because it collects statistics from every single span. For large heaps, this can be quite costly. This is particularly unfortunate because many production infrastructures call this function regularly to collect and report statistics. Fix this by tracking the necessary cumulative statistics in the mcaches. ReadMemStats still has to stop the world to stabilize these statistics, but there are only O(GOMAXPROCS) mcaches to collect statistics from, so this pause is only 25µs even at GOMAXPROCS=100. Fixes #13613. Change-Id: I3c0a4e14833f4760dab675efc1916e73b4c0032a Reviewed-on: https://go-review.googlesource.com/34937 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-04 02:56:37 +00:00
Austin Clements	3399fd254d	runtime: remove unused gcstats The gcstats structure is no longer consumed by anything and no longer tracks statistics that are particularly relevant to the concurrent garbage collector. Remove it. (Having statistics is probably a good idea, but these aren't the stats we need these days and we don't have a way to get them out of the runtime.) In preparation for #13613. Change-Id: Ib63e2f9067850668f9dcbfd4ed89aab4a6622c3f Reviewed-on: https://go-review.googlesource.com/34936 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-04 02:56:35 +00:00
Elias Naur	7523baed09	misc/ios,cmd/go, runtime/cgo: fix iOS test harness (again) The iOS test harness was recently changed in response to lldb bugs to replace breakpoints with the SIGUSR2 signal (CL 34926), and to pass the current directory in the test binary arguments (CL 35152). Both the signal sending and working directory setup is done from the go test driver. However, the new method doesn't work with tests where a C program is the test driver instead of go test: the current working directory will not be changed and SIGUSR2 is not raised. Instead of copying that logic into any C test program, rework the test harness (again) to move the setup logic to the early runtime cgo setup code. That way, the harness will run even in the library build modes. Then, use the app Info.plist file to pass the working directory, removing the need to alter the arguments after running. Finally, use the SIGINT signal instead of SIGUSR2 to avoid manipulating the signal masks or handlers. Fixes the testcarchive tests on iOS. With this CL, both darwin/arm and darwin/arm64 passes all.bash. This CL replaces CL 34926, CL 35152 as well as the fixup CL 35123 and CL 35255. They are reverted in CLs earlier in the relation chain. Change-Id: I8485c7db1404fbd8daa261efd1ea89e905121a3e Reviewed-on: https://go-review.googlesource.com/36090 Run-TryBot: Elias Naur <elias.naur@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org>	2017-03-04 01:43:13 +00:00
David Lazar	781fd3998e	runtime: use inlining tables to generate accurate tracebacks The code in https://play.golang.org/p/aYQPrTtzoK now produces the following stack trace: goroutine 1 [running]: main.(*point).negate(...) /tmp/go/main.go:8 main.main() /tmp/go/main.go:14 +0x23 Previously the stack trace missed the inlined call: goroutine 1 [running]: main.main() /tmp/go/main.go:14 +0x23 Fixes #10152. Updates #19348. Change-Id: Ib43c67012f53da0ef1a1e69bcafb65b57d9cecb2 Reviewed-on: https://go-review.googlesource.com/37233 Run-TryBot: David Lazar <lazard@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-03-03 21:29:34 +00:00
David Lazar	699175a11a	cmd/compile,link: generate PC-value tables with inlining information In order to generate accurate tracebacks, the runtime needs to know the inlined call stack for a given PC. This creates two tables per function for this purpose. The first table is the inlining tree (stored in the function's funcdata), which has a node containing the file, line, and function name for every inlined call. The second table is a PC-value table that maps each PC to a node in the inlining tree (or -1 if the PC is not the result of inlining). To give the appearance that inlining hasn't happened, the runtime also needs the original source position information of inlined AST nodes. Previously the compiler plastered over the line numbers of inlined AST nodes with the line number of the call. This meant that the PC-line table mapped each PC to line number of the outermost call in its inlined call stack, with no way to access the innermost line number. Now the compiler retains line numbers of inlined AST nodes and writes the innermost source position information to the PC-line and PC-file tables. Some tools and tests expect to see outermost line numbers, so we provide the OutermostLine function for displaying line info. To keep track of the inlined call stack for an AST node, we extend the src.PosBase type with an index into a global inlining tree. Every time the compiler inlines a call, it creates a node in the global inlining tree for the call, and writes its index to the PosBase of every inlined AST node. The parent of this node is the inlining tree index of the call. -1 signifies no parent. For each function, the compiler creates a local inlining tree and a PC-value table mapping each PC to an index in the local tree. These are written to an object file, which is read by the linker. The linker re-encodes these tables compactly by deduplicating function names and file names. This change increases the size of binaries by 4-5%. For example, this is how the go1 benchmark binary is impacted by this change: section old bytes new bytes delta .text 3.49M ± 0% 3.49M ± 0% +0.06% .rodata 1.12M ± 0% 1.21M ± 0% +8.21% .gopclntab 1.50M ± 0% 1.68M ± 0% +11.89% .debug_line 338k ± 0% 435k ± 0% +28.78% Total 9.21M ± 0% 9.58M ± 0% +4.01% Updates #19348. Change-Id: Ic4f180c3b516018138236b0c35e0218270d957d3 Reviewed-on: https://go-review.googlesource.com/37231 Run-TryBot: David Lazar <lazard@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-03-03 21:29:30 +00:00
Austin Clements	77f64c50db	runtime: clarify access to mheap_.busy There are two accesses to mheap_.busy that are guarded by checks against len(mheap_.free). This works because both lists are (and must be) the same length, but it makes the code less clear. Change these to use len(mheap_.busy) so the access more clearly parallels the check. Fixes #18944. Change-Id: I9bacbd3663988df351ed4396ae9018bc71018311 Reviewed-on: https://go-review.googlesource.com/36354 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-03 17:02:18 +00:00
Austin Clements	b50b728587	runtime: simplify sweep allocation counting Currently sweep counts the number of allocated objects, computes the number of free objects from that, then re-computes the number of allocated objects from that. Simplify and clean this up by skipping these intermediate steps. Change-Id: I3ed98e371eb54bbcab7c8530466c4ab5fde35f0a Reviewed-on: https://go-review.googlesource.com/34935 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-03 17:02:16 +00:00
Austin Clements	f1ba75f8c5	runtime: don't rescan finalizers queue during mark termination Currently we scan the finalizers queue both during concurrent mark and during mark termination. This costs roughly 20ns per queued finalizer and about 1ns per unused finalizer queue slot (allocated queue length never decreases), which can drive up STW time if there are many finalizers. However, we only add finalizers to this queue during sweeping, which means that the second scan will never find anything new. Hence, we can fix this by simply not scanning the finalizers queue during mark termination. This brings the STW time under the 100µs goal even with 1,000,000 queued finalizers. Fixes #18869. Change-Id: I4ce5620c66fb7f13ebeb39ca313ce57047d1d0fb Reviewed-on: https://go-review.googlesource.com/36013 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-03 17:02:14 +00:00
Austin Clements	98da2d1f91	runtime: remove wbufptr Since workbuf is now marked go:notinheap, the write barrier-preventing wrapper type wbufptr is no longer necessary. Remove it. Change-Id: I3e5b5803a1547d65de1c1a9c22458a38e08549b7 Reviewed-on: https://go-review.googlesource.com/35971 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-03 17:02:12 +00:00
Josh Bleecher Snyder	9b15c13dc5	runtime/pprof: fix data race between Profile.Add and Profile.WriteTo p.m is accessed in WriteTo without holding p.mu. Move the access inside the critical section. The race detector catches this bug using this program: package main import ( "os" "runtime/pprof" "time" ) func main() { p := pprof.NewProfile("ABC") go func() { p.WriteTo(os.Stdout, 1) time.Sleep(time.Second) }() p.Add("abc", 0) time.Sleep(time.Second) } $ go run -race x.go ================== WARNING: DATA RACE Write at 0x00c42007c240 by main goroutine: runtime.mapassign() /Users/josh/go/tip/src/runtime/hashmap.go:485 +0x0 runtime/pprof.(Profile).Add() /Users/josh/go/tip/src/runtime/pprof/pprof.go:281 +0x255 main.main() /Users/josh/go/tip/src/p.go:15 +0x9d Previous read at 0x00c42007c240 by goroutine 6: runtime/pprof.(Profile).WriteTo() /Users/josh/go/tip/src/runtime/pprof/pprof.go:314 +0xc5 main.main.func1() /Users/josh/go/tip/src/x.go:12 +0x69 Goroutine 6 (running) created at: main.main() /Users/josh/go/tip/src/x.go:11 +0x6e ================== ABC profile: total 1 1 @ 0x110ccb4 0x111aeee 0x1055053 0x107f031 Found 1 data race(s) exit status 66 (Exit status 66?) Change-Id: I49d884dc3af9cce2209057a3448fe6bf50653523 Reviewed-on: https://go-review.googlesource.com/37730 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-03-02 23:30:07 +00:00
Josh Bleecher Snyder	04fc887761	runtime: delay marking maps as writing until after first alg call Fixes #19359 Change-Id: I196b47cf0471915b6dc63785e8542aa1876ff695 Reviewed-on: https://go-review.googlesource.com/37665 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-02 17:38:30 +00:00
Lynn Boger	e54bc92a2c	runtime, cmd/go: roll back stale message, test detail Some debugging code was recently added to: 1) provide more detail for the stale reason when it is determined that a package is stale 2) provide file and package time and date information when it is determined that runtime.a is stale This backs out those those debugging messages. Fixes #19116 Change-Id: I8dd0cbe29324820275b481d8bbb78ff2c5fbc362 Reviewed-on: https://go-review.googlesource.com/37382 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-03-01 18:50:27 +00:00
Josh Bleecher Snyder	064e44f218	runtime: evacuate old map buckets more consistently During map growth, buckets are evacuated in two ways. When a value is altered, its containing bucket is evacuated. Also, an evacuation mark is maintained and advanced every time. Prior to this CL, the evacuation mark was always incremented, even if the next bucket to be evacuated had already been evacuated. This CL changes evacuation mark advancement to skip previously evacuated buckets. This has the effect of making map evacuation both more aggressive and more consistent. Aggressive map evacuation is good. While the map is growing, map accesses must check two buckets, which may be far apart in memory. Map growth also delays garbage collection. And if map evacuation is not aggressive enough, there is a risk that a populate-once read-many map may be stuck permanently in map growth. This CL does not eliminate that possibility, but it shrinks the window. There is minimal impact on map benchmarks: name old time/op new time/op delta MapPop100-8 12.4µs ±11% 12.4µs ± 7% ~ (p=0.798 n=15+15) MapPop1000-8 240µs ± 8% 235µs ± 8% ~ (p=0.217 n=15+14) MapPop10000-8 4.49ms ±10% 4.51ms ±15% ~ (p=1.000 n=15+13) MegMap-8 11.9ns ± 2% 11.8ns ± 0% -1.01% (p=0.000 n=15+11) MegOneMap-8 9.30ns ± 1% 9.29ns ± 1% ~ (p=0.955 n=14+14) MegEqMap-8 31.9µs ± 5% 31.9µs ± 3% ~ (p=0.935 n=15+15) MegEmptyMap-8 2.41ns ± 2% 2.41ns ± 0% ~ (p=0.594 n=12+14) SmallStrMap-8 12.8ns ± 1% 12.7ns ± 1% ~ (p=0.569 n=14+13) MapStringKeysEight_16-8 13.6ns ± 1% 13.7ns ± 2% ~ (p=0.100 n=13+15) MapStringKeysEight_32-8 12.1ns ± 1% 12.1ns ± 2% ~ (p=0.340 n=15+15) MapStringKeysEight_64-8 12.1ns ± 1% 12.1ns ± 2% ~ (p=0.582 n=15+14) MapStringKeysEight_1M-8 12.0ns ± 1% 12.1ns ± 1% ~ (p=0.267 n=15+14) IntMap-8 7.96ns ± 1% 7.97ns ± 2% ~ (p=0.991 n=15+13) RepeatedLookupStrMapKey32-8 15.8ns ± 2% 15.8ns ± 1% ~ (p=0.393 n=15+14) RepeatedLookupStrMapKey1M-8 35.3µs ± 2% 35.3µs ± 1% ~ (p=0.815 n=15+15) NewEmptyMap-8 36.0ns ± 4% 36.4ns ± 7% ~ (p=0.270 n=15+15) NewSmallMap-8 85.5ns ± 1% 85.6ns ± 1% ~ (p=0.674 n=14+15) MapIter-8 89.9ns ± 6% 90.8ns ± 6% ~ (p=0.467 n=15+15) MapIterEmpty-8 10.0ns ±22% 10.0ns ±25% ~ (p=0.846 n=15+15) SameLengthMap-8 4.18ns ± 1% 4.17ns ± 1% ~ (p=0.653 n=15+14) BigKeyMap-8 20.2ns ± 1% 20.1ns ± 1% -0.82% (p=0.002 n=15+15) BigValMap-8 22.5ns ± 8% 22.3ns ± 6% ~ (p=0.615 n=15+15) SmallKeyMap-8 15.3ns ± 1% 15.3ns ± 1% ~ (p=0.754 n=15+14) ComplexAlgMap-8 58.4ns ± 1% 58.7ns ± 1% +0.52% (p=0.000 n=14+15) There is a tiny but detectable difference in the compiler: name old time/op new time/op delta Template 218ms ± 5% 219ms ± 4% ~ (p=0.094 n=98+98) Unicode 93.6ms ± 5% 93.6ms ± 4% ~ (p=0.910 n=94+95) GoTypes 596ms ± 5% 598ms ± 6% ~ (p=0.533 n=98+100) Compiler 2.72s ± 3% 2.72s ± 4% ~ (p=0.238 n=100+99) SSA 4.11s ± 3% 4.11s ± 3% ~ (p=0.864 n=99+98) Flate 129ms ± 6% 129ms ± 4% ~ (p=0.522 n=98+96) GoParser 151ms ± 4% 151ms ± 4% -0.48% (p=0.017 n=96+96) Reflect 379ms ± 3% 376ms ± 4% -0.57% (p=0.011 n=99+99) Tar 112ms ± 5% 112ms ± 6% ~ (p=0.688 n=93+95) XML 214ms ± 4% 214ms ± 5% ~ (p=0.968 n=100+99) StdCmd 16.2s ± 2% 16.2s ± 2% -0.26% (p=0.048 n=99+99) name old user-ns/op new user-ns/op delta Template 252user-ms ± 4% 250user-ms ± 4% -0.63% (p=0.020 n=98+97) Unicode 113user-ms ± 7% 114user-ms ± 5% ~ (p=0.057 n=97+94) GoTypes 776user-ms ± 5% 777user-ms ± 5% ~ (p=0.375 n=97+96) Compiler 3.61user-s ± 3% 3.60user-s ± 3% ~ (p=0.445 n=98+93) SSA 5.84user-s ± 6% 5.85user-s ± 5% ~ (p=0.542 n=100+95) Flate 154user-ms ± 5% 154user-ms ± 5% ~ (p=0.699 n=99+99) GoParser 184user-ms ± 6% 183user-ms ± 4% ~ (p=0.557 n=98+95) Reflect 461user-ms ± 5% 462user-ms ± 4% ~ (p=0.853 n=97+99) Tar 130user-ms ± 5% 129user-ms ± 6% ~ (p=0.567 n=93+100) XML 257user-ms ± 6% 258user-ms ± 6% ~ (p=0.205 n=99+100) Change-Id: Id92dd54a152904069aac415e6aaaab5c67f5f476 Reviewed-on: https://go-review.googlesource.com/37011 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-28 20:42:50 +00:00
Josh Bleecher Snyder	504bc3ed24	cmd/compile, runtime: specialize convT2x, don't alloc for zero vals Prior to this CL, all runtime conversions from a concrete value to an interface went through one of two runtime calls: convT2E or convT2I. However, in practice, basic types are very common. Specializing convT2x for those basic types allows for a more efficient implementation for those types. For basic scalars and strings, allocation and copying can use the same methods as normal code. For pointer-free types, allocation can occur without zeroing, and copying can take place without GC calls. For slices, copying is cheaper and simpler. This CL adds twelve runtime routines: convT2E16, convT2I16 convT2E32, convT2I32 convT2E64, convT2I64 convT2Estring, convT2Istring convT2Eslice, convT2Islice convT2Enoptr, convT2Inoptr While compiling make.bash, 93% of all convT2x calls are now to one of these specialized convT2x call. Within specialized convT2x routines, it is cheap to check for a zero value, in a way that it is not in general. When we detect a zero value there, we return a pointer to zeroVal, rather than allocating. name old time/op new time/op delta ConvT2Ezero/zero/16-8 17.9ns ± 2% 3.0ns ± 3% -83.20% (p=0.000 n=56+56) ConvT2Ezero/zero/32-8 17.8ns ± 2% 3.0ns ± 3% -83.15% (p=0.000 n=59+60) ConvT2Ezero/zero/64-8 20.1ns ± 1% 3.0ns ± 2% -84.98% (p=0.000 n=57+57) ConvT2Ezero/zero/str-8 32.6ns ± 1% 3.0ns ± 4% -90.70% (p=0.000 n=59+60) ConvT2Ezero/zero/slice-8 36.7ns ± 2% 3.0ns ± 2% -91.78% (p=0.000 n=59+59) ConvT2Ezero/zero/big-8 91.9ns ± 2% 85.9ns ± 2% -6.52% (p=0.000 n=57+57) ConvT2Ezero/nonzero/16-8 17.7ns ± 2% 12.7ns ± 3% -28.38% (p=0.000 n=55+60) ConvT2Ezero/nonzero/32-8 17.8ns ± 1% 12.7ns ± 1% -28.44% (p=0.000 n=54+57) ConvT2Ezero/nonzero/64-8 20.0ns ± 1% 15.0ns ± 1% -24.90% (p=0.000 n=56+58) ConvT2Ezero/nonzero/str-8 32.6ns ± 1% 25.7ns ± 1% -21.17% (p=0.000 n=58+55) ConvT2Ezero/nonzero/slice-8 36.8ns ± 2% 30.4ns ± 1% -17.32% (p=0.000 n=60+52) ConvT2Ezero/nonzero/big-8 92.1ns ± 2% 85.9ns ± 2% -6.70% (p=0.000 n=57+59) Benchmarks on a real program (the compiler): name old time/op new time/op delta Template 227ms ± 5% 221ms ± 2% -2.48% (p=0.000 n=30+26) Unicode 102ms ± 5% 100ms ± 3% -1.30% (p=0.009 n=30+26) GoTypes 656ms ± 5% 659ms ± 4% ~ (p=0.208 n=30+30) Compiler 2.82s ± 2% 2.82s ± 1% ~ (p=0.614 n=29+27) Flate 128ms ± 2% 128ms ± 5% ~ (p=0.783 n=27+28) GoParser 158ms ± 3% 158ms ± 3% ~ (p=0.261 n=28+30) Reflect 408ms ± 7% 401ms ± 3% ~ (p=0.075 n=30+30) Tar 123ms ± 6% 121ms ± 8% ~ (p=0.287 n=29+30) XML 220ms ± 2% 220ms ± 4% ~ (p=0.805 n=29+29) name old user-ns/op new user-ns/op delta Template 281user-ms ± 4% 279user-ms ± 3% -0.87% (p=0.044 n=28+28) Unicode 142user-ms ± 4% 141user-ms ± 3% -1.04% (p=0.015 n=30+27) GoTypes 884user-ms ± 3% 886user-ms ± 2% ~ (p=0.532 n=30+30) Compiler 3.94user-s ± 3% 3.92user-s ± 1% ~ (p=0.185 n=30+28) Flate 165user-ms ± 2% 165user-ms ± 4% ~ (p=0.780 n=27+29) GoParser 209user-ms ± 2% 208user-ms ± 3% ~ (p=0.453 n=28+30) Reflect 533user-ms ± 6% 526user-ms ± 3% ~ (p=0.057 n=30+30) Tar 156user-ms ± 6% 154user-ms ± 6% ~ (p=0.133 n=29+30) XML 288user-ms ± 4% 288user-ms ± 4% ~ (p=0.633 n=30+30) name old alloc/op new alloc/op delta Template 41.0MB ± 0% 40.9MB ± 0% -0.11% (p=0.000 n=29+29) Unicode 32.6MB ± 0% 32.6MB ± 0% ~ (p=0.572 n=29+30) GoTypes 122MB ± 0% 122MB ± 0% -0.10% (p=0.000 n=30+30) Compiler 482MB ± 0% 481MB ± 0% -0.07% (p=0.000 n=30+29) Flate 26.6MB ± 0% 26.6MB ± 0% ~ (p=0.096 n=30+30) GoParser 32.7MB ± 0% 32.6MB ± 0% -0.06% (p=0.011 n=28+28) Reflect 84.2MB ± 0% 84.1MB ± 0% -0.17% (p=0.000 n=29+30) Tar 27.7MB ± 0% 27.7MB ± 0% -0.05% (p=0.032 n=27+28) XML 44.7MB ± 0% 44.7MB ± 0% ~ (p=0.131 n=28+30) name old allocs/op new allocs/op delta Template 373k ± 1% 370k ± 1% -0.76% (p=0.000 n=30+30) Unicode 325k ± 1% 325k ± 1% ~ (p=0.383 n=29+30) GoTypes 1.16M ± 0% 1.15M ± 0% -0.75% (p=0.000 n=29+30) Compiler 4.15M ± 0% 4.13M ± 0% -0.59% (p=0.000 n=30+29) Flate 238k ± 1% 237k ± 1% -0.62% (p=0.000 n=30+30) GoParser 304k ± 1% 302k ± 1% -0.64% (p=0.000 n=30+28) Reflect 1.00M ± 0% 0.99M ± 0% -1.10% (p=0.000 n=29+30) Tar 245k ± 1% 244k ± 1% -0.59% (p=0.000 n=27+29) XML 391k ± 1% 389k ± 1% -0.59% (p=0.000 n=29+30) Change-Id: Id7f456d690567c2b0a96b0d6d64de8784b6e305f Reviewed-on: https://go-review.googlesource.com/36476 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-28 19:23:33 +00:00
Austin Clements	bab191042b	cmd/internal/obj, runtime: update funcdata comments The comments in cmd/internal/obj/funcdata.go are identical to the comments in runtime/funcdata.h, but the majority of the definitions they refer to don't apply to Go sources and have been stripped out of funcdata.go. Remove these stale comments from funcdata.go and clean up the references to other copies of the PCDATA and FUNCDATA indexes. Change-Id: I5d6e49a6e586cc9aecd7c3ce1567679f2a605884 Reviewed-on: https://go-review.googlesource.com/37330 Reviewed-by: Keith Randall <khr@golang.org>	2017-02-27 22:29:28 +00:00
Dmitry Vyukov	ba6e5776fd	runtime: remove unused RaceSemacquire declaration These functions are not defined and are not used. Fixes #19290 Change-Id: I2978147220af83cf319f7439f076c131870fb9ee Reviewed-on: https://go-review.googlesource.com/37448 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-27 20:15:51 +00:00
Josh Bleecher Snyder	c7894924c7	runtime/pprof: handle empty stack traces in Profile.Add If the caller passes a large number to Profile.Add, the list of pcs is empty, which results in junk (a nil pc) being recorded. Check for that explicitly, and replace such stack traces with a lostProfileEvent. Fixes #18836. Change-Id: I99c96aa67dd5525cd239ea96452e6e8fcb25ce02 Reviewed-on: https://go-review.googlesource.com/36891 Reviewed-by: Russ Cox <rsc@golang.org>	2017-02-27 17:11:07 +00:00
Russ Cox	8c24e52247	runtime: check that pprof accepts but doesn't need executable The profiles are self-contained now. Check that they work by themselves in the tests that invoke pprof, but also keep checking that the old command lines work. Change-Id: I24c74b5456f0b50473883c3640625c6612f72309 Reviewed-on: https://go-review.googlesource.com/37166 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>	2017-02-24 20:46:37 +00:00
Russ Cox	0b8c983ece	runtime/pprof/internal/profile: move internal/pprof/profile here Nothing needs internal/pprof anymore except the runtime/pprof tests. Move the package here to prevent new dependencies. Change-Id: Ia119af91cc2b980e0fa03a15f46f69d7f71d2926 Reviewed-on: https://go-review.googlesource.com/37165 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>	2017-02-24 20:45:21 +00:00
Russ Cox	cbab65fdfa	runtime/pprof: add streaming protobuf encoder The existing code builds a full profile in memory. Then it translates that profile into a data structure (in memory). Then it marshals that data structure into a protocol buffer (in memory). Then it gzips that marshaled form into the underlying writer. So there are three copies of the full profile data in memory at the same time before we're done. This is obviously dumb. This CL implements a fully streaming conversion from the original in-memory profile to the underlying writer. There is now only one copy of the profile in memory. For the non-CPU profiles, this is optimal, since we have to have a full copy in memory to start with. For the CPU profiles, we could still try to bound the profile size stored in memory and stream fragments out during the actual profiling, as Go 1.7 did (with a simpler format), but so far that hasn't been necessary. Change-Id: Ic36141021857791bf0cd1fce84178fb5e744b989 Reviewed-on: https://go-review.googlesource.com/37164 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>	2017-02-24 20:15:56 +00:00
Russ Cox	1564817d8c	runtime/pprof: use more efficient hash table for staging profile The old hash table was a place holder that allocates memory during every lookup for key generation, even for keys that hit in the the table. Change-Id: I4f601bbfd349f0be76d6259a8989c9c17ccfac21 Reviewed-on: https://go-review.googlesource.com/37163 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>	2017-02-24 17:05:37 +00:00
Russ Cox	1a680a902a	runtime/pprof: use new profile buffers for CPU profiling This doesn't change the functionality of the current code, but it sets us up for exporting the profiling labels into the profile. The old code had a hash table of profile samples maintained during the signal handler, with evictions going into a log. The new code just logs every sample directly, leaving the hash-based deduplication to an ordinary goroutine. The new code also avoids storing the entire profile in two forms in memory, an unfortunate regression introduced when binary profile support was added. After this CL the entire profile is only stored once in memory. We'd still like to get back down to storing it zero times (streaming it to the underlying io.Writer). Change-Id: I0893a1788267c564aa1af17970d47377b2a43457 Reviewed-on: https://go-review.googlesource.com/36712 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>	2017-02-24 17:01:47 +00:00
Russ Cox	a1261b8b0a	runtime: do not allocate on every time.Sleep It's common for some goroutines to loop calling time.Sleep. Allocate once per goroutine, not every time. This comes up in runtime/pprof's background reader. Change-Id: I89d17dc7379dca266d2c9cd3aefc2382f5bdbade Reviewed-on: https://go-review.googlesource.com/37162 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-02-24 15:34:01 +00:00
Russ Cox	b788fd80e6	runtime: new profile buffer implementation supporting label pointers The existing CPU profiling buffer is a slice of uintptr, but we want to start including profiling label data in the profiles, and those labels need to be pointers in order to let them describe rich information. This CL implements a new profBuf type that holds both a slice of uint64 for data and a slice of unsafe.Pointer for profiling labels (aka tags). Making the runtime use these buffers will happen in followup CLs. Change-Id: I9ff16b532d8edaf4ce0cbba1098229a561834efc Reviewed-on: https://go-review.googlesource.com/36713 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-02-23 19:47:23 +00:00
Lynn Boger	ea48c9d232	runtime: more detail for crash_test.go This updates the testcase to display the timestamps for the runtime.a, it dependent packages atomic.a and sys.a, and source files. Change-Id: Id2901b4e8aa8eb9775c4f404ac01cc07b394ba91 Reviewed-on: https://go-review.googlesource.com/37332 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-02-22 16:34:14 +00:00
Josh Bleecher Snyder	4208fcdcd4	runtime: use standard linux/mipsx clone variable names Change-Id: I62118e197190af1d11a89921d5769101ce6d2257 Reviewed-on: https://go-review.googlesource.com/37306 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-02-21 18:42:38 +00:00
Josh Bleecher Snyder	b6e0d4647f	runtime: update assembly var names after monotonic time changes Change-Id: I721045120a4df41462c02252e2e5e8529ae2d694 Reviewed-on: https://go-review.googlesource.com/37303 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-02-21 18:42:05 +00:00
Brad Fitzpatrick	a37f9d8a17	runtime/pprof: mark TestMutexProfile as flaky for now Flaky tests hurt productivity. Disable for now. Updates #19139 Change-Id: I2e3040bdf0e53597a1c4f925b788e3268ea284c1 Reviewed-on: https://go-review.googlesource.com/37291 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Peter Weinberger <pjw@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-20 20:17:16 +00:00
Dmitry Vyukov	0556e26273	sync: make Mutex more fair Add new starvation mode for Mutex. In starvation mode ownership is directly handed off from unlocking goroutine to the next waiter. New arriving goroutines don't compete for ownership. Unfair wait time is now limited to 1ms. Also fix a long standing bug that goroutines were requeued at the tail of the wait queue. That lead to even more unfair acquisition times with multiple waiters. Performance of normal mode is not considerably affected. Fixes #13086 On the provided in the issue lockskew program: done in 1.207853ms done in 1.177451ms done in 1.184168ms done in 1.198633ms done in 1.185797ms done in 1.182502ms done in 1.316485ms done in 1.211611ms done in 1.182418ms name old time/op new time/op delta MutexUncontended-48 0.65ns ± 0% 0.65ns ± 1% ~ (p=0.087 n=10+10) Mutex-48 112ns ± 1% 114ns ± 1% +1.69% (p=0.000 n=10+10) MutexSlack-48 113ns ± 0% 87ns ± 1% -22.65% (p=0.000 n=8+10) MutexWork-48 149ns ± 0% 145ns ± 0% -2.48% (p=0.000 n=9+10) MutexWorkSlack-48 149ns ± 0% 122ns ± 3% -18.26% (p=0.000 n=6+10) MutexNoSpin-48 103ns ± 4% 105ns ± 3% ~ (p=0.089 n=10+10) MutexSpin-48 490ns ± 4% 515ns ± 6% +5.08% (p=0.006 n=10+10) Cond32-48 13.4µs ± 6% 13.1µs ± 5% -2.75% (p=0.023 n=10+10) RWMutexWrite100-48 53.2ns ± 3% 41.2ns ± 3% -22.57% (p=0.000 n=10+10) RWMutexWrite10-48 45.9ns ± 2% 43.9ns ± 2% -4.38% (p=0.000 n=10+10) RWMutexWorkWrite100-48 122ns ± 2% 134ns ± 1% +9.92% (p=0.000 n=10+10) RWMutexWorkWrite10-48 206ns ± 1% 188ns ± 1% -8.52% (p=0.000 n=8+10) Cond32-24 12.1µs ± 3% 12.4µs ± 3% +1.98% (p=0.043 n=10+9) MutexUncontended-24 0.74ns ± 1% 0.75ns ± 1% ~ (p=0.650 n=10+10) Mutex-24 122ns ± 2% 124ns ± 1% +1.31% (p=0.007 n=10+10) MutexSlack-24 96.9ns ± 2% 102.8ns ± 2% +6.11% (p=0.000 n=10+10) MutexWork-24 146ns ± 1% 135ns ± 2% -7.70% (p=0.000 n=10+9) MutexWorkSlack-24 135ns ± 1% 128ns ± 2% -5.01% (p=0.000 n=10+9) MutexNoSpin-24 114ns ± 3% 110ns ± 4% -3.84% (p=0.000 n=10+10) MutexSpin-24 482ns ± 4% 475ns ± 8% ~ (p=0.286 n=10+10) RWMutexWrite100-24 43.0ns ± 3% 43.1ns ± 2% ~ (p=0.956 n=10+10) RWMutexWrite10-24 43.4ns ± 1% 43.2ns ± 1% ~ (p=0.085 n=10+9) RWMutexWorkWrite100-24 130ns ± 3% 131ns ± 3% ~ (p=0.747 n=10+10) RWMutexWorkWrite10-24 191ns ± 1% 192ns ± 1% ~ (p=0.210 n=10+10) Cond32-12 11.5µs ± 2% 11.7µs ± 2% +1.98% (p=0.002 n=10+10) MutexUncontended-12 1.48ns ± 0% 1.50ns ± 1% +1.08% (p=0.004 n=10+10) Mutex-12 141ns ± 1% 143ns ± 1% +1.63% (p=0.000 n=10+10) MutexSlack-12 121ns ± 0% 119ns ± 0% -1.65% (p=0.001 n=8+9) MutexWork-12 141ns ± 2% 150ns ± 3% +6.36% (p=0.000 n=9+10) MutexWorkSlack-12 131ns ± 0% 138ns ± 0% +5.73% (p=0.000 n=9+10) MutexNoSpin-12 87.0ns ± 1% 83.7ns ± 1% -3.80% (p=0.000 n=10+10) MutexSpin-12 364ns ± 1% 377ns ± 1% +3.77% (p=0.000 n=10+10) RWMutexWrite100-12 42.8ns ± 1% 43.9ns ± 1% +2.41% (p=0.000 n=8+10) RWMutexWrite10-12 39.8ns ± 4% 39.3ns ± 1% ~ (p=0.433 n=10+9) RWMutexWorkWrite100-12 131ns ± 1% 131ns ± 0% ~ (p=0.591 n=10+9) RWMutexWorkWrite10-12 173ns ± 1% 174ns ± 0% ~ (p=0.059 n=10+8) Cond32-6 10.9µs ± 2% 10.9µs ± 2% ~ (p=0.739 n=10+10) MutexUncontended-6 2.97ns ± 0% 2.97ns ± 0% ~ (all samples are equal) Mutex-6 122ns ± 6% 122ns ± 2% ~ (p=0.668 n=10+10) MutexSlack-6 149ns ± 3% 142ns ± 3% -4.63% (p=0.000 n=10+10) MutexWork-6 136ns ± 3% 140ns ± 5% ~ (p=0.077 n=10+10) MutexWorkSlack-6 152ns ± 0% 138ns ± 2% -9.21% (p=0.000 n=6+10) MutexNoSpin-6 150ns ± 1% 152ns ± 0% +1.50% (p=0.000 n=8+10) MutexSpin-6 726ns ± 0% 730ns ± 1% ~ (p=0.069 n=10+10) RWMutexWrite100-6 40.6ns ± 1% 40.9ns ± 1% +0.91% (p=0.001 n=8+10) RWMutexWrite10-6 37.1ns ± 0% 37.0ns ± 1% ~ (p=0.386 n=9+10) RWMutexWorkWrite100-6 133ns ± 1% 134ns ± 1% +1.01% (p=0.005 n=9+10) RWMutexWorkWrite10-6 152ns ± 0% 152ns ± 0% ~ (all samples are equal) Cond32-2 7.86µs ± 2% 7.95µs ± 2% +1.10% (p=0.023 n=10+10) MutexUncontended-2 8.10ns ± 0% 9.11ns ± 4% +12.44% (p=0.000 n=9+10) Mutex-2 32.9ns ± 9% 38.4ns ± 6% +16.58% (p=0.000 n=10+10) MutexSlack-2 93.4ns ± 1% 98.5ns ± 2% +5.39% (p=0.000 n=10+9) MutexWork-2 40.8ns ± 3% 43.8ns ± 7% +7.38% (p=0.000 n=10+9) MutexWorkSlack-2 98.6ns ± 5% 108.2ns ± 2% +9.80% (p=0.000 n=10+8) MutexNoSpin-2 399ns ± 1% 398ns ± 2% ~ (p=0.463 n=8+9) MutexSpin-2 1.99µs ± 3% 1.97µs ± 1% -0.81% (p=0.003 n=9+8) RWMutexWrite100-2 37.6ns ± 5% 46.0ns ± 4% +22.17% (p=0.000 n=10+8) RWMutexWrite10-2 50.1ns ± 6% 36.8ns ±12% -26.46% (p=0.000 n=9+10) RWMutexWorkWrite100-2 136ns ± 0% 134ns ± 2% -1.80% (p=0.001 n=7+9) RWMutexWorkWrite10-2 140ns ± 1% 138ns ± 1% -1.50% (p=0.000 n=10+10) Cond32 5.93µs ± 1% 5.91µs ± 0% ~ (p=0.411 n=9+10) MutexUncontended 15.9ns ± 0% 15.8ns ± 0% -0.63% (p=0.000 n=8+8) Mutex 15.9ns ± 0% 15.8ns ± 0% -0.44% (p=0.003 n=10+10) MutexSlack 26.9ns ± 3% 26.7ns ± 2% ~ (p=0.084 n=10+10) MutexWork 47.8ns ± 0% 47.9ns ± 0% +0.21% (p=0.014 n=9+8) MutexWorkSlack 54.9ns ± 3% 54.5ns ± 3% ~ (p=0.254 n=10+10) MutexNoSpin 786ns ± 2% 765ns ± 1% -2.66% (p=0.000 n=10+10) MutexSpin 3.87µs ± 1% 3.83µs ± 0% -0.85% (p=0.005 n=9+8) RWMutexWrite100 21.2ns ± 2% 21.0ns ± 1% -0.88% (p=0.018 n=10+9) RWMutexWrite10 22.6ns ± 1% 22.6ns ± 0% ~ (p=0.471 n=9+9) RWMutexWorkWrite100 132ns ± 0% 132ns ± 0% ~ (all samples are equal) RWMutexWorkWrite10 124ns ± 0% 123ns ± 0% ~ (p=0.656 n=10+10) Change-Id: I66412a3a0980df1233ad7a5a0cd9723b4274528b Reviewed-on: https://go-review.googlesource.com/34310 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-02-17 17:24:59 +00:00
Russ Cox	990124da2a	runtime: use balanced tree for addr lookup in semaphore implementation CL 36792 fixed #17953, a linear scan caused by n goroutines piling into two different locks that hashed to the same bucket in the semaphore table. In that CL, n goroutines contending for 2 unfortunately chosen locks went from O(n²) to O(n). This CL fixes a different linear scan, when n goroutines are contending for n/2 different locks that all hash to the same bucket in the semaphore table. In this CL, n goroutines contending for n/2 unfortunately chosen locks goes from O(n²) to O(n log n). This case is much less likely, but any linear scan eventually hurts, so we might as well fix it while the problem is fresh in our minds. The new test in this CL checks for both linear scans. The effect of this CL on the sync benchmarks is negligible (but it fixes the new test). name old time/op new time/op delta Cond1-48 576ns ±10% 575ns ±13% ~ (p=0.679 n=71+71) Cond2-48 1.59µs ± 8% 1.61µs ± 9% ~ (p=0.107 n=73+69) Cond4-48 4.56µs ± 7% 4.55µs ± 7% ~ (p=0.670 n=74+72) Cond8-48 9.87µs ± 9% 9.90µs ± 7% ~ (p=0.507 n=69+73) Cond16-48 20.4µs ± 7% 20.4µs ±10% ~ (p=0.588 n=69+71) Cond32-48 45.4µs ±10% 45.4µs ±14% ~ (p=0.944 n=73+73) UncontendedSemaphore-48 19.7ns ±12% 19.7ns ± 8% ~ (p=0.589 n=65+63) ContendedSemaphore-48 55.4ns ±26% 54.9ns ±32% ~ (p=0.441 n=75+75) MutexUncontended-48 0.63ns ± 0% 0.63ns ± 0% ~ (all equal) Mutex-48 210ns ± 6% 213ns ±10% +1.30% (p=0.035 n=70+74) MutexSlack-48 210ns ± 7% 211ns ± 9% ~ (p=0.184 n=71+72) MutexWork-48 299ns ± 5% 300ns ± 5% ~ (p=0.678 n=73+75) MutexWorkSlack-48 302ns ± 6% 300ns ± 5% ~ (p=0.149 n=74+72) MutexNoSpin-48 135ns ± 6% 135ns ±10% ~ (p=0.788 n=67+75) MutexSpin-48 693ns ± 5% 689ns ± 6% ~ (p=0.092 n=65+74) Once-48 0.22ns ±25% 0.22ns ±24% ~ (p=0.882 n=74+73) Pool-48 5.88ns ±36% 5.79ns ±24% ~ (p=0.655 n=69+69) PoolOverflow-48 4.79µs ±18% 4.87µs ±20% ~ (p=0.233 n=75+75) SemaUncontended-48 0.80ns ± 1% 0.82ns ± 8% +2.46% (p=0.000 n=60+74) SemaSyntNonblock-48 103ns ± 4% 102ns ± 5% -1.11% (p=0.003 n=75+75) SemaSyntBlock-48 104ns ± 4% 104ns ± 5% ~ (p=0.231 n=71+75) SemaWorkNonblock-48 128ns ± 4% 129ns ± 6% +1.51% (p=0.000 n=63+75) SemaWorkBlock-48 129ns ± 8% 130ns ± 7% ~ (p=0.072 n=75+74) RWMutexUncontended-48 2.35ns ± 1% 2.35ns ± 0% ~ (p=0.144 n=70+55) RWMutexWrite100-48 139ns ±18% 141ns ±21% ~ (p=0.071 n=75+73) RWMutexWrite10-48 145ns ± 9% 145ns ± 8% ~ (p=0.553 n=75+75) RWMutexWorkWrite100-48 297ns ±13% 297ns ±15% ~ (p=0.519 n=75+74) RWMutexWorkWrite10-48 588ns ± 7% 585ns ± 5% ~ (p=0.173 n=73+70) WaitGroupUncontended-48 0.87ns ± 0% 0.87ns ± 0% ~ (all equal) WaitGroupAddDone-48 63.2ns ± 4% 62.7ns ± 4% -0.82% (p=0.027 n=72+75) WaitGroupAddDoneWork-48 109ns ± 5% 109ns ± 4% ~ (p=0.233 n=75+75) WaitGroupWait-48 0.17ns ± 0% 0.16ns ±16% -8.55% (p=0.000 n=56+75) WaitGroupWaitWork-48 1.78ns ± 1% 2.08ns ± 5% +16.92% (p=0.000 n=74+70) WaitGroupActuallyWait-48 52.0ns ± 3% 50.6ns ± 5% -2.70% (p=0.000 n=71+69) https://perf.golang.org/search?q=upload:20170215.1 Change-Id: Ia29a8bd006c089e401ec4297c3038cca656bcd0a Reviewed-on: https://go-review.googlesource.com/37103 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-16 17:52:15 +00:00
Russ Cox	58d762176a	runtime: run mutexevent profiling without holding semaRoot lock Suggested by Dmitry in CL 36792 review. Clearly safe since there are many different semaRoots that could all have profiled sudogs calling mutexevent. Change-Id: I45eed47a5be3e513b2dad63b60afcd94800e16d1 Reviewed-on: https://go-review.googlesource.com/37104 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2017-02-16 17:16:41 +00:00
Russ Cox	1f77db94f8	runtime: do not call wakep from enlistWorker, to avoid possible deadlock We have seen one instance of a production job suddenly spinning to 100% CPU and becoming unresponsive. In that one instance, a SIGQUIT was sent after 328 minutes of spinning, and the stacks showed a single goroutine in "IO wait (scan)" state. Looking for things that might get stuck if a goroutine got stuck in scanning a stack, we found that injectglist does: lock(&sched.lock) var n int for n = 0; glist != nil; n++ { gp := glist glist = gp.schedlink.ptr() casgstatus(gp, _Gwaiting, _Grunnable) globrunqput(gp) } unlock(&sched.lock) and that casgstatus spins on gp.atomicstatus until the _Gscan bit goes away. Essentially, this code locks sched.lock and then while holding sched.lock, waits to lock gp.atomicstatus. The code that is doing the scan is: if castogscanstatus(gp, s, s\|_Gscan) { if !gp.gcscandone { scanstack(gp, gcw) gp.gcscandone = true } restartg(gp) break loop } More analysis showed that scanstack can, in a rare case, end up calling back into code that acquires sched.lock. For example: runtime.scanstack at proc.go:866 calls runtime.gentraceback at mgcmark.go:842 calls runtime.scanstack$1 at traceback.go:378 calls runtime.scanframeworker at mgcmark.go:819 calls runtime.scanblock at mgcmark.go:904 calls runtime.greyobject at mgcmark.go:1221 calls (runtime.gcWork).put at mgcmark.go:1412 calls (runtime.gcControllerState).enlistWorker at mgcwork.go:127 calls runtime.wakep at mgc.go:632 calls runtime.startm at proc.go:1779 acquires runtime.sched.lock at proc.go:1675 This path was found with an automated deadlock-detecting tool. There are many such paths but they all go through enlistWorker -> wakep. The evidence strongly suggests that one of these paths is what caused the deadlock we observed. We're running those jobs with GOTRACEBACK=crash now to try to get more information if it happens again. Further refinement and analysis shows that if we drop the wakep call from enlistWorker, the remaining few deadlock cycles found by the tool are all false positives caused by not understanding the effect of calls to func variables. The enlistWorker -> wakep call was intended only as a performance optimization, it rarely executes, and if it does execute at just the wrong time it can (and plausibly did) cause the deadlock we saw. Comment it out, to avoid the potential deadlock. Fixes #19112. Unfixes #14179. Change-Id: I6f7e10b890b991c11e79fab7aeefaf70b5d5a07b Reviewed-on: https://go-review.googlesource.com/37093 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-02-15 21:22:36 +00:00
Hana Kim	8833af3f4b	runtime/pprof: print newly added fields of runtime.MemStats in heap profile with debug mode Change-Id: I3a80d03a4aa556614626067a8fd698b3b00f4290 Reviewed-on: https://go-review.googlesource.com/36962 Reviewed-by: Austin Clements <austin@google.com>	2017-02-15 21:14:37 +00:00
Ian Lance Taylor	c05b06a12d	os: use poller for file I/O This changes the os package to use the runtime poller for file I/O where possible. When a system call blocks on a pollable descriptor, the goroutine will be blocked on the poller but the thread will be released to run other goroutines. When using a non-pollable descriptor, the os package will continue to use thread-blocking system calls as before. For example, on GNU/Linux, the runtime poller uses epoll. epoll does not support ordinary disk files, so they will continue to use blocking I/O as before. The poller will be used for pipes. Since this means that the poller is used for many more programs, this modifies the runtime to only block waiting for the poller if there is some goroutine that is waiting on the poller. Otherwise, there is no point, as the poller will never make any goroutine ready. This preserves the runtime's current simple deadlock detection. This seems to crash FreeBSD systems, so it is disabled on FreeBSD. This is issue 19093. Using the poller on Windows requires opening the file with FILE_FLAG_OVERLAPPED. We should only do that if we can remove that flag if the program calls the Fd method. This is issue 19098. Update #6817. Update #7903. Update #15021. Update #18507. Update #19093. Update #19098. Change-Id: Ia5197dcefa7c6fbcca97d19a6f8621b2abcbb1fe Reviewed-on: https://go-review.googlesource.com/36800 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-02-15 19:31:55 +00:00
Austin Clements	0993b2fd06	runtime: remove g.stackAlloc Since we're no longer stealing space for the stack barrier array from the stack allocation, the stack allocation is simply g.stack.hi-g.stack.lo. Updates #17503. Change-Id: Id9b450ae12c3df9ec59cfc4365481a0a16b7c601 Reviewed-on: https://go-review.googlesource.com/36621 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-02-14 15:52:56 +00:00
Austin Clements	d089a6c718	runtime: remove stack barriers Now that we don't rescan stacks, stack barriers are unnecessary. This removes all of the code and structures supporting them as well as tests that were specifically for stack barriers. Updates #17503. Change-Id: Ia29221730e0f2bbe7beab4fa757f31a032d9690c Reviewed-on: https://go-review.googlesource.com/36620 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-14 15:52:54 +00:00
Austin Clements	c5ebcd2c8a	runtime: remove rescan list With the hybrid barrier, rescanning stacks is no longer necessary so the rescan list is no longer necessary. Remove it. This leaves the gcrescanstacks GODEBUG variable, since it's useful for debugging, but changes it to simply walk all of the Gs to rescan stacks rather than using the rescan list. We could also remove g.gcscanvalid, which is effectively a distributed rescan list. However, it's still useful for gcrescanstacks mode and it adds little complexity, so we'll leave it in. Fixes #17099. Updates #17503. Change-Id: I776d43f0729567335ef1bfd145b75c74de2cc7a9 Reviewed-on: https://go-review.googlesource.com/36619 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-02-14 15:52:51 +00:00
Austin Clements	7aeb915d6b	runtime: remove unused debug.wbshadow The wbshadow implementation was removed a year and a half ago in `1635ab7dfe`, but the GODEBUG setting remained. Remove the GODEBUG setting since it doesn't do anything. Change-Id: I19cde324a79472aff60acb5cc9f7d4aa86c0c0ed Reviewed-on: https://go-review.googlesource.com/36618 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-02-14 15:52:49 +00:00
Josh Bleecher Snyder	ef30a1c8aa	runtime: fix some assembly offset names For vet. There are more. This is a start. Change-Id: Ibbbb2b20b5db60ee3fac4a1b5913d18fab01f6b9 Reviewed-on: https://go-review.googlesource.com/36939 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-14 02:09:48 +00:00
Josh Bleecher Snyder	cc2a52adef	all: use keyed composite literals Makes vet happy. Change-Id: I7250f283c96e82b9796c5672a0a143ba7568fa63 Reviewed-on: https://go-review.googlesource.com/36937 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-14 02:09:14 +00:00
Josh Bleecher Snyder	46a75870ad	runtime: speed up fastrand() % n This occurs a fair amount in the runtime for non-power-of-two n. Use an alternative, faster formulation. name old time/op new time/op delta Fastrandn/2-8 4.45ns ± 2% 2.09ns ± 3% -53.12% (p=0.000 n=14+14) Fastrandn/3-8 4.78ns ±11% 2.06ns ± 2% -56.94% (p=0.000 n=15+15) Fastrandn/4-8 4.76ns ± 9% 1.99ns ± 3% -58.28% (p=0.000 n=15+13) Fastrandn/5-8 4.96ns ±13% 2.03ns ± 6% -59.14% (p=0.000 n=15+15) name old time/op new time/op delta SelectUncontended-8 33.7ns ± 2% 33.9ns ± 2% +0.70% (p=0.000 n=49+50) SelectSyncContended-8 1.68µs ± 4% 1.65µs ± 4% -1.54% (p=0.000 n=50+45) SelectAsyncContended-8 282ns ± 1% 277ns ± 1% -1.50% (p=0.000 n=48+43) SelectNonblock-8 5.31ns ± 1% 5.32ns ± 1% ~ (p=0.275 n=45+44) SelectProdCons-8 585ns ± 3% 577ns ± 2% -1.35% (p=0.000 n=50+50) GoroutineSelect-8 1.59ms ± 2% 1.59ms ± 1% ~ (p=0.084 n=49+48) Updates #16213 Change-Id: Ib555a4d7da2042a25c3976f76a436b536487d5b7 Reviewed-on: https://go-review.googlesource.com/36932 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-14 00:01:22 +00:00
Ian Lance Taylor	62237c2c8e	runtime: if runtime is stale while testing, show StaleReason Update #19062. Change-Id: I7397b573389145b56e73d2150ce0fc9aa75b3caa Reviewed-on: https://go-review.googlesource.com/36934 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-13 23:46:12 +00:00
Sokolov Yura	663226d8e1	runtime: make fastrand to generate 32bit values Extend period of fastrand from (1<<31)-1 to (1<<32)-1 by choosing other polynom and reacting on high bit before shift. Polynomial is taken at https://users.ece.cmu.edu/~koopman/lfsr/index.html from 32.dat.gz . It is referred as F7711115 cause this list of polynomials is for LFSR with shift to right (and fastrand uses shift to left). (old polynomial is referred in 31.dat.gz as 7BB88888). There were couple of places with conversation of fastrand to int, which leads to negative values on 32bit platforms. They are fixed. Change-Id: Ibee518a3f9103e0aea220ada494b3aec77babb72 Reviewed-on: https://go-review.googlesource.com/36875 Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Minux Ma <minux@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-13 20:22:02 +00:00
Ian Lance Taylor	40c27ed5bc	runtime: if runtime is stale while testing, show cmd/go output Update #19062. Change-Id: If6a4c4f8d12e148b162256f13a8ee423f6e30637 Reviewed-on: https://go-review.googlesource.com/36918 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-13 20:00:05 +00:00
Ian Lance Taylor	3792db5183	net: refactor poller into new internal/poll package This will make it possible to use the poller with the os package. This is a lot of code movement but the behavior is intended to be unchanged. Update #6817. Update #7903. Update #15021. Update #18507. Change-Id: I1413685928017c32df5654ded73a2643820977ae Reviewed-on: https://go-review.googlesource.com/36799 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-02-13 18:36:28 +00:00
Keith Randall	5a75d6a08e	cmd/compile: optimize non-empty-interface type conversions When doing i.(T) for non-empty-interface i and concrete type T, there's no need to read the type out of the itab. Just compare the itab to the itab we expect for that interface/type pair. Also optimize type switches by putting the type hash of the concrete type in the itab. That way we don't need to load the type pointer out of the itab. Update #18492 Change-Id: I49e280a21e5687e771db5b8a56b685291ac168ce Reviewed-on: https://go-review.googlesource.com/34810 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: David Chase <drchase@google.com>	2017-02-13 18:16:31 +00:00
Josh Bleecher Snyder	8da91a6297	runtime: add Frames example Based on sample code from iant. Fixes #18788. Change-Id: I6bb33ed05af2538fbde42ddcac629280ef7c00a6 Reviewed-on: https://go-review.googlesource.com/36892 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-02-13 06:10:35 +00:00

1 2 3 4 5 ...

2502 Commits