qbit/go - go - Tape:neT

mirror of https://github.com/golang/go synced 2024-11-14 21:30:21 -07:00

Author	SHA1	Message	Date
Bryan C. Mills	ce5263ff8d	net/rpc: use a sync.Map for serviceMap instead of RWMutex

This has no measurable impact on performance, but somewhat simplifies the code.

updates #18177

name                  old time/op     new time/op     delta
EndToEnd              54.3µs ±10%     55.7µs ±12%     ~         (p=0.505 n=8+8)
EndToEnd-6            31.4µs ± 9%     32.7µs ± 6%     ~         (p=0.130 n=8+8)
EndToEnd-48           25.5µs ±12%     26.4µs ± 6%     ~         (p=0.195 n=8+8)
EndToEndHTTP          53.7µs ± 8%     51.2µs ±15%     ~         (p=0.463 n=7+8)
EndToEndHTTP-6        30.9µs ±18%     31.2µs ±14%     ~         (p=0.959 n=8+8)
EndToEndHTTP-48       24.9µs ±11%     25.7µs ± 6%     ~         (p=0.382 n=8+8)
EndToEndAsync         23.6µs ± 7%     24.2µs ± 6%     ~         (p=0.383 n=7+7)
EndToEndAsync-6       21.0µs ±23%     22.0µs ±20%     ~         (p=0.574 n=8+8)
EndToEndAsync-48      22.8µs ±16%     23.3µs ±13%     ~         (p=0.721 n=8+8)
EndToEndAsyncHTTP     25.8µs ± 7%     24.7µs ±14%     ~         (p=0.161 n=8+8)
EndToEndAsyncHTTP-6   22.1µs ±19%     22.6µs ±12%     ~         (p=0.645 n=8+8)
EndToEndAsyncHTTP-48  22.9µs ±13%     22.1µs ±20%     ~         (p=0.574 n=8+8)

name                  old alloc/op    new alloc/op    delta
EndToEnd              320B ± 0%       321B ± 0%       ~         (p=1.000 n=8+8)
EndToEnd-6            320B ± 0%       321B ± 0%       +0.20%    (p=0.037 n=8+7)
EndToEnd-48           326B ± 0%       326B ± 0%       ~         (p=0.124 n=8+8)
EndToEndHTTP          320B ± 0%       320B ± 0%       ~         (all equal)
EndToEndHTTP-6        320B ± 0%       321B ± 0%       ~         (p=0.077 n=8+8)
EndToEndHTTP-48       324B ± 0%       324B ± 0%       ~         (p=1.000 n=8+8)
EndToEndAsync         227B ± 0%       227B ± 0%       ~         (p=0.154 n=8+7)
EndToEndAsync-6       226B ± 0%       226B ± 0%       ~         (all equal)
EndToEndAsync-48      230B ± 1%       229B ± 1%       ~         (p=0.072 n=8+8)
EndToEndAsyncHTTP     227B ± 0%       227B ± 0%       ~         (all equal)
EndToEndAsyncHTTP-6   226B ± 0%       226B ± 0%       ~         (p=0.400 n=8+7)
EndToEndAsyncHTTP-48  228B ± 0%       228B ± 0%       ~         (p=0.949 n=8+6)

name                  old allocs/op   new allocs/op   delta
EndToEnd              9.00 ± 0%       9.00 ± 0%       ~         (all equal)
EndToEnd-6            9.00 ± 0%       9.00 ± 0%       ~         (all equal)
EndToEnd-48           9.00 ± 0%       9.00 ± 0%       ~         (all equal)
EndToEndHTTP          9.00 ± 0%       9.00 ± 0%       ~         (all equal)
EndToEndHTTP-6        9.00 ± 0%       9.00 ± 0%       ~         (all equal)
EndToEndHTTP-48       9.00 ± 0%       9.00 ± 0%       ~         (all equal)
EndToEndAsync         8.00 ± 0%       8.00 ± 0%       ~         (all equal)
EndToEndAsync-6       8.00 ± 0%       8.00 ± 0%       ~         (all equal)
EndToEndAsync-48      8.00 ± 0%       8.00 ± 0%       ~         (all equal)
EndToEndAsyncHTTP     8.00 ± 0%       8.00 ± 0%       ~         (all equal)
EndToEndAsyncHTTP-6   8.00 ± 0%       8.00 ± 0%       ~         (all equal)
EndToEndAsyncHTTP-48  8.00 ± 0%       8.00 ± 0%       ~         (all equal)

https://perf.golang.org/search?q=upload:20170428.2

Change-Id: I8ef7f71a7602302aa78c144327270dfce9211539
Reviewed-on: https://go-review.googlesource.com/42112
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-28 20:42:11 +00:00
Bryan C. Mills	d6ce7e4fec	encoding/json: replace encoderCache RWMutex with a sync.Map

This provides a moderate speedup for encoding when using many CPU cores.

name                    old time/op     new time/op     delta
CodeEncoder             14.1ms ±10%     13.5ms ± 4%     ~         (p=0.867 n=8+7)
CodeEncoder-6           2.58ms ± 8%     2.72ms ± 6%     ~         (p=0.065 n=8+8)
CodeEncoder-48          629µs ± 1%      629µs ± 1%      ~         (p=0.867 n=8+7)
CodeMarshal             14.9ms ± 5%     14.9ms ± 5%     ~         (p=0.721 n=8+8)
CodeMarshal-6           3.28ms ±11%     3.24ms ±12%     ~         (p=0.798 n=8+8)
CodeMarshal-48          739µs ± 1%      745µs ± 2%      ~         (p=0.328 n=8+8)
CodeDecoder             49.7ms ± 4%     49.2ms ± 4%     ~         (p=0.463 n=7+8)
CodeDecoder-6           10.1ms ± 8%     10.4ms ± 3%     ~         (p=0.232 n=7+8)
CodeDecoder-48          2.60ms ± 3%     2.61ms ± 2%     ~         (p=1.000 n=8+8)
DecoderStream           352ns ± 5%      344ns ± 4%      ~         (p=0.077 n=8+8)
DecoderStream-6         485ns ± 8%      503ns ± 6%      ~         (p=0.123 n=8+8)
DecoderStream-48        522ns ± 7%      520ns ± 5%      ~         (p=0.959 n=8+8)
CodeUnmarshal           52.2ms ± 5%     54.4ms ±18%     ~         (p=0.955 n=7+8)
CodeUnmarshal-6         12.4ms ± 6%     12.3ms ± 6%     ~         (p=0.878 n=8+8)
CodeUnmarshal-48        3.46ms ± 7%     3.40ms ± 9%     ~         (p=0.442 n=8+8)
CodeUnmarshalReuse      48.9ms ± 6%     50.3ms ± 7%     ~         (p=0.279 n=8+8)
CodeUnmarshalReuse-6    10.3ms ±11%     10.3ms ±10%     ~         (p=0.959 n=8+8)
CodeUnmarshalReuse-48   2.68ms ± 3%     2.67ms ± 4%     ~         (p=0.878 n=8+8)
UnmarshalString         476ns ± 7%      474ns ± 7%      ~         (p=0.644 n=8+8)
UnmarshalString-6       164ns ± 9%      160ns ±10%      ~         (p=0.556 n=8+8)
UnmarshalString-48      181ns ± 0%      177ns ± 2%      -2.36%    (p=0.001 n=7+7)
UnmarshalFloat64        414ns ± 4%      418ns ± 4%      ~         (p=0.382 n=8+8)
UnmarshalFloat64-6      147ns ± 9%      143ns ±16%      ~         (p=0.457 n=8+8)
UnmarshalFloat64-48     176ns ± 2%      174ns ± 2%      ~         (p=0.118 n=8+8)
UnmarshalInt64          369ns ± 4%      354ns ± 1%      -3.85%    (p=0.005 n=8+7)
UnmarshalInt64-6        132ns ±11%      132ns ±10%      ~         (p=0.982 n=8+8)
UnmarshalInt64-48       177ns ± 3%      174ns ± 2%      -1.84%    (p=0.028 n=8+7)
Issue10335              540ns ± 5%      535ns ± 0%      ~         (p=0.330 n=7+7)
Issue10335-6            159ns ± 8%      164ns ± 8%      ~         (p=0.246 n=8+8)
Issue10335-48           186ns ± 1%      182ns ± 2%      -1.89%    (p=0.010 n=8+8)
Unmapped                1.74µs ± 2%     1.76µs ± 6%     ~         (p=0.181 n=6+8)
Unmapped-6              414ns ± 5%      402ns ±10%      ~         (p=0.244 n=7+8)
Unmapped-48             226ns ± 2%      224ns ± 2%      ~         (p=0.144 n=7+8)
NumberIsValid           20.1ns ± 4%     19.7ns ± 3%     ~         (p=0.204 n=8+8)
NumberIsValid-6         20.4ns ± 8%     22.2ns ±16%     ~         (p=0.129 n=7+8)
NumberIsValid-48        23.1ns ±12%     23.8ns ± 8%     ~         (p=0.104 n=8+8)
NumberIsValidRegexp     629ns ± 5%      622ns ± 0%      ~         (p=0.148 n=7+7)
NumberIsValidRegexp-6   757ns ± 2%      725ns ±14%      ~         (p=0.351 n=8+7)
NumberIsValidRegexp-48  757ns ± 2%      723ns ±13%      ~         (p=0.521 n=8+8)
SkipValue               13.2ms ± 9%     13.3ms ± 1%     ~         (p=0.130 n=8+8)
SkipValue-6             15.1ms ±10%     14.8ms ± 2%     ~         (p=0.397 n=7+8)
SkipValue-48            13.9ms ±12%     14.3ms ± 1%     ~         (p=0.694 n=8+7)
EncoderEncode           433ns ± 4%      410ns ± 3%      -5.48%    (p=0.001 n=8+8)
EncoderEncode-6         221ns ±15%      75ns ± 5%       -66.15%   (p=0.000 n=7+8)
EncoderEncode-48        161ns ± 4%      19ns ± 7%       -88.29%   (p=0.000 n=7+8)

name                    old speed       new speed       delta
CodeEncoder             139MB/s ±10%    144MB/s ± 4%    ~         (p=0.844 n=8+7)
CodeEncoder-6           756MB/s ± 8%    714MB/s ± 6%    ~         (p=0.065 n=8+8)
CodeEncoder-48          3.08GB/s ± 1%   3.09GB/s ± 1%   ~         (p=0.867 n=8+7)
CodeMarshal             130MB/s ± 5%    130MB/s ± 5%    ~         (p=0.721 n=8+8)
CodeMarshal-6           594MB/s ±10%    601MB/s ±11%    ~         (p=0.798 n=8+8)
CodeMarshal-48          2.62GB/s ± 1%   2.60GB/s ± 2%   ~         (p=0.328 n=8+8)
CodeDecoder             39.0MB/s ± 4%   39.5MB/s ± 4%   ~         (p=0.463 n=7+8)
CodeDecoder-6           189MB/s ±13%    187MB/s ± 3%    ~         (p=0.505 n=8+8)
CodeDecoder-48          746MB/s ± 2%    745MB/s ± 2%    ~         (p=1.000 n=8+8)
CodeUnmarshal           37.2MB/s ± 5%   35.9MB/s ±16%   ~         (p=0.955 n=7+8)
CodeUnmarshal-6         157MB/s ± 6%    158MB/s ± 6%    ~         (p=0.878 n=8+8)
CodeUnmarshal-48        561MB/s ± 7%    572MB/s ±10%    ~         (p=0.442 n=8+8)
SkipValue               141MB/s ±10%    139MB/s ± 1%    ~         (p=0.130 n=8+8)
SkipValue-6             131MB/s ± 3%    133MB/s ± 2%    ~         (p=0.662 n=6+8)
SkipValue-48            138MB/s ±11%    132MB/s ± 1%    ~         (p=0.281 n=8+7)

name                    old alloc/op    new alloc/op    delta
CodeEncoder             45.9kB ± 0%     45.9kB ± 0%     -0.02%    (p=0.002 n=7+8)
CodeEncoder-6           55.1kB ± 0%     55.1kB ± 0%     -0.01%    (p=0.002 n=7+8)
CodeEncoder-48          110kB ± 0%      110kB ± 0%      -0.00%    (p=0.030 n=7+8)
CodeMarshal             4.59MB ± 0%     4.59MB ± 0%     -0.00%    (p=0.000 n=8+8)
CodeMarshal-6           4.59MB ± 0%     4.59MB ± 0%     -0.00%    (p=0.000 n=8+8)
CodeMarshal-48          4.59MB ± 0%     4.59MB ± 0%     -0.00%    (p=0.001 n=7+8)
CodeDecoder             2.28MB ± 5%     2.21MB ± 0%     ~         (p=0.257 n=8+7)
CodeDecoder-6           2.43MB ±11%     2.51MB ± 0%     ~         (p=0.473 n=8+8)
CodeDecoder-48          2.93MB ± 0%     2.93MB ± 0%     ~         (p=0.554 n=7+8)
DecoderStream           16.0B ± 0%      16.0B ± 0%      ~         (all equal)
DecoderStream-6         16.0B ± 0%      16.0B ± 0%      ~         (all equal)
DecoderStream-48        16.0B ± 0%      16.0B ± 0%      ~         (all equal)
CodeUnmarshal           3.28MB ± 0%     3.28MB ± 0%     ~         (p=1.000 n=7+7)
CodeUnmarshal-6         3.28MB ± 0%     3.28MB ± 0%     ~         (p=0.593 n=8+8)
CodeUnmarshal-48        3.28MB ± 0%     3.28MB ± 0%     ~         (p=0.670 n=8+8)
CodeUnmarshalReuse      1.87MB ± 0%     1.88MB ± 1%     +0.48%    (p=0.011 n=7+8)
CodeUnmarshalReuse-6    1.90MB ± 1%     1.90MB ± 1%     ~         (p=0.589 n=8+8)
CodeUnmarshalReuse-48   1.96MB ± 0%     1.96MB ± 0%     +0.00%    (p=0.002 n=7+8)
UnmarshalString         304B ± 0%       304B ± 0%       ~         (all equal)
UnmarshalString-6       304B ± 0%       304B ± 0%       ~         (all equal)
UnmarshalString-48      304B ± 0%       304B ± 0%       ~         (all equal)
UnmarshalFloat64        292B ± 0%       292B ± 0%       ~         (all equal)
UnmarshalFloat64-6      292B ± 0%       292B ± 0%       ~         (all equal)
UnmarshalFloat64-48     292B ± 0%       292B ± 0%       ~         (all equal)
UnmarshalInt64          289B ± 0%       289B ± 0%       ~         (all equal)
UnmarshalInt64-6        289B ± 0%       289B ± 0%       ~         (all equal)
UnmarshalInt64-48       289B ± 0%       289B ± 0%       ~         (all equal)
Issue10335              312B ± 0%       312B ± 0%       ~         (all equal)
Issue10335-6            312B ± 0%       312B ± 0%       ~         (all equal)
Issue10335-48           312B ± 0%       312B ± 0%       ~         (all equal)
Unmapped                344B ± 0%       344B ± 0%       ~         (all equal)
Unmapped-6              344B ± 0%       344B ± 0%       ~         (all equal)
Unmapped-48             344B ± 0%       344B ± 0%       ~         (all equal)
NumberIsValid           0.00B           0.00B           ~         (all equal)
NumberIsValid-6         0.00B           0.00B           ~         (all equal)
NumberIsValid-48        0.00B           0.00B           ~         (all equal)
NumberIsValidRegexp     0.00B           0.00B           ~         (all equal)
NumberIsValidRegexp-6   0.00B           0.00B           ~         (all equal)
NumberIsValidRegexp-48  0.00B           0.00B           ~         (all equal)
SkipValue               0.00B           0.00B           ~         (all equal)
SkipValue-6             0.00B           0.00B           ~         (all equal)
SkipValue-48            15.0B ±167%     0.0B            ~         (p=0.200 n=8+8)
EncoderEncode           8.00B ± 0%      0.00B           -100.00%  (p=0.000 n=8+8)
EncoderEncode-6         8.00B ± 0%      0.00B           -100.00%  (p=0.000 n=8+8)
EncoderEncode-48        8.00B ± 0%      0.00B           -100.00%  (p=0.000 n=8+8)

name                    old allocs/op   new allocs/op   delta
CodeEncoder             1.00 ± 0%       0.00            -100.00%  (p=0.000 n=8+8)
CodeEncoder-6           1.00 ± 0%       0.00            -100.00%  (p=0.000 n=8+8)
CodeEncoder-48          1.00 ± 0%       0.00            -100.00%  (p=0.000 n=8+8)
CodeMarshal             17.0 ± 0%       16.0 ± 0%       -5.88%    (p=0.000 n=8+8)
CodeMarshal-6           17.0 ± 0%       16.0 ± 0%       -5.88%    (p=0.000 n=8+8)
CodeMarshal-48          17.0 ± 0%       16.0 ± 0%       -5.88%    (p=0.000 n=8+8)
CodeDecoder             89.6k ± 0%      89.5k ± 0%      ~         (p=0.154 n=8+7)
CodeDecoder-6           89.8k ± 0%      89.9k ± 0%      ~         (p=0.467 n=8+8)
CodeDecoder-48          90.5k ± 0%      90.5k ± 0%      ~         (p=0.533 n=8+7)
DecoderStream           2.00 ± 0%       2.00 ± 0%       ~         (all equal)
DecoderStream-6         2.00 ± 0%       2.00 ± 0%       ~         (all equal)
DecoderStream-48        2.00 ± 0%       2.00 ± 0%       ~         (all equal)
CodeUnmarshal           105k ± 0%       105k ± 0%       ~         (all equal)
CodeUnmarshal-6         105k ± 0%       105k ± 0%       ~         (all equal)
CodeUnmarshal-48        105k ± 0%       105k ± 0%       ~         (all equal)
CodeUnmarshalReuse      89.5k ± 0%      89.6k ± 0%      ~         (p=0.246 n=7+8)
CodeUnmarshalReuse-6    89.8k ± 0%      89.8k ± 0%      ~         (p=1.000 n=8+8)
CodeUnmarshalReuse-48   90.5k ± 0%      90.5k ± 0%      ~         (all equal)
UnmarshalString         2.00 ± 0%       2.00 ± 0%       ~         (all equal)
UnmarshalString-6       2.00 ± 0%       2.00 ± 0%       ~         (all equal)
UnmarshalString-48      2.00 ± 0%       2.00 ± 0%       ~         (all equal)
UnmarshalFloat64        2.00 ± 0%       2.00 ± 0%       ~         (all equal)
UnmarshalFloat64-6      2.00 ± 0%       2.00 ± 0%       ~         (all equal)
UnmarshalFloat64-48     2.00 ± 0%       2.00 ± 0%       ~         (all equal)
UnmarshalInt64          2.00 ± 0%       2.00 ± 0%       ~         (all equal)
UnmarshalInt64-6        2.00 ± 0%       2.00 ± 0%       ~         (all equal)
UnmarshalInt64-48       2.00 ± 0%       2.00 ± 0%       ~         (all equal)
Issue10335              3.00 ± 0%       3.00 ± 0%       ~         (all equal)
Issue10335-6            3.00 ± 0%       3.00 ± 0%       ~         (all equal)
Issue10335-48           3.00 ± 0%       3.00 ± 0%       ~         (all equal)
Unmapped                4.00 ± 0%       4.00 ± 0%       ~         (all equal)
Unmapped-6              4.00 ± 0%       4.00 ± 0%       ~         (all equal)
Unmapped-48             4.00 ± 0%       4.00 ± 0%       ~         (all equal)
NumberIsValid           0.00            0.00            ~         (all equal)
NumberIsValid-6         0.00            0.00            ~         (all equal)
NumberIsValid-48        0.00            0.00            ~         (all equal)
NumberIsValidRegexp     0.00            0.00            ~         (all equal)
NumberIsValidRegexp-6   0.00            0.00            ~         (all equal)
NumberIsValidRegexp-48  0.00            0.00            ~         (all equal)
SkipValue               0.00            0.00            ~         (all equal)
SkipValue-6             0.00            0.00            ~         (all equal)
SkipValue-48            0.00            0.00            ~         (all equal)
EncoderEncode           1.00 ± 0%       0.00            -100.00%  (p=0.000 n=8+8)
EncoderEncode-6         1.00 ± 0%       0.00            -100.00%  (p=0.000 n=8+8)
EncoderEncode-48        1.00 ± 0%       0.00            -100.00%  (p=0.000 n=8+8)

https://perf.golang.org/search?q=upload:20170427.2

updates #17973
updates #18177

Change-Id: I5881c7a2bfad1766e6aa3444bb630883e0be467b
Reviewed-on: https://go-review.googlesource.com/41931
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-28 20:19:10 +00:00
Josh Bleecher Snyder	92363d52c0	cmd/compile: check width of embedded interfaces in expandiface The code in #20162 contains an embedded interface. It didn't get dowidth'd by the frontend, and during DWARF generation, ngotype asked for a string description of it, which triggered a request for the number of fields in the interface, which triggered a dowidth, which is disallowed in the backend. The other changes in this CL are to support the test. Fixes #20162 Change-Id: I4d0be5bd949c361d4cdc89a8ed28b10977e40cf9 Reviewed-on: https://go-review.googlesource.com/42131 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-28 20:07:38 +00:00
Michael Hudson-Doyle	e29ea14100	cmd/link/internal/ld: unexport ReadOnly and RelROMap Change-Id: I08e33b92dd8a22e28ec15aa5753904aa8e1c71f5 Reviewed-on: https://go-review.googlesource.com/42031 Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-28 20:02:50 +00:00
Michael Hudson-Doyle	4aca8b00ff	cmd/internal/objabi: shrink SymType down to a uint8 Now that it only takes small values. Change-Id: I08086d392529d8775b470d65afc2475f8d0e7f4a Reviewed-on: https://go-review.googlesource.com/42030 Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-28 20:02:20 +00:00
Michael Hudson-Doyle	d2a9545178	cmd/internal: remove SymKind values that are only checked for, never set Change-Id: Id152767c033c12966e9e12ae303b99f38776f919 Reviewed-on: https://go-review.googlesource.com/40987 Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-04-28 20:01:54 +00:00
Josh Bleecher Snyder	794d29a46f	cmd/compile: use a map to track liveness variable indices

It is not safe to modify Node.Opt in the backend. Instead of using Node.Opt to store liveness variable indices, use a map. This simplifies the code and makes it much more clearly race-free.

There are generally few such variables, so the maps are not a significant source of allocations; this also removes some allocations from putting int32s into interfaces.

Because map lookups are more expensive than interface value extraction, reorder valueEffects to do the map lookup last.

The only remaining use of Node.Opt is now in esc.go.

Passes toolstash-check.

Fixes #20144

name        old alloc/op    new alloc/op    delta
Template    37.8MB ± 0%     37.9MB ± 0%     ~         (p=0.548 n=5+5)
Unicode     28.9MB ± 0%     28.9MB ± 0%     ~         (p=0.548 n=5+5)
GoTypes     110MB ± 0%      110MB ± 0%      +0.16%    (p=0.008 n=5+5)
Compiler    461MB ± 0%      462MB ± 0%      +0.08%    (p=0.008 n=5+5)
SSA         1.11GB ± 0%     1.11GB ± 0%     +0.11%    (p=0.008 n=5+5)
Flate       24.7MB ± 0%     24.7MB ± 0%     ~         (p=0.690 n=5+5)
GoParser    31.1MB ± 0%     31.1MB ± 0%     ~         (p=0.841 n=5+5)
Reflect     73.7MB ± 0%     73.8MB ± 0%     +0.23%    (p=0.008 n=5+5)
Tar         25.8MB ± 0%     25.7MB ± 0%     ~         (p=0.690 n=5+5)
XML         41.2MB ± 0%     41.2MB ± 0%     ~         (p=0.841 n=5+5)
[Geo mean]  71.9MB          71.9MB          +0.06%

name        old allocs/op   new allocs/op   delta
Template    385k ± 0%       384k ± 0%       ~         (p=0.548 n=5+5)
Unicode     344k ± 0%       343k ± 1%       ~         (p=0.421 n=5+5)
GoTypes     1.16M ± 0%      1.16M ± 0%      ~         (p=0.690 n=5+5)
Compiler    4.43M ± 0%      4.42M ± 0%      ~         (p=0.095 n=5+5)
SSA         9.86M ± 0%      9.84M ± 0%      -0.19%    (p=0.008 n=5+5)
Flate       238k ± 0%       238k ± 0%       ~         (p=1.000 n=5+5)
GoParser    321k ± 0%       320k ± 0%       ~         (p=0.310 n=5+5)
Reflect     956k ± 0%       956k ± 0%       ~         (p=1.000 n=5+5)
Tar         252k ± 0%       251k ± 0%       ~         (p=0.056 n=5+5)
XML         402k ± 1%       400k ± 1%       -0.57%    (p=0.032 n=5+5)
[Geo mean]  740k            739k            -0.19%

Change-Id: Id5916c9def76add272e89c59fe10968f0a6bb01d
Reviewed-on: https://go-review.googlesource.com/42135
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-04-28 19:50:53 +00:00
Brad Fitzpatrick	07a22bbc11	net/http: re-simplify HTTP/1.x status line writing

It used to be simple, and then it got complicated for speed (to reduce allocations, mostly), but that involved a mutex and hurt multi-core performance, contending on the mutex.

A change was sent to try to improve that mutex contention in https://go-review.googlesource.com/c/42110/2/src/net/http/server.go but that introduced its own allocations (the string->interface{} boxing for the sync.Map key), which runs counter to the whole point of that statusLine function: to remove allocations.

Instead, make the code simple again and not have a mutex.

It's a bit slower for the single-core case, but nobody with a single-user HTTP server cares about 50 nanoseconds:

name                  old time/op     new time/op     delta
ResponseStatusLine    37.5ns ± 2%     87.1ns ± 2%     +132.42%  (p=0.029 n=4+4)
ResponseStatusLine-2  63.1ns ± 1%     43.1ns ±12%     -31.67%   (p=0.029 n=4+4)
ResponseStatusLine-4  53.8ns ± 8%     40.2ns ± 2%     -25.29%   (p=0.029 n=4+4)

name                  old alloc/op    new alloc/op    delta
ResponseStatusLine    0.00B ±NaN%     0.00B ±NaN%     ~         (all samples are equal)
ResponseStatusLine-2  0.00B ±NaN%     0.00B ±NaN%     ~         (all samples are equal)
ResponseStatusLine-4  0.00B ±NaN%     0.00B ±NaN%     ~         (all samples are equal)

name                  old allocs/op   new allocs/op   delta
ResponseStatusLine    0.00 ±NaN%      0.00 ±NaN%      ~         (all samples are equal)
ResponseStatusLine-2  0.00 ±NaN%      0.00 ±NaN%      ~         (all samples are equal)
ResponseStatusLine-4  0.00 ±NaN%      0.00 ±NaN%      ~         (all samples are equal)

(Note the code could be even simpler with fmt.Fprintf, but that is relatively slow and involves a bunch of allocations getting arguments into interface{} for the call)

Change-Id: I1fa119132dbbf97a8e7204ce3e0707d433060da2
Reviewed-on: https://go-review.googlesource.com/42133
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>	2017-04-28 19:11:17 +00:00
Daniel Martí	16b6bb88eb	cmd/go: error on space-separated list with comma Using 'go build -tags "foo,bar"' might seem to work when you wanted -tags "foo bar", since they make up a single tag that doesn't exist and the build is unaffected. Instead, error on any tag that contains a comma. Fixes #18800. Change-Id: I6641e03e2ae121c8878d6301c4311aef97026b73 Reviewed-on: https://go-review.googlesource.com/41951 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-04-28 19:08:35 +00:00
Ian Lance Taylor	60db9fb6bc	cmd/go: don't run TestTestRaceInstall in short mode Fixes #20158 Change-Id: Iefa9a33569eb805f5ab678d17c37787835bc7efa Reviewed-on: https://go-review.googlesource.com/42134 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-28 18:34:49 +00:00
Justin Nuß	585be4639b	os/exec: document that non-comparable writers may race The comment for Cmd.Stdout and Cmd.Stderr says that it's safe to set both to the same writer, but it doesn't say that this only works when both writers are comparable. This change updates the comment to explain that using a non-comparable writer may still lead to a race. Fixes #19804 Change-Id: I63b420034666209a2b6fab48b9047c9d07b825e2 Reviewed-on: https://go-review.googlesource.com/42052 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-04-28 17:57:01 +00:00
Michael Matloob	f105c91757	runtime/pprof: propagate profile labels into profile proto Profile labels added by the user using pprof.Do, if present, will be in a *labelMap stored in the unsafe.Pointer 'tag' field of the profile map entry. This change extracts the labels from the tag field and writes them to the profile proto. Change-Id: Ic40fdc58b66e993ca91d5d5effe0e04ffbb5bc46 Reviewed-on: https://go-review.googlesource.com/39613 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-04-28 17:37:58 +00:00
Russ Cox	c82efb1fa3	runtime: fix profile handling of labels for race detector If g1 sets its labels and then they are copied into a profile buffer and then g2 reads the profile buffer and inspects the labels, the race detector must understand that g1's recording of the labels happens before g2's use of the labels. Make that so. Fixes race test failure in CL 39613. Change-Id: Id7cda1c2aac6f8eef49213b5ca414f7154b4acfa Reviewed-on: https://go-review.googlesource.com/42111 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>	2017-04-28 17:37:46 +00:00
Robert Griesemer	50f67add81	spec: clarify admissible argument types for print, println Fixes #19885. Change-Id: I55420aace1b0f714df2d6460d2d1595f6863dd06 Reviewed-on: https://go-review.googlesource.com/42023 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-04-28 16:37:31 +00:00
Robert Griesemer	86cfe93515	bytes: clarify documentation for UnreadByte/Rune Fixes #19522. Change-Id: Ib3cf0336e0bf91580d533704ec1a9d45eb0bf62d Reviewed-on: https://go-review.googlesource.com/42020 Reviewed-by: Rob Pike <r@golang.org>	2017-04-28 16:37:13 +00:00
Josh Bleecher Snyder	85d6a29ae6	cmd/compile: prevent infinite recursion printing types in Fatalf Updates #20162 Change-Id: Ie289bae0d0be8430e492ac73fd6e6bf36991d4a1 Reviewed-on: https://go-review.googlesource.com/42130 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-28 16:08:01 +00:00
Dmitri Shuralyov	6511931810	cmd/go/internal/get: allow go get on github.com/ import paths with Unicode letters More specifically, allow Unicode letters in the directories of GitHub repositories, which can occur and don't have a valid reason to be disallowed by go get. Do so by using a predefined character class, the Unicode character property class \p{L} that describes the Unicode characters that are letters: http://www.regular-expressions.info/unicode.html#category Since it's not possible to create GitHub usernames or repositories containing Unicode letters at this time, those parts of the import path are still restricted to ASCII letters only. Fix name of tested func in t.Errorf messages. Fixes #18660. Change-Id: Ia0ef4742bfd8317d989ef1eb1d7065e382852fe2 Reviewed-on: https://go-review.googlesource.com/41822 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-04-28 15:32:18 +00:00
Bryan C. Mills	eb6adc27d5	encoding/xml: replace tinfoMap RWMutex with sync.Map

This simplifies the code a bit and provides a modest speedup for Marshal with many CPUs.

updates #17973
updates #18177

name          old time/op     new time/op     delta
Marshal       15.8µs ± 1%     15.9µs ± 1%     +0.67%    (p=0.021 n=8+7)
Marshal-6     5.76µs ±11%     5.17µs ± 2%     -10.36%   (p=0.002 n=8+8)
Marshal-48    9.88µs ± 5%     7.31µs ± 6%     -26.04%   (p=0.000 n=8+8)
Unmarshal     44.7µs ± 3%     45.1µs ± 5%     ~         (p=0.645 n=8+8)
Unmarshal-6   12.1µs ± 7%     11.8µs ± 8%     ~         (p=0.442 n=8+8)
Unmarshal-48  18.7µs ± 3%     18.2µs ± 4%     ~         (p=0.054 n=7+8)

name          old alloc/op    new alloc/op    delta
Marshal       5.78kB ± 0%     5.78kB ± 0%     ~         (all equal)
Marshal-6     5.78kB ± 0%     5.78kB ± 0%     ~         (all equal)
Marshal-48    5.78kB ± 0%     5.78kB ± 0%     ~         (all equal)
Unmarshal     8.58kB ± 0%     8.58kB ± 0%     ~         (all equal)
Unmarshal-6   8.58kB ± 0%     8.58kB ± 0%     ~         (all equal)
Unmarshal-48  8.58kB ± 0%     8.58kB ± 0%     ~         (p=1.000 n=8+8)

name          old allocs/op   new allocs/op   delta
Marshal       23.0 ± 0%       23.0 ± 0%       ~         (all equal)
Marshal-6     23.0 ± 0%       23.0 ± 0%       ~         (all equal)
Marshal-48    23.0 ± 0%       23.0 ± 0%       ~         (all equal)
Unmarshal     189 ± 0%        189 ± 0%        ~         (all equal)
Unmarshal-6   189 ± 0%        189 ± 0%        ~         (all equal)
Unmarshal-48  189 ± 0%        189 ± 0%        ~         (all equal)

https://perf.golang.org/search?q=upload:20170427.5

Change-Id: I4ee95a99540d3e4e47e056fff18357efd2cd340a
Reviewed-on: https://go-review.googlesource.com/41991
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-04-28 14:36:14 +00:00
Alberto Donizetti	8db4d02e8f	cmd/go: reject buildmode=pie when -race is enabled Fixes #20038 Change-Id: Id692790ea406892bbe29090d461356bac28b6150 Reviewed-on: https://go-review.googlesource.com/41333 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-04-28 07:27:25 +00:00
Kevin Burke	89ebdbb5fd	regexp: speed up QuoteMeta with a lookup table

This is the same technique used in CL 24466. By adding a little bit of size to the binary, we can remove a function call and gain a lot of performance.

A raw array ([128]bool) would be faster, but would also be 128 bytes instead of 16.

Running tip on a Mac:

name             old time/op     new time/op     delta
QuoteMetaAll-4   192ns ±12%      120ns ±11%      -37.27%   (p=0.000 n=10+10)
QuoteMetaNone-4  186ns ± 6%      64ns ± 6%       -65.52%   (p=0.000 n=10+10)

name             old speed       new speed       delta
QuoteMetaAll-4   73.2MB/s ±11%   116.6MB/s ±10%  +59.21%   (p=0.000 n=10+10)
QuoteMetaNone-4  139MB/s ± 6%    405MB/s ± 6%    +190.74%  (p=0.000 n=10+10)

Change-Id: I68ce9fe2ef1c28e2274157789b35b0dd6ae3efb5
Reviewed-on: https://go-review.googlesource.com/41495
Run-TryBot: Kevin Burke <kev@inburke.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-28 06:43:14 +00:00
Nigel Tao	642a1cc756	compress/lzw: fix hi code overflow. Change-Id: I2d3c3c715d857305944cd96c45554a16cb7967e9 Reviewed-on: https://go-review.googlesource.com/42032 Reviewed-by: David Symonds <dsymonds@golang.org>	2017-04-28 05:59:30 +00:00
Tommy Schaefer	4fcceca192	syscall: fix typo in documentation for StringToUTF16Ptr Fixes #20133 Change-Id: Ic1a6eb35de1f9ddac9527335eb49bf0b52963b6a Reviewed-on: https://go-review.googlesource.com/41992 Reviewed-by: Rob Pike <r@golang.org>	2017-04-28 05:28:27 +00:00
Josh Bleecher Snyder	c51559813f	cmd/compile: add sizeCalculationDisabled flag Use it to ensure that dowidth is not called from the backend on a type whose size has not yet been calculated. This is an alternative to CL 42016. Change-Id: I8c7b4410ee4c2a68573102f6b9b635f4fdcf392e Reviewed-on: https://go-review.googlesource.com/42018 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-04-28 01:24:52 +00:00
Josh Bleecher Snyder	dae5389d3d	Revert "cmd/compile: add Type.MustSize and Type.MustAlignment" This reverts commit `94d540a4b6`. Reason for revert: prefer something along the lines of CL 42018. Change-Id: I876fe32e98f37d8d725fe55e0fd0ea429c0198e0 Reviewed-on: https://go-review.googlesource.com/42022 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-04-28 01:24:13 +00:00
Mikio Hara	3a342af977	net: simplify probeWindowsIPStack Change-Id: Ia45f05c63611ade4fe605b389c404953a7afbd1d Reviewed-on: https://go-review.googlesource.com/41837 Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-04-28 00:03:30 +00:00
Mikio Hara	bf4cd98c8b	net: make zone helpers into methods of ipv6ZoneCache Change-Id: Id93e78f0c8bef125f124a0a919053208e24a63cd Reviewed-on: https://go-review.googlesource.com/41836 Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-04-28 00:03:17 +00:00
Mikio Hara	cf74533b6b	syscall: stylistic cleanup and typo fixes in syscall_dragonfly.go Now it's not very different from syscall_dragonfly.go in golang.org/x/sys/unix repository. Change-Id: I8dfd22e1ebce9dc2cc71ab9ab7f0c92d93b2b762 Reviewed-on: https://go-review.googlesource.com/41835 Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-04-28 00:03:04 +00:00
Josh Bleecher Snyder	12c286c149	cmd/compile: minor writebarrier cleanup This CL mainly moves some work to the switch on w.Op, to make a follow-up change simpler and clearer. Updates #19838 Change-Id: I86f3181c380dd60960afcc24224f655276b8956c Reviewed-on: https://go-review.googlesource.com/42010 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-04-27 23:44:49 +00:00
Josh Bleecher Snyder	fc08a19cef	cmd/compile: move Used from gc.Node to gc.Name Node.Used was written to from the backend concurrently with reads of Node.Class for the same ONAME Nodes. I do not know why it was not failing consistently under the race detector, but it is a race. This is likely also a problem with Node.HasVal and Node.HasOpt. They will be handled in a separate CL. Fix Used by moving it to gc.Name and making it a separate bool. There was one non-Name use of Used, marking OLABELs as used. That is no longer needed, now that goto and label checking happens early in the front end. Leave the getters and setters in place, to ease changing the representation in the future (or changing to an interface!). Updates #20144 Change-Id: I9bec7c6d33dcb129a4cfa9d338462ea33087f9f7 Reviewed-on: https://go-review.googlesource.com/42015 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-04-27 22:58:13 +00:00
Josh Bleecher Snyder	94d540a4b6	cmd/compile: add Type.MustSize and Type.MustAlignment Type.Size and Type.Alignment are for the front end: They calculate size and alignment if needed. Type.MustSize and Type.MustAlignment are for the back end: They call Fatal if size and alignment are not already calculated. Most uses are of MustSize and MustAlignment, but that's because the back end is newer, and this API was added to support it. This CL was mostly generated with sed and selective reversion. The only mildly interesting bit is the change of the ssa.Type interface and the supporting ssa dummy types. Follow-up to review feedback on CL 41970. Passes toolstash-check. Change-Id: I0d9b9505e57453dae8fb6a236a07a7a02abd459e Reviewed-on: https://go-review.googlesource.com/42016 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-04-27 22:57:57 +00:00
Josh Bleecher Snyder	0b6a10ef24	cmd/compile: dowidth more in the front end dowidth is fundamentally unsafe to call from the back end; it will cause data races. Replace all calls to dowidth in the backend with assertions that the width has been calculated. Then fix all the cases in which that was not so, including the cases from #20145. Fixes #20145. Change-Id: Idba3d19d75638851a30ec2ebcdb703c19da3e92b Reviewed-on: https://go-review.googlesource.com/41970 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-04-27 22:10:32 +00:00
Michael Hudson-Doyle	be2ee2a4b4	cmd/internal/objabi, cmd/link: move linker-only symkind values into linker Many (most!) of the values of objabi.SymKind are used only in the linker, so this creates a separate cmd/link/internal/ld.SymKind type, removes most values from SymKind and maps one to the other when reading object files in the linker. Two of the remaining objabi.SymKind values are only checked for, never set, and so will never actually be found, but I wanted to keep this to the most mechanical change possible. Change-Id: I4bbc5aed6713cab3e8de732e6e288eb77be0474c Reviewed-on: https://go-review.googlesource.com/40985 Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-04-27 21:56:12 +00:00
Hana Kim	b1868cf107	dwarf: add marker for embedded fields in dwarf

Currently, the following two programs generate identical DWARF info for type Foo.

prog 1)
type Foo struct {
   Bar
}

prog 2)
type Foo struct {
   Bar Bar
}

This change adds a Go-specific attribute DW_AT_go_embedded_field to annotate each member entry. Its absence or false value indicates the corresponding member is not an embedded field.

Update #20037

Change-Id: Ibcbd2714f3e4d97c7b523d7398f29ab2301cc897
Reviewed-on: https://go-review.googlesource.com/41873
Reviewed-by: David Chase <drchase@google.com>	2017-04-27 19:57:02 +00:00
Josh Bleecher Snyder	f5c878e030	cmd/compile: randomize compilation order when race-enabled There's been one failure on the race builder so far, before we started sorting functions by length. The race detector can only detect actual races, and ordering functions by length might reduce the odds of catching some kinds of races. Give it more to chew on. Updates #20144 Change-Id: I0206ac182cb98b70a729dea9703ecb0fef54d2d0 Reviewed-on: https://go-review.googlesource.com/41973 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-27 19:27:22 +00:00
Josh Bleecher Snyder	26e126d6e6	cmd/compile: move nodarg to walk.go Its sole use is in walk.go. 100% code movement. gsubr.go increasingly contains backend-y things. With a few more relocations, it could probably be fruitfully renamed progs.go. Change-Id: I61ec5c2bc1f8cfdda64c6d6f580952c154ff60e0 Reviewed-on: https://go-review.googlesource.com/41972 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-27 19:08:26 +00:00
Josh Bleecher Snyder	fcee3777fd	cmd/compile: move addrescapes and moveToHeap to esc.go They were used only in esc.go. 100% code movement. Also, remove the rather outdated comment at the top of gen.go. It's not really clear what gen.go is for any more. Change-Id: Iaedfe7015ef6f5c11c49f3e6721b15d779a00faa Reviewed-on: https://go-review.googlesource.com/41971 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-27 19:08:20 +00:00
Keith Randall	14f3ca56ed	cmd/internal/obj: ARM, use immediates instead of constant pool entries When a constant doesn't fit in a single instruction, use two paired instructions instead of the constant pool. For example ADD $0xaa00bb, R0, R1 Used to rewrite to: MOV ?(IP), R11 ADD R11, R0, R1 Instead, do: ADD $0xaa0000, R0, R1 ADD $0xbb, R1, R1 Same number of instructions. Good: 4 fewer bytes (no constant pool entry). One less load. Bad: Critical path is one instruction longer. It's probably worth it to avoid the loads; they are expensive. Dave Cheney got us some performance numbers: https://perf.golang.org/search?q=upload:20170426.1 TL;DR mean 1.37% improvement. Change-Id: Ib206836161fdc94a3962db6f9caa635c87d57cf1 Reviewed-on: https://go-review.googlesource.com/41612 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-04-27 16:45:01 +00:00
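The constant split described in 14f3ca56ed can be sketched in a few lines. This is an illustration only, not the assembler's code: it assumes the standard ARM data-processing immediate form (an 8-bit value rotated right by an even amount), and the helper names encodable and splitImmediate are hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
)

// encodable reports whether v fits a single ARM data-processing immediate:
// an 8-bit value rotated right by an even number of bits.
func encodable(v uint32) bool {
	for rot := 0; rot < 32; rot += 2 {
		// Rotating left by rot undoes a rotate-right by rot.
		if bits.RotateLeft32(v, rot) <= 0xff {
			return true
		}
	}
	return false
}

// splitImmediate tries to write v as hi+lo where both parts are encodable
// immediates with disjoint bits. It is a simple search, not the real algorithm.
func splitImmediate(v uint32) (hi, lo uint32, ok bool) {
	for rot := 0; rot < 32; rot += 2 {
		r := bits.RotateLeft32(v, rot)
		low, rest := r&0xff, r&^0xff
		if low == 0 || rest == 0 {
			continue // nothing to split at this rotation
		}
		// Rotate both pieces back into their original positions.
		lo = bits.RotateLeft32(low, -rot)
		hi = bits.RotateLeft32(rest, -rot)
		if encodable(hi) {
			return hi, lo, true
		}
	}
	return 0, 0, false
}

func main() {
	hi, lo, ok := splitImmediate(0xaa00bb)
	fmt.Printf("%#x %#x %v\n", hi, lo, ok) // 0xaa0000 0xbb true
}
```

For the commit's example, 0xaa00bb is not a valid single immediate, but 0xaa0000 and 0xbb each are, which corresponds to the paired ADD instructions in the message.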
Bryan C. Mills	c120e449fb	encoding/gob: replace RWMutex usage with sync.Map This provides a significant speedup for encoding and decoding when using many CPU cores. name old time/op new time/op delta EndToEndPipe 5.26µs ± 2% 5.38µs ± 7% ~ (p=0.121 n=8+7) EndToEndPipe-6 1.86µs ± 5% 1.80µs ±11% ~ (p=0.442 n=8+8) EndToEndPipe-48 1.39µs ± 2% 1.41µs ± 4% ~ (p=0.645 n=8+8) EndToEndByteBuffer 1.54µs ± 5% 1.57µs ± 5% ~ (p=0.130 n=8+8) EndToEndByteBuffer-6 620ns ± 6% 310ns ± 8% -50.04% (p=0.000 n=8+8) EndToEndByteBuffer-48 506ns ± 4% 110ns ± 3% -78.22% (p=0.000 n=8+8) EndToEndSliceByteBuffer 149µs ± 3% 153µs ± 5% +2.80% (p=0.021 n=8+8) EndToEndSliceByteBuffer-6 103µs ±17% 31µs ±12% -70.06% (p=0.000 n=8+8) EndToEndSliceByteBuffer-48 93.2µs ± 2% 18.0µs ± 5% -80.66% (p=0.000 n=7+8) EncodeComplex128Slice 20.6µs ± 5% 20.9µs ± 8% ~ (p=0.959 n=8+8) EncodeComplex128Slice-6 4.10µs ±10% 3.75µs ± 8% -8.58% (p=0.004 n=8+7) EncodeComplex128Slice-48 1.14µs ± 2% 0.81µs ± 2% -28.98% (p=0.000 n=8+8) EncodeFloat64Slice 10.2µs ± 7% 10.1µs ± 6% ~ (p=0.694 n=7+8) EncodeFloat64Slice-6 2.01µs ± 6% 1.80µs ±11% -10.30% (p=0.004 n=8+8) EncodeFloat64Slice-48 701ns ± 3% 408ns ± 2% -41.72% (p=0.000 n=8+8) EncodeInt32Slice 11.8µs ± 7% 11.7µs ± 6% ~ (p=0.463 n=8+7) EncodeInt32Slice-6 2.32µs ± 4% 2.06µs ± 5% -10.89% (p=0.000 n=8+8) EncodeInt32Slice-48 731ns ± 2% 445ns ± 2% -39.10% (p=0.000 n=7+8) EncodeStringSlice 9.13µs ± 9% 9.18µs ± 8% ~ (p=0.798 n=8+8) EncodeStringSlice-6 1.91µs ± 5% 1.70µs ± 5% -11.07% (p=0.000 n=8+8) EncodeStringSlice-48 679ns ± 3% 397ns ± 3% -41.50% (p=0.000 n=8+8) EncodeInterfaceSlice 449µs ±11% 461µs ± 9% ~ (p=0.328 n=8+8) EncodeInterfaceSlice-6 503µs ± 7% 88µs ± 7% -82.51% (p=0.000 n=7+8) EncodeInterfaceSlice-48 335µs ± 8% 22µs ± 1% -93.55% (p=0.000 n=8+7) DecodeComplex128Slice 67.2µs ± 4% 67.0µs ± 6% ~ (p=0.721 n=8+8) DecodeComplex128Slice-6 22.0µs ± 8% 18.9µs ± 5% -14.44% (p=0.000 n=8+8) DecodeComplex128Slice-48 46.8µs ± 3% 34.9µs ± 3% -25.48% (p=0.000 n=8+8) DecodeFloat64Slice 
39.4µs ± 4% 40.3µs ± 3% ~ (p=0.105 n=8+8) DecodeFloat64Slice-6 16.1µs ± 2% 11.2µs ± 7% -30.64% (p=0.001 n=6+7) DecodeFloat64Slice-48 38.1µs ± 3% 24.0µs ± 7% -37.10% (p=0.000 n=8+8) DecodeInt32Slice 39.1µs ± 4% 40.1µs ± 5% ~ (p=0.083 n=8+8) DecodeInt32Slice-6 16.3µs ±21% 10.6µs ± 1% -35.17% (p=0.000 n=8+7) DecodeInt32Slice-48 36.5µs ± 6% 21.9µs ± 9% -39.89% (p=0.000 n=8+8) DecodeStringSlice 82.9µs ± 6% 85.5µs ± 5% ~ (p=0.121 n=8+7) DecodeStringSlice-6 32.4µs ±11% 26.8µs ±16% -17.37% (p=0.000 n=8+8) DecodeStringSlice-48 76.0µs ± 2% 57.0µs ± 5% -25.02% (p=0.000 n=8+8) DecodeInterfaceSlice 718µs ± 4% 752µs ± 5% +4.83% (p=0.038 n=8+8) DecodeInterfaceSlice-6 500µs ± 6% 165µs ± 7% -66.95% (p=0.000 n=7+8) DecodeInterfaceSlice-48 470µs ± 5% 120µs ± 6% -74.55% (p=0.000 n=8+7) DecodeMap 3.29ms ± 5% 3.34ms ± 5% ~ (p=0.279 n=8+8) DecodeMap-6 7.73ms ± 8% 7.53ms ±18% ~ (p=0.779 n=7+8) DecodeMap-48 7.46ms ± 6% 7.71ms ± 3% ~ (p=0.161 n=8+8) https://perf.golang.org/search?q=upload:20170426.4 Change-Id: I335874028ef8d7c991051004f8caadd16c92d5cc Reviewed-on: https://go-review.googlesource.com/41872 Run-TryBot: Bryan Mills <bcmills@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-04-27 15:34:57 +00:00
Bryan C. Mills	33b92cd6ce	reflect: use sync.Map instead of RWMutex for type caches This provides a significant speedup when using reflection-heavy code on many CPU cores, such as when marshaling or unmarshaling protocol buffers. updates #17973 updates #18177 name old time/op new time/op delta Call 239ns ±10% 245ns ± 7% ~ (p=0.562 n=10+9) Call-6 201ns ±38% 48ns ±29% -76.39% (p=0.000 n=10+9) Call-48 133ns ± 8% 12ns ± 2% -90.92% (p=0.000 n=10+8) CallArgCopy/size=128 169ns ±12% 197ns ± 2% +16.35% (p=0.000 n=10+7) CallArgCopy/size=128-6 142ns ± 9% 34ns ± 7% -76.10% (p=0.000 n=10+9) CallArgCopy/size=128-48 125ns ± 3% 9ns ± 7% -93.01% (p=0.000 n=8+8) CallArgCopy/size=256 177ns ± 8% 197ns ± 5% +11.24% (p=0.000 n=10+9) CallArgCopy/size=256-6 148ns ±11% 35ns ± 6% -76.23% (p=0.000 n=10+9) CallArgCopy/size=256-48 127ns ± 4% 9ns ± 9% -92.66% (p=0.000 n=10+9) CallArgCopy/size=1024 196ns ± 6% 228ns ± 7% +16.09% (p=0.000 n=10+9) CallArgCopy/size=1024-6 143ns ± 6% 42ns ± 5% -70.39% (p=0.000 n=8+8) CallArgCopy/size=1024-48 130ns ± 7% 10ns ± 1% -91.99% (p=0.000 n=10+8) CallArgCopy/size=4096 330ns ± 9% 351ns ± 5% +6.20% (p=0.004 n=10+9) CallArgCopy/size=4096-6 173ns ±14% 62ns ± 6% -63.83% (p=0.000 n=10+8) CallArgCopy/size=4096-48 141ns ± 6% 15ns ± 6% -89.59% (p=0.000 n=10+8) CallArgCopy/size=65536 7.71µs ±10% 7.74µs ±10% ~ (p=0.859 n=10+9) CallArgCopy/size=65536-6 1.33µs ± 4% 1.34µs ± 6% ~ (p=0.720 n=10+9) CallArgCopy/size=65536-48 347ns ± 2% 344ns ± 2% ~ (p=0.202 n=10+9) PtrTo 30.2ns ±10% 41.3ns ±11% +36.97% (p=0.000 n=10+9) PtrTo-6 126ns ± 6% 7ns ±10% -94.47% (p=0.000 n=9+9) PtrTo-48 86.9ns ± 9% 1.7ns ± 9% -98.08% (p=0.000 n=10+9) FieldByName1 86.6ns ± 5% 87.3ns ± 7% ~ (p=0.737 n=10+9) FieldByName1-6 19.8ns ±10% 18.7ns ±10% ~ (p=0.073 n=9+9) FieldByName1-48 7.54ns ± 4% 7.74ns ± 5% +2.55% (p=0.023 n=9+9) FieldByName2 1.63µs ± 8% 1.70µs ± 4% +4.13% (p=0.020 n=9+9) FieldByName2-6 481ns ± 6% 490ns ±10% ~ (p=0.474 n=9+9) FieldByName2-48 723ns ± 3% 736ns ± 2% +1.76% (p=0.045 n=8+8) 
FieldByName3 10.5µs ± 7% 10.8µs ± 7% ~ (p=0.234 n=8+8) FieldByName3-6 2.78µs ± 3% 2.94µs ±10% +5.87% (p=0.031 n=9+9) FieldByName3-48 3.72µs ± 2% 3.91µs ± 5% +4.91% (p=0.003 n=9+9) InterfaceBig 10.8ns ± 5% 10.7ns ± 5% ~ (p=0.849 n=9+9) InterfaceBig-6 9.62ns ±81% 1.79ns ± 4% -81.38% (p=0.003 n=9+9) InterfaceBig-48 0.48ns ±34% 0.50ns ± 7% ~ (p=0.071 n=8+9) InterfaceSmall 10.7ns ± 5% 10.9ns ± 4% ~ (p=0.243 n=9+9) InterfaceSmall-6 1.85ns ± 5% 1.79ns ± 1% -2.97% (p=0.006 n=7+8) InterfaceSmall-48 0.49ns ±20% 0.48ns ± 5% ~ (p=0.740 n=7+9) New 28.2ns ±20% 26.6ns ± 3% ~ (p=0.617 n=9+9) New-6 4.69ns ± 4% 4.44ns ± 3% -5.33% (p=0.001 n=9+9) New-48 1.10ns ± 9% 1.08ns ± 6% ~ (p=0.285 n=9+8) name old alloc/op new alloc/op delta Call 0.00B 0.00B ~ (all equal) Call-6 0.00B 0.00B ~ (all equal) Call-48 0.00B 0.00B ~ (all equal) name old allocs/op new allocs/op delta Call 0.00 0.00 ~ (all equal) Call-6 0.00 0.00 ~ (all equal) Call-48 0.00 0.00 ~ (all equal) name old speed new speed delta CallArgCopy/size=128 757MB/s ±11% 649MB/s ± 1% -14.33% (p=0.000 n=10+7) CallArgCopy/size=128-6 901MB/s ± 9% 3781MB/s ± 7% +319.69% (p=0.000 n=10+9) CallArgCopy/size=128-48 1.02GB/s ± 2% 14.63GB/s ± 6% +1337.98% (p=0.000 n=8+8) CallArgCopy/size=256 1.45GB/s ± 9% 1.30GB/s ± 5% -10.17% (p=0.000 n=10+9) CallArgCopy/size=256-6 1.73GB/s ±11% 7.28GB/s ± 7% +320.76% (p=0.000 n=10+9) CallArgCopy/size=256-48 2.00GB/s ± 4% 27.46GB/s ± 9% +1270.85% (p=0.000 n=10+9) CallArgCopy/size=1024 5.21GB/s ± 6% 4.49GB/s ± 8% -13.74% (p=0.000 n=10+9) CallArgCopy/size=1024-6 7.18GB/s ± 7% 24.17GB/s ± 5% +236.64% (p=0.000 n=9+8) CallArgCopy/size=1024-48 7.87GB/s ± 7% 98.43GB/s ± 1% +1150.99% (p=0.000 n=10+8) CallArgCopy/size=4096 12.3GB/s ± 6% 11.7GB/s ± 5% -5.00% (p=0.008 n=9+9) CallArgCopy/size=4096-6 23.8GB/s ±16% 65.6GB/s ± 5% +175.02% (p=0.000 n=10+8) CallArgCopy/size=4096-48 29.0GB/s ± 7% 279.6GB/s ± 6% +862.87% (p=0.000 n=10+8) CallArgCopy/size=65536 8.52GB/s ±11% 8.49GB/s ± 9% ~ (p=0.842 n=10+9) CallArgCopy/size=65536-6 
49.3GB/s ± 4% 49.0GB/s ± 6% ~ (p=0.720 n=10+9) CallArgCopy/size=65536-48 189GB/s ± 2% 190GB/s ± 2% ~ (p=0.211 n=10+9) https://perf.golang.org/search?q=upload:20170426.3 Change-Id: Iff68f18ef69defb7f30962e21736ac7685a48a27 Reviewed-on: https://go-review.googlesource.com/41871 Run-TryBot: Bryan Mills <bcmills@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-27 15:34:41 +00:00
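The pattern behind the sync.Map commits (33b92cd6ce, c120e449fb, ce5263ff8d) is a read-mostly cache whose hits should not contend on a lock. A minimal sketch of that pattern follows; it is not the code from reflect or encoding/gob, and the cache typeNames and the helper cachedNameOf are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"reflect"
	"sync"
)

// typeNames is a hypothetical read-mostly cache keyed by reflect.Type.
// With sync.Map, a hit is a lock-free Load instead of an RWMutex.RLock.
var typeNames sync.Map // map[reflect.Type]string

// cachedNameOf returns the cached name of v's dynamic type,
// computing and storing it on first use.
func cachedNameOf(v interface{}) string {
	t := reflect.TypeOf(v)
	if name, ok := typeNames.Load(t); ok {
		return name.(string)
	}
	// Two goroutines may compute the value at the same time;
	// LoadOrStore keeps whichever was stored first.
	name, _ := typeNames.LoadOrStore(t, t.String())
	return name.(string)
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = cachedNameOf(42) // concurrent lookups of the same key
		}()
	}
	wg.Wait()
	fmt.Println(cachedNameOf(42), cachedNameOf("x")) // int string
}
```

This shape suits caches where keys are written once and read many times, which matches the many-core improvements in the benchmark tables above. The single-core rows in those tables show some cases getting slower, so the trade-off is not free.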
Elias Naur	6e54fe47ce	misc/ios: increase iOS test harness timeout The "lldb start" phase often times out on the iOS builder. Increase the timeout and see if that helps. Change-Id: I92fd67cbfa90659600e713198d6b2c5c78dde20f Reviewed-on: https://go-review.googlesource.com/41863 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Elias Naur <elias.naur@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-04-27 14:42:37 +00:00
Weichao Tang	e51e0f9cdd	net/http: close resp.Body when error occurred during redirection Fixes #19976 Change-Id: I48486467066784a9dcc24357ec94a1be85265a6f Reviewed-on: https://go-review.googlesource.com/40940 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-27 14:28:47 +00:00
Wei Xiao	2b6c58f6d5	cmd/internal/obj/arm64: fix encoding of condition The current code treats the condition as a special register and writes its raw data directly into the instruction. The fix converts the raw data into the correct condition encoding. Also fix the operand category of FCCMP. Add tests to cover all cases. Change-Id: Ib194041bd9017dd0edbc241564fe983082ac616b Reviewed-on: https://go-review.googlesource.com/41511 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-04-27 13:35:59 +00:00
Ian Lance Taylor	220e0e0f73	os: use kernel limit on pipe size if possible Fixes #20134 Change-Id: I92699d118c713179961c037a6bbbcbec4efa63ba Reviewed-on: https://go-review.googlesource.com/41823 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-04-27 04:42:21 +00:00
Nigel Tao	35cbc3b55b	image/jpeg: fix extended sequential Huffman table selector (Th). Previously, the package did not distinguish between baseline and extended sequential images. Both are non-progressive images, but the Th range differs between the two, as per Annex B of https://www.w3.org/Graphics/JPEG/itu-t81.pdf Extended sequential images are often emitted by the Guetzli encoder. Fixes #19913 Change-Id: I3d0f9e16d5d374ee1c65e3a8fb87519de61cff94 Reviewed-on: https://go-review.googlesource.com/41831 Reviewed-by: David Symonds <dsymonds@golang.org>	2017-04-27 03:48:40 +00:00
Josh Bleecher Snyder	6664ccb453	cmd/compile: compile more complex functions first When using a concurrent backend, the overall compilation time is bounded in part by the slowest function to compile. The number of top-level statements in a function is an easily calculated and fairly reliable proxy for compilation time. Here's a standard compilecmp output for -c=8 with this CL: name old time/op new time/op delta Template 127ms ± 4% 125ms ± 6% -1.33% (p=0.000 n=47+50) Unicode 84.8ms ± 4% 84.5ms ± 4% ~ (p=0.217 n=49+49) GoTypes 289ms ± 3% 287ms ± 3% -0.78% (p=0.002 n=48+50) Compiler 1.36s ± 3% 1.34s ± 2% -1.29% (p=0.000 n=49+47) SSA 2.95s ± 3% 2.77s ± 4% -6.23% (p=0.000 n=50+49) Flate 70.7ms ± 3% 70.9ms ± 2% ~ (p=0.112 n=50+49) GoParser 85.0ms ± 3% 83.0ms ± 4% -2.31% (p=0.000 n=48+49) Reflect 229ms ± 3% 225ms ± 4% -1.83% (p=0.000 n=49+49) Tar 70.2ms ± 3% 69.4ms ± 3% -1.17% (p=0.000 n=49+49) XML 115ms ± 7% 114ms ± 6% ~ (p=0.158 n=49+47) name old user-time/op new user-time/op delta Template 352ms ± 5% 342ms ± 8% -2.74% (p=0.000 n=49+50) Unicode 117ms ± 5% 118ms ± 4% +0.88% (p=0.005 n=46+48) GoTypes 986ms ± 3% 980ms ± 4% ~ (p=0.110 n=46+48) Compiler 4.39s ± 2% 4.43s ± 4% +0.97% (p=0.002 n=50+50) SSA 12.0s ± 2% 13.3s ± 3% +11.33% (p=0.000 n=49+49) Flate 222ms ± 5% 219ms ± 6% -1.56% (p=0.002 n=50+50) GoParser 271ms ± 5% 268ms ± 4% -0.83% (p=0.036 n=49+48) Reflect 560ms ± 4% 571ms ± 3% +1.90% (p=0.000 n=50+49) Tar 183ms ± 3% 183ms ± 3% ~ (p=0.903 n=45+50) XML 364ms ±13% 391ms ± 4% +7.16% (p=0.000 n=50+40) A more interesting way of viewing the data is by looking at the ratio of the time taken to compile the slowest-to-compile function to the overall time spent compiling functions. If this ratio is small (near 0), then increased concurrency might help. If this ratio is big (near 1), then we're bounded by that single function. 
I instrumented the compiler to emit this ratio per-package, ran 'go build -a -gcflags=-c=C -p=P std cmd' three times, for varying values of C and P, and collected the ratios encountered into an ASCII histogram. Here's c=1 p=1, which is a non-concurrent backend, single process at a time: 90%\| 80%\| 70%\| 60%\| 50%\| 40%\| 30%\| 20%\| 10%\|* 0%\|********* ----+---------- \|0123456789 The x-axis is floor(10*ratio), so the first column indicates the percent of ratios that fell in the 0% to 9.9999% range. We can see in this histogram that more concurrency will help; in most cases, the ratio is small. Here's c=8 p=1, before this CL: 90%\| 80%\| 70%\| 60%\| 50%\| 40%\| 30%\| 20%\| * 10%\|* * * 0%\|********** ----+---------- \|0123456789 In 30-40% of cases, we're mostly bound by the compilation time of a single function. Here's c=8 p=1, after this CL: 90%\| 80%\| 70%\| 60%\| 50%\| * 40%\| * 30%\| * 20%\| * 10%\| * 0%\|********** ----+---------- \|0123456789 The sorting pays off; we are bound by the compilation time of a single function in over half of packages. The single * in the histogram indicates 0-10%. The actual values for this chart are: 0: 5%, 1: 1%, 2: 1%, 3: 4%, 4: 5%, 5: 7%, 6: 7%, 7: 7%, 8: 9%, 9: 55% This indicates that efforts to increase or enable more concurrency, e.g. by optimizing mutexes or increasing the value of c, will probably not yield fruit. That matches what compilecmp tells us. Further optimization efforts should thus focus instead on one of: (1) making more functions compile concurrently (2) improving the compilation time of the slowest functions (3) speeding up the remaining serial parts of the compiler (4) automatically splitting up some large autogenerated functions into small ones, as discussed in #19751 I hope to spend more time on (1) before the freeze. Adding process parallelism doesn't change the story much. 
For example, here's c=8 p=8, after this CL: 90%\| 80%\| 70%\| 60%\| 50%\| 40%\| * 30%\| * 20%\| * 10%\| * 0%\|****** ----+---------- \|0123456789 Since we don't need to worry much about p, these histograms can help us select a good general value of c to use as a default, assuming we're not bounded by GOMAXPROCS. Here are some charts after this CL, for c from 1 to 8: c=1 p=1 90%\| 80%\| 70%\| 60%\| 50%\| 40%\| 30%\| 20%\| 10%\|* 0%\|***** ----+---------- \|0123456789 c=2 p=1 90%\| 80%\| 70%\| 60%\| 50%\| 40%\| 30%\| 20%\| 10%\| ** * 0%\|********** ----+---------- \|0123456789 c=3 p=1 90%\| 80%\| 70%\| 60%\| 50%\| 40%\| 30%\| 20%\| * 10%\| ** * * 0%\|********** ----+---------- \|0123456789 c=4 p=1 90%\| 80%\| 70%\| 60%\| 50%\| 40%\| 30%\| * 20%\| * 10%\| * * 0%\|********** ----+---------- \|0123456789 c=5 p=1 90%\| 80%\| 70%\| 60%\| 50%\| 40%\| 30%\| * 20%\| * 10%\| * * 0%\|********** ----+---------- \|0123456789 c=6 p=1 90%\| 80%\| 70%\| 60%\| 50%\| 40%\| * 30%\| * 20%\| * 10%\| * 0%\|********** ----+---------- \|0123456789 c=7 p=1 90%\| 80%\| 70%\| 60%\| 50%\| * 40%\| * 30%\| * 20%\| * 10%\| 0%\|******** ----+---------- \|0123456789 c=8 p=1 90%\| 80%\| 70%\| 60%\| 50%\| * 40%\| * 30%\| * 20%\| * 10%\| * 0%\|********** ----+---------- \|0123456789 Given the increased user-CPU costs as c increases, it looks like c=4 is probably the sweet spot, at least for now. Pleasingly, this matches (and explains) the results of the standard benchmarking that I have done. Updates #15756 Change-Id: I82b606c06efd34a5dbd1afdbcf66a605905b2aeb Reviewed-on: https://go-review.googlesource.com/41192 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-27 01:08:35 +00:00
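The two ideas in 6664ccb453, compiling the largest functions first and bucketing the slowest-to-total ratio, can be sketched as follows. This is an illustration rather than the compiler's code: the fn type and the helpers longestFirst and bucket are hypothetical, and clamping a ratio of exactly 1 into column 9 is an assumption made because the histograms only show columns 0 through 9.

```go
package main

import (
	"fmt"
	"sort"
)

// fn is a hypothetical stand-in for a function queued for backend compilation.
type fn struct {
	name  string
	stmts int // number of top-level statements, used as a proxy for compile time
}

// longestFirst returns a copy of fns ordered by descending statement count,
// so the most expensive functions start compiling earliest.
func longestFirst(fns []fn) []fn {
	out := append([]fn(nil), fns...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].stmts > out[j].stmts })
	return out
}

// bucket maps the slowest/total ratio onto the histogram's x-axis,
// floor(10*ratio), clamped to the last column (an assumption).
func bucket(slowest, total float64) int {
	if total <= 0 {
		return 0
	}
	b := int(10 * slowest / total)
	if b > 9 {
		b = 9
	}
	return b
}

func main() {
	fns := []fn{{"small", 2}, {"huge", 90}, {"medium", 15}}
	for _, f := range longestFirst(fns) {
		fmt.Println(f.name, f.stmts)
	}
	fmt.Println(bucket(1, 2)) // 5
}
```

A package that lands in a high bucket is dominated by one function, so raising -c would not be expected to shorten its wall-clock time much, which is the argument the commit message makes from the histograms.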
Josh Bleecher Snyder	756b9ce3a5	cmd/compile: add initial backend concurrency support This CL adds initial support for concurrent backend compilation. BACKGROUND The compiler currently consists (very roughly) of the following phases: 1. Initialization. 2. Lexing and parsing into the cmd/compile/internal/syntax AST. 3. Translation into the cmd/compile/internal/gc AST. 4. Some gc AST passes: typechecking, escape analysis, inlining, closure handling, expression evaluation ordering (order.go), and some lowering and optimization (walk.go). 5. Translation into the cmd/compile/internal/ssa SSA form. 6. Optimization and lowering of SSA form. 7. Translation from SSA form to assembler instructions. 8. Translation from assembler instructions to machine code. 9. Writing lots of output: machine code, DWARF symbols, type and reflection info, export data. Phase 2 was already concurrent as of Go 1.8. Phase 3 is planned for eventual removal; we hope to go straight from syntax AST to SSA. Phases 5–8 are per-function; this CL adds support for processing multiple functions concurrently. The slowest phases in the compiler are 5 and 6, so this offers the opportunity for some good speed-ups. Unfortunately, it's not quite that straightforward. In the current compiler, the latter parts of phase 4 (order, walk) are done function-at-a-time as needed. Making order and walk concurrency-safe proved hard, and they're not particularly slow, so there wasn't much reward. To enable phases 5–8 to be done concurrently, when concurrent backend compilation is requested, we complete phase 4 for all functions before starting later phases for any functions. Also, in reality, we automatically generate new functions in phase 9, such as method wrappers and equality and hash routines. Those new functions then go through phases 4–8. This CL disables concurrent backend compilation after the first, big, user-provided batch of functions has been compiled. 
This is done to keep things simple, and because the autogenerated functions tend to be small, few, simple, and fast to compile. USAGE Concurrent backend compilation still defaults to off. To set the number of functions that may be backend-compiled concurrently, use the compiler flag -c. In future work, cmd/go will automatically set -c. Furthermore, this CL has been intentionally written so that the c=1 path has no backend concurrency whatsoever, not even spawning any goroutines. This helps ensure that, should problems arise late in the development cycle, we can simply have cmd/go set c=1 always, and revert to the original compiler behavior. MUTEXES Most of the work required to make concurrent backend compilation safe has occurred over the past month. This CL adds a handful of mutexes to get the rest of the way there; they are the mutexes that I didn't see a clean way to avoid. Some of them may still be eliminable in future work. In no particular order: * gc.funcsymsmu. The global funcsyms slice is populated lazily when we need function symbols for closures. This occurs during gc AST to SSA translation. The function funcsym also does a package lookup, which is a source of races on types.Pkg.Syms; funcsymsmu also covers that package lookup. This mutex is low priority: it adds a single global, it is in an infrequently used code path, and it is low contention. Since funcsyms may now be added in any order, we must sort them to preserve reproducible builds. * gc.largeStackFramesMu. We don't discover until after SSA compilation that a function's stack frame is gigantic. Recording that error happens basically never, but it does happen concurrently. Fix with a low priority mutex and sorting. * obj.Link.hashmu. ctxt.hash stores the mapping from types.Syms (compiler symbols) to obj.LSyms (linker symbols). It is accessed fairly heavily through all the phases. This is the only heavily contended mutex. * gc.signatlistmu. 
The global signatlist map is populated with types through several of the concurrent phases, including notably via ngotype during DWARF generation. It is low priority for removal. * gc.typepkgmu. Looking up symbols in the types package happens a fair amount during backend compilation and DWARF generation, particularly via ngotype. This mutex helps us to avoid a broader mutex on types.Pkg.Syms. It has low-to-moderate contention. * types.internedStringsmu. gc AST to SSA conversion and some SSA work introduce new autotmps. Those autotmps have their names interned to reduce allocations. That interning requires protecting types.internedStrings. The autotmp names are heavily re-used, and the mutex overhead and contention here are low, so it is probably a worthwhile performance optimization to keep this mutex. TESTING I have been testing this code locally by running 'go install -race cmd/compile' and then doing 'go build -a -gcflags=-c=128 std cmd' for all architectures and a variety of compiler flags. This obviously needs to be made part of the builders, but it is too expensive to make part of all.bash. I have filed #19962 for this. REPRODUCIBLE BUILDS This version of the compiler generates reproducible builds. Testing reproducible builds also needs automation, however, and is also too expensive for all.bash. This is #19961. Also of note is that some of the compiler flags used by 'toolstash -cmp' are currently incompatible with concurrent backend compilation. They still work fine with c=1. Time will tell whether this is a problem. NEXT STEPS * Continue to find and fix races and bugs, using a combination of code inspection, fuzzing, and hopefully some community experimentation. I do not know of any outstanding races, but there probably are some. * Improve testing. * Improve performance, for many values of c. * Integrate with cmd/go and fine tune. * Support concurrent compilation with the -race flag. It is a sad irony that it does not yet work. 
* Minor code cleanup that has been deferred during the last month due to uncertainty about the ultimate shape of this CL. PERFORMANCE Here's the buried lede, at last. :) All benchmarks are from my 8 core 2.9 GHz Intel Core i7 darwin/amd64 laptop. First, going from tip to this CL with c=1 has almost no impact. name old time/op new time/op delta Template 195ms ± 3% 194ms ± 5% ~ (p=0.370 n=30+29) Unicode 86.6ms ± 3% 87.0ms ± 7% ~ (p=0.958 n=29+30) GoTypes 548ms ± 3% 555ms ± 4% +1.35% (p=0.001 n=30+28) Compiler 2.51s ± 2% 2.54s ± 2% +1.17% (p=0.000 n=28+30) SSA 5.16s ± 3% 5.16s ± 2% ~ (p=0.910 n=30+29) Flate 124ms ± 5% 124ms ± 4% ~ (p=0.947 n=30+30) GoParser 146ms ± 3% 146ms ± 3% ~ (p=0.150 n=29+28) Reflect 354ms ± 3% 352ms ± 4% ~ (p=0.096 n=29+29) Tar 107ms ± 5% 106ms ± 3% ~ (p=0.370 n=30+29) XML 200ms ± 4% 201ms ± 4% ~ (p=0.313 n=29+28) [Geo mean] 332ms 333ms +0.10% name old user-time/op new user-time/op delta Template 227ms ± 5% 225ms ± 5% ~ (p=0.457 n=28+27) Unicode 109ms ± 4% 109ms ± 5% ~ (p=0.758 n=29+29) GoTypes 713ms ± 4% 721ms ± 5% ~ (p=0.051 n=30+29) Compiler 3.36s ± 2% 3.38s ± 3% ~ (p=0.146 n=30+30) SSA 7.46s ± 3% 7.47s ± 3% ~ (p=0.804 n=30+29) Flate 146ms ± 7% 147ms ± 3% ~ (p=0.833 n=29+27) GoParser 179ms ± 5% 179ms ± 5% ~ (p=0.866 n=30+30) Reflect 431ms ± 4% 429ms ± 4% ~ (p=0.593 n=29+30) Tar 124ms ± 5% 123ms ± 5% ~ (p=0.140 n=29+29) XML 243ms ± 4% 242ms ± 7% ~ (p=0.404 n=29+29) [Geo mean] 415ms 415ms +0.02% name old obj-bytes new obj-bytes delta Template 382k ± 0% 382k ± 0% ~ (all equal) Unicode 203k ± 0% 203k ± 0% ~ (all equal) GoTypes 1.18M ± 0% 1.18M ± 0% ~ (all equal) Compiler 3.98M ± 0% 3.98M ± 0% ~ (all equal) SSA 8.28M ± 0% 8.28M ± 0% ~ (all equal) Flate 230k ± 0% 230k ± 0% ~ (all equal) GoParser 287k ± 0% 287k ± 0% ~ (all equal) Reflect 1.00M ± 0% 1.00M ± 0% ~ (all equal) Tar 190k ± 0% 190k ± 0% ~ (all equal) XML 416k ± 0% 416k ± 0% ~ (all equal) [Geo mean] 660k 660k +0.00% Comparing this CL to itself, from c=1 to c=2 improves real times 20-30%, 
costs 5-10% more CPU time, and adds about 2% alloc. The allocation increase comes from allocating more ssa.Caches. name old time/op new time/op delta Template 202ms ± 3% 149ms ± 3% -26.15% (p=0.000 n=49+49) Unicode 87.4ms ± 4% 84.2ms ± 3% -3.68% (p=0.000 n=48+48) GoTypes 560ms ± 2% 398ms ± 2% -28.96% (p=0.000 n=49+49) Compiler 2.46s ± 3% 1.76s ± 2% -28.61% (p=0.000 n=48+46) SSA 6.17s ± 2% 4.04s ± 1% -34.52% (p=0.000 n=49+49) Flate 126ms ± 3% 92ms ± 2% -26.81% (p=0.000 n=49+48) GoParser 148ms ± 4% 107ms ± 2% -27.78% (p=0.000 n=49+48) Reflect 361ms ± 3% 281ms ± 3% -22.10% (p=0.000 n=49+49) Tar 109ms ± 4% 86ms ± 3% -20.81% (p=0.000 n=49+47) XML 204ms ± 3% 144ms ± 2% -29.53% (p=0.000 n=48+45) name old user-time/op new user-time/op delta Template 246ms ± 9% 246ms ± 4% ~ (p=0.401 n=50+48) Unicode 109ms ± 4% 111ms ± 4% +1.47% (p=0.000 n=44+50) GoTypes 728ms ± 3% 765ms ± 3% +5.04% (p=0.000 n=46+50) Compiler 3.33s ± 3% 3.41s ± 2% +2.31% (p=0.000 n=49+48) SSA 8.52s ± 2% 9.11s ± 2% +6.93% (p=0.000 n=49+47) Flate 149ms ± 4% 161ms ± 3% +8.13% (p=0.000 n=50+47) GoParser 181ms ± 5% 192ms ± 2% +6.40% (p=0.000 n=49+46) Reflect 452ms ± 9% 474ms ± 2% +4.99% (p=0.000 n=50+48) Tar 126ms ± 6% 136ms ± 4% +7.95% (p=0.000 n=50+49) XML 247ms ± 5% 264ms ± 3% +6.94% (p=0.000 n=48+50) name old alloc/op new alloc/op delta Template 38.8MB ± 0% 39.3MB ± 0% +1.48% (p=0.008 n=5+5) Unicode 29.8MB ± 0% 30.2MB ± 0% +1.19% (p=0.008 n=5+5) GoTypes 113MB ± 0% 114MB ± 0% +0.69% (p=0.008 n=5+5) Compiler 443MB ± 0% 447MB ± 0% +0.95% (p=0.008 n=5+5) SSA 1.25GB ± 0% 1.26GB ± 0% +0.89% (p=0.008 n=5+5) Flate 25.3MB ± 0% 25.9MB ± 1% +2.35% (p=0.008 n=5+5) GoParser 31.7MB ± 0% 32.2MB ± 0% +1.59% (p=0.008 n=5+5) Reflect 78.2MB ± 0% 78.9MB ± 0% +0.91% (p=0.008 n=5+5) Tar 26.6MB ± 0% 27.0MB ± 0% +1.80% (p=0.008 n=5+5) XML 42.4MB ± 0% 43.4MB ± 0% +2.35% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Template 379k ± 0% 378k ± 0% ~ (p=0.421 n=5+5) Unicode 322k ± 0% 321k ± 0% ~ (p=0.222 n=5+5) GoTypes 1.14M ± 0% 
1.14M ± 0% ~ (p=0.548 n=5+5) Compiler 4.12M ± 0% 4.11M ± 0% -0.14% (p=0.032 n=5+5) SSA 9.72M ± 0% 9.72M ± 0% ~ (p=0.421 n=5+5) Flate 234k ± 1% 234k ± 0% ~ (p=0.421 n=5+5) GoParser 316k ± 1% 315k ± 0% ~ (p=0.222 n=5+5) Reflect 980k ± 0% 979k ± 0% ~ (p=0.095 n=5+5) Tar 249k ± 1% 249k ± 1% ~ (p=0.841 n=5+5) XML 392k ± 0% 391k ± 0% ~ (p=0.095 n=5+5) From c=1 to c=4, real time is down ~40%, CPU usage up 10-20%, alloc up ~5%: name old time/op new time/op delta Template 203ms ± 3% 131ms ± 5% -35.45% (p=0.000 n=50+50) Unicode 87.2ms ± 4% 84.1ms ± 2% -3.61% (p=0.000 n=48+47) GoTypes 560ms ± 4% 310ms ± 2% -44.65% (p=0.000 n=50+49) Compiler 2.47s ± 3% 1.41s ± 2% -43.10% (p=0.000 n=50+46) SSA 6.17s ± 2% 3.20s ± 2% -48.06% (p=0.000 n=49+49) Flate 126ms ± 4% 74ms ± 2% -41.06% (p=0.000 n=49+48) GoParser 148ms ± 4% 89ms ± 3% -39.97% (p=0.000 n=49+50) Reflect 360ms ± 3% 242ms ± 3% -32.81% (p=0.000 n=49+49) Tar 108ms ± 4% 73ms ± 4% -32.48% (p=0.000 n=50+49) XML 203ms ± 3% 119ms ± 3% -41.56% (p=0.000 n=49+48) name old user-time/op new user-time/op delta Template 246ms ± 9% 287ms ± 9% +16.98% (p=0.000 n=50+50) Unicode 109ms ± 4% 118ms ± 5% +7.56% (p=0.000 n=46+50) GoTypes 735ms ± 4% 806ms ± 2% +9.62% (p=0.000 n=50+50) Compiler 3.34s ± 4% 3.56s ± 2% +6.78% (p=0.000 n=49+49) SSA 8.54s ± 3% 10.04s ± 3% +17.55% (p=0.000 n=50+50) Flate 149ms ± 6% 176ms ± 3% +17.82% (p=0.000 n=50+48) GoParser 181ms ± 5% 213ms ± 3% +17.47% (p=0.000 n=50+50) Reflect 453ms ± 6% 499ms ± 2% +10.11% (p=0.000 n=50+48) Tar 126ms ± 5% 149ms ±11% +18.76% (p=0.000 n=50+50) XML 246ms ± 5% 287ms ± 4% +16.53% (p=0.000 n=49+50) name old alloc/op new alloc/op delta Template 38.8MB ± 0% 40.4MB ± 0% +4.21% (p=0.008 n=5+5) Unicode 29.8MB ± 0% 30.9MB ± 0% +3.68% (p=0.008 n=5+5) GoTypes 113MB ± 0% 116MB ± 0% +2.71% (p=0.008 n=5+5) Compiler 443MB ± 0% 455MB ± 0% +2.75% (p=0.008 n=5+5) SSA 1.25GB ± 0% 1.27GB ± 0% +1.84% (p=0.008 n=5+5) Flate 25.3MB ± 0% 26.9MB ± 1% +6.31% (p=0.008 n=5+5) GoParser 31.7MB ± 0% 33.2MB ± 0% +4.61% 
(p=0.008 n=5+5) Reflect 78.2MB ± 0% 80.2MB ± 0% +2.53% (p=0.008 n=5+5) Tar 26.6MB ± 0% 27.9MB ± 0% +5.19% (p=0.008 n=5+5) XML 42.4MB ± 0% 44.6MB ± 0% +5.20% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Template 380k ± 0% 379k ± 0% -0.39% (p=0.032 n=5+5) Unicode 321k ± 0% 321k ± 0% ~ (p=0.841 n=5+5) GoTypes 1.14M ± 0% 1.14M ± 0% ~ (p=0.421 n=5+5) Compiler 4.12M ± 0% 4.14M ± 0% +0.52% (p=0.008 n=5+5) SSA 9.72M ± 0% 9.76M ± 0% +0.37% (p=0.008 n=5+5) Flate 234k ± 1% 234k ± 1% ~ (p=0.690 n=5+5) GoParser 316k ± 0% 317k ± 1% ~ (p=0.841 n=5+5) Reflect 981k ± 0% 981k ± 0% ~ (p=1.000 n=5+5) Tar 250k ± 0% 249k ± 1% ~ (p=0.151 n=5+5) XML 393k ± 0% 392k ± 0% ~ (p=0.056 n=5+5) Going beyond c=4 on my machine tends to increase CPU time and allocs without impacting real time. The CPU time numbers matter, because when there are many concurrent compilation processes, that will impact the overall throughput. The numbers above are in many ways the best case scenario; we can take full advantage of all cores. Fortunately, the most common compilation scenario is incremental re-compilation of a single package during a build/test cycle. Updates #15756 Change-Id: I6725558ca2069edec0ac5b0d1683105a9fff6bea Reviewed-on: https://go-review.googlesource.com/40693 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-04-27 00:59:07 +00:00
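Two properties described in 756b9ce3a5 can be shown in a small sketch: the c=1 path spawns no goroutines, and data collected from concurrent workers is sorted afterwards so the output stays reproducible. This is an illustration under those assumptions, not the compiler's implementation; compileAll and symbolsFor are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// compileAll runs work(i) for every i in [0, n) with at most c calls in flight.
// With c <= 1 it runs serially and spawns no goroutines at all.
func compileAll(n, c int, work func(i int)) {
	if c <= 1 {
		for i := 0; i < n; i++ {
			work(i)
		}
		return
	}
	var wg sync.WaitGroup
	jobs := make(chan int)
	for w := 0; w < c; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				work(i)
			}
		}()
	}
	for i := 0; i < n; i++ {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
}

// symbolsFor mimics a shared list that workers append to in arbitrary order.
// A mutex guards the append, and a final sort restores a deterministic order.
func symbolsFor(names []string, c int) []string {
	var (
		mu   sync.Mutex
		syms []string
	)
	compileAll(len(names), c, func(i int) {
		mu.Lock()
		syms = append(syms, "sym."+names[i])
		mu.Unlock()
	})
	sort.Strings(syms)
	return syms
}

func main() {
	names := []string{"walk", "order", "esc", "ssa"}
	fmt.Println(symbolsFor(names, 1)) // [sym.esc sym.order sym.ssa sym.walk]
	fmt.Println(symbolsFor(names, 4)) // [sym.esc sym.order sym.ssa sym.walk]
}
```

Without the final sort, the c=4 result would depend on goroutine scheduling, which is the reproducibility concern the commit raises for funcsyms.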
Alex Brainman	1989921aef	os: do not report ModeDir for symlinks on windows When using Lstat against symlinks that point to a directory, the function returns FileInfo with both ModeDir and ModeSymlink set. Change that to never set ModeDir if ModeSymlink is set. Fixes #10424 Fixes #17540 Fixes #17541 Change-Id: Iba280888aad108360b8c1f18180a24493fe7ad2b Reviewed-on: https://go-review.googlesource.com/41830 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-04-26 23:17:23 +00:00
Mostyn Bramley-Moore	3d86d45dd6	build: fail nicely if somebody runs all.bash from a binary tarball package Fixes golang/go#20008. Change-Id: I7a429490320595fc558a8c5e260ec41bc3a788e2 Reviewed-on: https://go-review.googlesource.com/41858 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-26 22:57:29 +00:00
Damien Lespiau	92d918da03	cmd/internal/obj/x86: fix adcb r/mem8,reg8 encoding Taken from the Intel Software Development Manual (of course, in the line below it's ADC DST, SRC; The opposite of the commit subject). 12 /r ADC r8, r/m8 We need 0x12 for the corresponding ytab line, not 0x10. {Ymb, Ynone, Yrb, Zm_r, 1}, Updates #14069 Change-Id: Id37cbd0c581c9988c2de355efa908956278e2189 Reviewed-on: https://go-review.googlesource.com/41857 Reviewed-by: Keith Randall <khr@golang.org>	2017-04-26 20:41:12 +00:00
Josh Bleecher Snyder	92607fdd30	cmd/compile: split dumptypestructs further This is preparatory cleanup to make future changes clearer. Change-Id: I20fb9c78257de61b8bd096fce6b1e751995c01f2 Reviewed-on: https://go-review.googlesource.com/41818 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-26 20:16:41 +00:00


32582 Commits