qbit/go

mirror of https://github.com/golang/go synced 2024-11-14 06:50:21 -07:00

Author	SHA1	Message	Date
Keith Randall	a9292b833b	cmd/compile: fix 32-bit unsigned division on 64-bit machines The type of an intermediate multiply was wrong. When that intermediate multiply was spilled, the top 32 bits were lost. Fixes #19153 Change-Id: Ib29350a4351efa405935b7f7ee3c112668e64108 Reviewed-on: https://go-review.googlesource.com/37212 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-02-17 22:21:04 +00:00
Robert Griesemer	4498b68390	math/bits: faster Reverse, ReverseBytes - moved from: x&m>>k | x&^m<<k to: x&m>>k | x<<k&m This permits use of the same constant m twice (*) which may be better for machines that can't use large immediate constants directly with an AND instruction and have to load them explicitly. *) CPUs don't usually have a &^ instruction, so x&^m becomes x&(^m) - simplified returns This improves the generated code because the compiler recognizes x>>k | x<<k as ROT when k is the bitsize of x. The 8-bit versions of these instructions can be significantly faster still if they are replaced with table lookups, as long as the table is in cache. If the table is not in cache, table-lookup is probably slower, hence the choice of an explicit register-only implementation for now. BenchmarkReverse-8 8.50 6.86 -19.29% BenchmarkReverse8-8 2.17 1.74 -19.82% BenchmarkReverse16-8 2.89 2.34 -19.03% BenchmarkReverse32-8 3.55 2.95 -16.90% BenchmarkReverse64-8 6.81 5.57 -18.21% BenchmarkReverseBytes-8 3.49 2.48 -28.94% BenchmarkReverseBytes16-8 0.93 0.62 -33.33% BenchmarkReverseBytes32-8 1.55 1.13 -27.10% BenchmarkReverseBytes64-8 2.47 2.47 +0.00% Reverse-8 8.50ns ± 0% 6.86ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 2.17ns ± 0% 1.74ns ± 0% ~ (p=1.000 n=1+1) Reverse16-8 2.89ns ± 0% 2.34ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 3.55ns ± 0% 2.95ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 6.81ns ± 0% 5.57ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes-8 3.49ns ± 0% 2.48ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 0.93ns ± 0% 0.62ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 1.55ns ± 0% 1.13ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 2.47ns ± 0% 2.47ns ± 0% ~ (all samples are equal) Change-Id: I0064de8c7e0e568ca7885d6f7064344bef91a06d Reviewed-on: https://go-review.googlesource.com/37215 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-17 22:20:28 +00:00
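The two mask-and-shift forms named in this commit message can be compared directly. The following is a minimal sketch for the 8-bit case, assuming high-bit masks 0xaa, 0xcc and 0xf0; the function and constant names are hypothetical and this is not the package's source. It checks both forms against `bits.Reverse8` from the standard library.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Masks selecting the upper bit, upper pair, and upper nibble of each group.
// These values are an assumption made for this sketch.
const (
	m0 = 0xaa // 10101010
	m1 = 0xcc // 11001100
	m2 = 0xf0 // 11110000
)

// reverse8Old uses the earlier form: x&m>>k | x&^m<<k.
func reverse8Old(x uint8) uint8 {
	x = x&m0>>1 | x&^m0<<1
	x = x&m1>>2 | x&^m1<<2
	x = x&m2>>4 | x&^m2<<4
	return x
}

// reverse8New uses the later form: x&m>>k | x<<k&m,
// which needs only the constant m, never its complement.
func reverse8New(x uint8) uint8 {
	x = x&m0>>1 | x<<1&m0
	x = x&m1>>2 | x<<2&m1
	x = x&m2>>4 | x<<4&m2
	return x
}

// formsAgree reports whether both forms match bits.Reverse8 for every input.
func formsAgree() bool {
	for i := 0; i < 256; i++ {
		x := uint8(i)
		want := bits.Reverse8(x)
		if reverse8Old(x) != want || reverse8New(x) != want {
			return false
		}
	}
	return true
}

func main() {
	fmt.Printf("%08b -> %08b\n", 0x01, reverse8New(0x01))
	fmt.Println("forms agree:", formsAgree())
}
```

In Go, `&`, `&^`, `<<` and `>>` share one precedence level and associate left to right, so `x&m0>>1` parses as `(x&m0)>>1` and `x<<1&m0` as `(x<<1)&m0`.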
Matthew Dempsky	c61cf5e6b7	cmd/compile/internal/gc: remove Node.IsStatic field We can immediately emit static assignment data rather than queueing them up to be processed during SSA building. Passes toolstash -cmp. Change-Id: I8bcea4b72eafb0cc0b849cd93e9cde9d84f30d5e Reviewed-on: https://go-review.googlesource.com/37024 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-17 22:06:52 +00:00
Cherry Zhang	3557d54609	cmd/compile: check both syms when folding address into load/store on ARM64 The rules for folding addresses into load/stores checks sym1 is not on stack (because the stack offset is not known at that point). But sym1 could be nil, which invalidates the check. Check merged sym instead. Fixes #19137. Change-Id: I8574da22ced1216bb5850403d8f08ec60a8d1005 Reviewed-on: https://go-review.googlesource.com/37145 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-02-17 21:23:24 +00:00
Robert Griesemer	3a239a6ae4	math/bits: fix benchmarks (make sure calls don't get optimized away) Sum up function results and store them in an exported (global) variable. This prevents the compiler from optimizing away the otherwise side-effect free function calls. We now have a more realistic set of benchmark numbers... Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3. Note: These measurements are based on the same "old" implementation as the prior measurements (commit `7d5c003`). benchmark old ns/op new ns/op delta BenchmarkReverse-8 72.9 8.50 -88.34% BenchmarkReverse8-8 13.2 2.17 -83.56% BenchmarkReverse16-8 21.2 2.89 -86.37% BenchmarkReverse32-8 36.3 3.55 -90.22% BenchmarkReverse64-8 71.3 6.81 -90.45% BenchmarkReverseBytes-8 11.2 3.49 -68.84% BenchmarkReverseBytes16-8 6.24 0.93 -85.10% BenchmarkReverseBytes32-8 7.40 1.55 -79.05% BenchmarkReverseBytes64-8 10.5 2.47 -76.48% Reverse-8 72.9ns ± 0% 8.5ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 13.2ns ± 0% 2.2ns ± 0% ~ (p=1.000 n=1+1) Reverse16-8 21.2ns ± 0% 2.9ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 36.3ns ± 0% 3.5ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 71.3ns ± 0% 6.8ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes-8 11.2ns ± 0% 3.5ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 6.24ns ± 0% 0.93ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 7.40ns ± 0% 1.55ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 10.5ns ± 0% 2.5ns ± 0% ~ (p=1.000 n=1+1) Change-Id: I8aef1334b84f6cafd25edccad7e6868b37969efb Reviewed-on: https://go-review.googlesource.com/37213 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-17 20:58:12 +00:00
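The benchmark fix described in this commit message, summing results into an exported global so the calls stay live, can be sketched as follows. This is a minimal illustration under assumptions: the names `Output` and `BenchmarkReverse8` are chosen for the example, and `testing.Benchmark` is used so the sketch runs outside `go test`.

```go
package main

import (
	"fmt"
	"math/bits"
	"testing"
)

// Output is an exported package-level variable. Storing the accumulated
// result here gives the benchmarked calls an observable effect, so the
// compiler cannot discard them as dead code.
var Output int

// BenchmarkReverse8 sums the results of a side-effect-free function
// and publishes the sum once the loop is done.
func BenchmarkReverse8(b *testing.B) {
	var s int
	for i := 0; i < b.N; i++ {
		s += int(bits.Reverse8(uint8(i)))
	}
	Output = s
}

func main() {
	// testing.Benchmark runs a single benchmark function without `go test`.
	r := testing.Benchmark(BenchmarkReverse8)
	fmt.Println(r)
}
```

Without the store to `Output`, a compiler is free to drop the whole loop body, which is one plausible reading of the sub-nanosecond figures reported by the two earlier benchmark commits in this log.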
Robert Griesemer	ddb15cea4a	math/bits: much faster ReverseBytes, added respective benchmarks Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3. benchmark old ns/op new ns/op delta BenchmarkReverseBytes-8 11.4 3.51 -69.21% BenchmarkReverseBytes16-8 6.87 0.64 -90.68% BenchmarkReverseBytes32-8 7.79 0.65 -91.66% BenchmarkReverseBytes64-8 11.6 0.64 -94.48% name old time/op new time/op delta ReverseBytes-8 11.4ns ± 0% 3.5ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 6.87ns ± 0% 0.64ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 7.79ns ± 0% 0.65ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 11.6ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Change-Id: I67b529652b3b613c61687e9e185e8d4ee40c51a2 Reviewed-on: https://go-review.googlesource.com/37211 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-17 19:38:26 +00:00
Robert Griesemer	7d5c003a3a	math/bits: much faster Reverse, added respective benchmarks Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3. name old time/op new time/op delta Reverse-8 76.6ns ± 0% 8.1ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 12.6ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Reverse16-8 20.8ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 36.5ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 74.0ns ± 0% 6.4ns ± 0% ~ (p=1.000 n=1+1) benchmark old ns/op new ns/op delta BenchmarkReverse-8 76.6 8.07 -89.46% BenchmarkReverse8-8 12.6 0.64 -94.92% BenchmarkReverse16-8 20.8 0.64 -96.92% BenchmarkReverse32-8 36.5 0.64 -98.25% BenchmarkReverse64-8 74.0 6.38 -91.38% Change-Id: I6b99b10cee2f2babfe79342b50ee36a45a34da30 Reviewed-on: https://go-review.googlesource.com/37149 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-17 19:38:13 +00:00
Cherry Zhang	c4b8dadb40	cmd/compile: fix some types in SSA These seem not to really matter, but good to be correct. Change-Id: I02edb9797c3d6739725cfbe4723c75f151acd05e Reviewed-on: https://go-review.googlesource.com/36837 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-17 19:20:46 +00:00
Cherry Zhang	c4ef597c47	cmd/compile: redo writebarrier pass SSA's writebarrier pass requires WB store ops are always at the end of a block. If we move write barrier insertion into SSA and emits normal Store ops when building SSA, this requirement becomes impractical -- it will create too many blocks for all the Store ops. Redo SSA's writebarrier pass, explicitly order values in store order, so it no longer needs this requirement. Updates #17583. Fixes #19067. Change-Id: I66e817e526affb7e13517d4245905300a90b7170 Reviewed-on: https://go-review.googlesource.com/36834 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-02-17 19:20:25 +00:00
Cherry Zhang	98061fa5f3	cmd/compile: re-enable nilcheck removal in same block Nil check removal in the same block is disabled due to issue 18725: because the values are not ordered, a nilcheck may influence a value that is logically before it. This CL re-enables same-block nilcheck removal by ordering values in store order first. Updates #18725. Change-Id: I287a38525230c14c5412cbcdbc422547dabd54f6 Reviewed-on: https://go-review.googlesource.com/35496 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-02-17 19:19:59 +00:00
Robert Griesemer	81acd308a4	math/bits: expand doc strings for all functions Follow-up on https://go-review.googlesource.com/36315. No functionality change. For #18616. Change-Id: Id4df34dd7d0381be06eea483a11bf92f4a01f604 Reviewed-on: https://go-review.googlesource.com/37140 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-17 19:02:56 +00:00
Koki Ide	045ad5bab8	all: fix a few typos in comments Change-Id: I0455ffaa51c661803d8013c7961910f920d3c3cc Reviewed-on: https://go-review.googlesource.com/37043 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-17 18:15:41 +00:00
Dmitry Vyukov	0556e26273	sync: make Mutex more fair Add new starvation mode for Mutex. In starvation mode ownership is directly handed off from unlocking goroutine to the next waiter. New arriving goroutines don't compete for ownership. Unfair wait time is now limited to 1ms. Also fix a long standing bug that goroutines were requeued at the tail of the wait queue. That lead to even more unfair acquisition times with multiple waiters. Performance of normal mode is not considerably affected. Fixes #13086 On the provided in the issue lockskew program: done in 1.207853ms done in 1.177451ms done in 1.184168ms done in 1.198633ms done in 1.185797ms done in 1.182502ms done in 1.316485ms done in 1.211611ms done in 1.182418ms name old time/op new time/op delta MutexUncontended-48 0.65ns ± 0% 0.65ns ± 1% ~ (p=0.087 n=10+10) Mutex-48 112ns ± 1% 114ns ± 1% +1.69% (p=0.000 n=10+10) MutexSlack-48 113ns ± 0% 87ns ± 1% -22.65% (p=0.000 n=8+10) MutexWork-48 149ns ± 0% 145ns ± 0% -2.48% (p=0.000 n=9+10) MutexWorkSlack-48 149ns ± 0% 122ns ± 3% -18.26% (p=0.000 n=6+10) MutexNoSpin-48 103ns ± 4% 105ns ± 3% ~ (p=0.089 n=10+10) MutexSpin-48 490ns ± 4% 515ns ± 6% +5.08% (p=0.006 n=10+10) Cond32-48 13.4µs ± 6% 13.1µs ± 5% -2.75% (p=0.023 n=10+10) RWMutexWrite100-48 53.2ns ± 3% 41.2ns ± 3% -22.57% (p=0.000 n=10+10) RWMutexWrite10-48 45.9ns ± 2% 43.9ns ± 2% -4.38% (p=0.000 n=10+10) RWMutexWorkWrite100-48 122ns ± 2% 134ns ± 1% +9.92% (p=0.000 n=10+10) RWMutexWorkWrite10-48 206ns ± 1% 188ns ± 1% -8.52% (p=0.000 n=8+10) Cond32-24 12.1µs ± 3% 12.4µs ± 3% +1.98% (p=0.043 n=10+9) MutexUncontended-24 0.74ns ± 1% 0.75ns ± 1% ~ (p=0.650 n=10+10) Mutex-24 122ns ± 2% 124ns ± 1% +1.31% (p=0.007 n=10+10) MutexSlack-24 96.9ns ± 2% 102.8ns ± 2% +6.11% (p=0.000 n=10+10) MutexWork-24 146ns ± 1% 135ns ± 2% -7.70% (p=0.000 n=10+9) MutexWorkSlack-24 135ns ± 1% 128ns ± 2% -5.01% (p=0.000 n=10+9) MutexNoSpin-24 114ns ± 3% 110ns ± 4% -3.84% (p=0.000 n=10+10) MutexSpin-24 482ns ± 4% 475ns ± 8% ~ (p=0.286 n=10+10) 
RWMutexWrite100-24 43.0ns ± 3% 43.1ns ± 2% ~ (p=0.956 n=10+10) RWMutexWrite10-24 43.4ns ± 1% 43.2ns ± 1% ~ (p=0.085 n=10+9) RWMutexWorkWrite100-24 130ns ± 3% 131ns ± 3% ~ (p=0.747 n=10+10) RWMutexWorkWrite10-24 191ns ± 1% 192ns ± 1% ~ (p=0.210 n=10+10) Cond32-12 11.5µs ± 2% 11.7µs ± 2% +1.98% (p=0.002 n=10+10) MutexUncontended-12 1.48ns ± 0% 1.50ns ± 1% +1.08% (p=0.004 n=10+10) Mutex-12 141ns ± 1% 143ns ± 1% +1.63% (p=0.000 n=10+10) MutexSlack-12 121ns ± 0% 119ns ± 0% -1.65% (p=0.001 n=8+9) MutexWork-12 141ns ± 2% 150ns ± 3% +6.36% (p=0.000 n=9+10) MutexWorkSlack-12 131ns ± 0% 138ns ± 0% +5.73% (p=0.000 n=9+10) MutexNoSpin-12 87.0ns ± 1% 83.7ns ± 1% -3.80% (p=0.000 n=10+10) MutexSpin-12 364ns ± 1% 377ns ± 1% +3.77% (p=0.000 n=10+10) RWMutexWrite100-12 42.8ns ± 1% 43.9ns ± 1% +2.41% (p=0.000 n=8+10) RWMutexWrite10-12 39.8ns ± 4% 39.3ns ± 1% ~ (p=0.433 n=10+9) RWMutexWorkWrite100-12 131ns ± 1% 131ns ± 0% ~ (p=0.591 n=10+9) RWMutexWorkWrite10-12 173ns ± 1% 174ns ± 0% ~ (p=0.059 n=10+8) Cond32-6 10.9µs ± 2% 10.9µs ± 2% ~ (p=0.739 n=10+10) MutexUncontended-6 2.97ns ± 0% 2.97ns ± 0% ~ (all samples are equal) Mutex-6 122ns ± 6% 122ns ± 2% ~ (p=0.668 n=10+10) MutexSlack-6 149ns ± 3% 142ns ± 3% -4.63% (p=0.000 n=10+10) MutexWork-6 136ns ± 3% 140ns ± 5% ~ (p=0.077 n=10+10) MutexWorkSlack-6 152ns ± 0% 138ns ± 2% -9.21% (p=0.000 n=6+10) MutexNoSpin-6 150ns ± 1% 152ns ± 0% +1.50% (p=0.000 n=8+10) MutexSpin-6 726ns ± 0% 730ns ± 1% ~ (p=0.069 n=10+10) RWMutexWrite100-6 40.6ns ± 1% 40.9ns ± 1% +0.91% (p=0.001 n=8+10) RWMutexWrite10-6 37.1ns ± 0% 37.0ns ± 1% ~ (p=0.386 n=9+10) RWMutexWorkWrite100-6 133ns ± 1% 134ns ± 1% +1.01% (p=0.005 n=9+10) RWMutexWorkWrite10-6 152ns ± 0% 152ns ± 0% ~ (all samples are equal) Cond32-2 7.86µs ± 2% 7.95µs ± 2% +1.10% (p=0.023 n=10+10) MutexUncontended-2 8.10ns ± 0% 9.11ns ± 4% +12.44% (p=0.000 n=9+10) Mutex-2 32.9ns ± 9% 38.4ns ± 6% +16.58% (p=0.000 n=10+10) MutexSlack-2 93.4ns ± 1% 98.5ns ± 2% +5.39% (p=0.000 n=10+9) MutexWork-2 40.8ns ± 3% 43.8ns 
± 7% +7.38% (p=0.000 n=10+9) MutexWorkSlack-2 98.6ns ± 5% 108.2ns ± 2% +9.80% (p=0.000 n=10+8) MutexNoSpin-2 399ns ± 1% 398ns ± 2% ~ (p=0.463 n=8+9) MutexSpin-2 1.99µs ± 3% 1.97µs ± 1% -0.81% (p=0.003 n=9+8) RWMutexWrite100-2 37.6ns ± 5% 46.0ns ± 4% +22.17% (p=0.000 n=10+8) RWMutexWrite10-2 50.1ns ± 6% 36.8ns ±12% -26.46% (p=0.000 n=9+10) RWMutexWorkWrite100-2 136ns ± 0% 134ns ± 2% -1.80% (p=0.001 n=7+9) RWMutexWorkWrite10-2 140ns ± 1% 138ns ± 1% -1.50% (p=0.000 n=10+10) Cond32 5.93µs ± 1% 5.91µs ± 0% ~ (p=0.411 n=9+10) MutexUncontended 15.9ns ± 0% 15.8ns ± 0% -0.63% (p=0.000 n=8+8) Mutex 15.9ns ± 0% 15.8ns ± 0% -0.44% (p=0.003 n=10+10) MutexSlack 26.9ns ± 3% 26.7ns ± 2% ~ (p=0.084 n=10+10) MutexWork 47.8ns ± 0% 47.9ns ± 0% +0.21% (p=0.014 n=9+8) MutexWorkSlack 54.9ns ± 3% 54.5ns ± 3% ~ (p=0.254 n=10+10) MutexNoSpin 786ns ± 2% 765ns ± 1% -2.66% (p=0.000 n=10+10) MutexSpin 3.87µs ± 1% 3.83µs ± 0% -0.85% (p=0.005 n=9+8) RWMutexWrite100 21.2ns ± 2% 21.0ns ± 1% -0.88% (p=0.018 n=10+9) RWMutexWrite10 22.6ns ± 1% 22.6ns ± 0% ~ (p=0.471 n=9+9) RWMutexWorkWrite100 132ns ± 0% 132ns ± 0% ~ (all samples are equal) RWMutexWorkWrite10 124ns ± 0% 123ns ± 0% ~ (p=0.656 n=10+10) Change-Id: I66412a3a0980df1233ad7a5a0cd9723b4274528b Reviewed-on: https://go-review.googlesource.com/34310 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-02-17 17:24:59 +00:00
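The unfairness this commit addresses shows up when one goroutine re-acquires a mutex in a tight loop while another asks for it only occasionally. Here is a minimal sketch in the spirit of the lockskew program the message mentions. It is not that program: the function name `contend`, the 100µs sleeps and the round count are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// contend starts a greedy goroutine that holds the mutex most of the time,
// then acquires the same mutex `rounds` times from the calling goroutine.
// It returns how many acquisitions the caller completed.
func contend(rounds int) int {
	var mu sync.Mutex
	done := make(chan struct{})

	// Greedy goroutine: lock, hold briefly, unlock, and immediately retry.
	go func() {
		for {
			select {
			case <-done:
				return
			default:
				mu.Lock()
				time.Sleep(100 * time.Microsecond)
				mu.Unlock()
			}
		}
	}()

	acquired := 0
	for i := 0; i < rounds; i++ {
		time.Sleep(100 * time.Microsecond)
		mu.Lock() // competes with the greedy goroutine
		acquired++
		mu.Unlock()
	}
	close(done)
	return acquired
}

func main() {
	start := time.Now()
	n := contend(10)
	fmt.Printf("acquired %d times, done in %v\n", n, time.Since(start))
}
```

The elapsed time depends on the Go version and the machine, so no figure is given here. The commit message states that with starvation mode the unfair wait is limited to 1ms.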
Wander Lairson Costa	79f6a5c7bd	syscall: only call setgroups if we need to If the caller sets up a Credential in os/exec.Command, os/exec.Command.Start will end up calling setgroups(2), even if no supplementary groups were given. Only root can call setgroups(2) on BSD kernels, which causes Start to fail for non-root users when they try to set uid and gid for the new process. We fix this by introducing a new field to syscall.Credential named NoSetGroups, and setgroups(2) is only called if it is false. We make this field with inverted logic to preserve backward compatibility. RELNOTES=yes Change-Id: I3cff1f21c117a1430834f640ef21fd4e87e06804 Reviewed-on: https://go-review.googlesource.com/36697 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-02-17 14:36:27 +00:00
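The `NoSetGroups` field named in this commit message is set through `syscall.SysProcAttr`. The following is a minimal sketch for Unix-like systems only; the helper name `commandAs` and the choice of the `id` command are assumptions for illustration.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// commandAs builds a command that runs with the given uid and gid.
// NoSetGroups: true asks the runtime to skip the setgroups(2) call.
func commandAs(uid, gid uint32, name string, args ...string) *exec.Cmd {
	cmd := exec.Command(name, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Credential: &syscall.Credential{
			Uid:         uid,
			Gid:         gid,
			NoSetGroups: true,
		},
	}
	return cmd
}

func main() {
	// Use the current uid and gid so the sketch does not require root.
	cmd := commandAs(uint32(os.Getuid()), uint32(os.Getgid()), "id")
	out, err := cmd.CombinedOutput()
	if err != nil {
		fmt.Println("command failed:", err)
		return
	}
	fmt.Printf("%s", out)
}
```

Because the zero value of `NoSetGroups` is false, existing callers that never set the field keep the old behavior, which is the backward-compatibility point the message makes about inverted logic.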
Keith Randall	708ba22a0c	cmd/compile: move constant divide strength reduction to SSA rules Currently the conversion from constant divides to multiplies is mostly done during the walk pass. This is suboptimal because SSA can determine that the value being divided by is constant more often (e.g. after inlining). Change-Id: If1a9b993edd71be37396b9167f77da271966f85f Reviewed-on: https://go-review.googlesource.com/37015 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-17 06:16:44 +00:00
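Converting a constant divide into a multiply can be shown for a single divisor. This is a minimal sketch, assuming the widely used magic constant 0xAAAAAAAB for unsigned 32-bit division by 3. The function names are hypothetical and the code is not the compiler's rewrite rule. The second function models what happens when the intermediate product is kept at 32 bits and its upper half is discarded.

```go
package main

import "fmt"

// magic3 is ceil(2^33 / 3); multiplying by it and shifting right by 33
// yields x/3 for every uint32 x.
const magic3 = 0xAAAAAAAB

// div3 computes x/3 with a 64-bit intermediate multiply and a shift.
func div3(x uint32) uint32 {
	return uint32((uint64(x) * magic3) >> 33)
}

// div3Truncated keeps only the low 32 bits of the product before shifting,
// so the bits that carry the quotient are gone.
func div3Truncated(x uint32) uint32 {
	lo := uint32(uint64(x) * magic3) // upper 32 bits discarded here
	return uint32(uint64(lo) >> 33)
}

func main() {
	for _, x := range []uint32{0, 1, 2, 3, 10, 1000000, 4294967295} {
		fmt.Println(x, x/3, div3(x), div3Truncated(x))
	}
}
```

The quotient lives entirely in the upper bits of the 64-bit product, which is why the width of that intermediate value matters.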
Matthew Dempsky	794f1ebff7	cmd/compile: simplify needwritebarrier Currently, whether we need a write barrier is simply a property of the pointer slot being written to. The only optimization we currently apply using the value being written is that pointers to stack variables can omit write barriers because they're only written to stack slots... but we already omit write barriers for all writes to the stack anyway. Passes toolstash -cmp. Change-Id: I7f16b71ff473899ed96706232d371d5b2b7ae789 Reviewed-on: https://go-review.googlesource.com/37109 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-16 22:42:36 +00:00
Shenghou Ma	211102c85f	math: fix typos in Bessel function docs While we're at it, also document Yn(0, 0) = -Inf for completeness. Fixes #18823. Change-Id: Ib6db68f76d29cc2373c12ebdf3fab129cac8c167 Reviewed-on: https://go-review.googlesource.com/35970 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-16 22:41:34 +00:00
Robert Griesemer	661e2179e5	math/bits: added package for bit-level counting and manipulation Initial platform-independent implementation. For #18616. Change-Id: I4585c55b963101af9059c06c1b8a866cb384754c Reviewed-on: https://go-review.googlesource.com/36315 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-02-16 21:54:59 +00:00
Robert Griesemer	1693e7b6f2	cmd/compile/internal/syntax: better errors and recovery for invalid character literals Fixes #15611. Change-Id: I352b145026466cafef8cf87addafbd30716bda24 Reviewed-on: https://go-review.googlesource.com/37138 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-16 21:46:43 +00:00
Russ Cox	990124da2a	runtime: use balanced tree for addr lookup in semaphore implementation CL 36792 fixed #17953, a linear scan caused by n goroutines piling into two different locks that hashed to the same bucket in the semaphore table. In that CL, n goroutines contending for 2 unfortunately chosen locks went from O(n²) to O(n). This CL fixes a different linear scan, when n goroutines are contending for n/2 different locks that all hash to the same bucket in the semaphore table. In this CL, n goroutines contending for n/2 unfortunately chosen locks goes from O(n²) to O(n log n). This case is much less likely, but any linear scan eventually hurts, so we might as well fix it while the problem is fresh in our minds. The new test in this CL checks for both linear scans. The effect of this CL on the sync benchmarks is negligible (but it fixes the new test). name old time/op new time/op delta Cond1-48 576ns ±10% 575ns ±13% ~ (p=0.679 n=71+71) Cond2-48 1.59µs ± 8% 1.61µs ± 9% ~ (p=0.107 n=73+69) Cond4-48 4.56µs ± 7% 4.55µs ± 7% ~ (p=0.670 n=74+72) Cond8-48 9.87µs ± 9% 9.90µs ± 7% ~ (p=0.507 n=69+73) Cond16-48 20.4µs ± 7% 20.4µs ±10% ~ (p=0.588 n=69+71) Cond32-48 45.4µs ±10% 45.4µs ±14% ~ (p=0.944 n=73+73) UncontendedSemaphore-48 19.7ns ±12% 19.7ns ± 8% ~ (p=0.589 n=65+63) ContendedSemaphore-48 55.4ns ±26% 54.9ns ±32% ~ (p=0.441 n=75+75) MutexUncontended-48 0.63ns ± 0% 0.63ns ± 0% ~ (all equal) Mutex-48 210ns ± 6% 213ns ±10% +1.30% (p=0.035 n=70+74) MutexSlack-48 210ns ± 7% 211ns ± 9% ~ (p=0.184 n=71+72) MutexWork-48 299ns ± 5% 300ns ± 5% ~ (p=0.678 n=73+75) MutexWorkSlack-48 302ns ± 6% 300ns ± 5% ~ (p=0.149 n=74+72) MutexNoSpin-48 135ns ± 6% 135ns ±10% ~ (p=0.788 n=67+75) MutexSpin-48 693ns ± 5% 689ns ± 6% ~ (p=0.092 n=65+74) Once-48 0.22ns ±25% 0.22ns ±24% ~ (p=0.882 n=74+73) Pool-48 5.88ns ±36% 5.79ns ±24% ~ (p=0.655 n=69+69) PoolOverflow-48 4.79µs ±18% 4.87µs ±20% ~ (p=0.233 n=75+75) SemaUncontended-48 0.80ns ± 1% 0.82ns ± 8% +2.46% (p=0.000 n=60+74) 
SemaSyntNonblock-48 103ns ± 4% 102ns ± 5% -1.11% (p=0.003 n=75+75) SemaSyntBlock-48 104ns ± 4% 104ns ± 5% ~ (p=0.231 n=71+75) SemaWorkNonblock-48 128ns ± 4% 129ns ± 6% +1.51% (p=0.000 n=63+75) SemaWorkBlock-48 129ns ± 8% 130ns ± 7% ~ (p=0.072 n=75+74) RWMutexUncontended-48 2.35ns ± 1% 2.35ns ± 0% ~ (p=0.144 n=70+55) RWMutexWrite100-48 139ns ±18% 141ns ±21% ~ (p=0.071 n=75+73) RWMutexWrite10-48 145ns ± 9% 145ns ± 8% ~ (p=0.553 n=75+75) RWMutexWorkWrite100-48 297ns ±13% 297ns ±15% ~ (p=0.519 n=75+74) RWMutexWorkWrite10-48 588ns ± 7% 585ns ± 5% ~ (p=0.173 n=73+70) WaitGroupUncontended-48 0.87ns ± 0% 0.87ns ± 0% ~ (all equal) WaitGroupAddDone-48 63.2ns ± 4% 62.7ns ± 4% -0.82% (p=0.027 n=72+75) WaitGroupAddDoneWork-48 109ns ± 5% 109ns ± 4% ~ (p=0.233 n=75+75) WaitGroupWait-48 0.17ns ± 0% 0.16ns ±16% -8.55% (p=0.000 n=56+75) WaitGroupWaitWork-48 1.78ns ± 1% 2.08ns ± 5% +16.92% (p=0.000 n=74+70) WaitGroupActuallyWait-48 52.0ns ± 3% 50.6ns ± 5% -2.70% (p=0.000 n=71+69) https://perf.golang.org/search?q=upload:20170215.1 Change-Id: Ia29a8bd006c089e401ec4297c3038cca656bcd0a Reviewed-on: https://go-review.googlesource.com/37103 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-16 17:52:15 +00:00
Matthew Dempsky	fc456c7f7b	cmd/compile/internal/gc: drop unused src.XPos params in SSA builder Passes toolstash -cmp. Change-Id: I037278404ebf762482557e2b6867cbc595074a83 Reviewed-on: https://go-review.googlesource.com/37023 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-02-16 17:34:39 +00:00
Russ Cox	58d762176a	runtime: run mutexevent profiling without holding semaRoot lock Suggested by Dmitry in CL 36792 review. Clearly safe since there are many different semaRoots that could all have profiled sudogs calling mutexevent. Change-Id: I45eed47a5be3e513b2dad63b60afcd94800e16d1 Reviewed-on: https://go-review.googlesource.com/37104 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2017-02-16 17:16:41 +00:00
Russ Cox	83f95b85de	sync: deflake TestWaitGroupMisuse2 Also runs 100X faster on average, because it takes so many fewer attempts to trigger the failure. Fixes #11443. Change-Id: I8c39ee48bb3ff6c36fa63083e04076771b65a80d Reviewed-on: https://go-review.googlesource.com/36841 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2017-02-16 16:55:54 +00:00
Chris Broadfoot	863035efce	doc: document go1.8 Change-Id: Ie2144d001c6b4b2293d07b2acf62d7e3cd0b46a7 Reviewed-on: https://go-review.googlesource.com/37130 Reviewed-by: Russ Cox <rsc@golang.org>	2017-02-16 16:36:59 +00:00
Alex Brainman	0ad247c6f0	cmd/link: delay calculating pe file parameters after Linkmode is set For #10776. Change-Id: Id64a7e35c7cdcd9be16cbe3358402fa379090e36 Reviewed-on: https://go-review.googlesource.com/36975 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-16 04:35:36 +00:00
Alex Brainman	e31144f128	cmd/link: set pe section and file alignment to 0 during external linking This is what gcc does when it generates object files. And it is easier to count everything, when it starts from 0. Make go linker do the same. gcc also does not output IMAGE_OPTIONAL_HEADER or PE64_IMAGE_OPTIONAL_HEADER for object files. Perhaps we should do the same, but not in this CL. For #10776. Change-Id: I9789c337648623b6cfaa7d18d1ac9cef32e180dc Reviewed-on: https://go-review.googlesource.com/36974 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-02-16 04:33:17 +00:00
Alex Brainman	64c02460d7	debug/pe: add test to check dwarf info For #10776. Change-Id: I7931558257c1f6b895e4d44b46d320a54de0d677 Reviewed-on: https://go-review.googlesource.com/36973 Run-TryBot: Alex Brainman <alex.brainman@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-02-16 00:05:51 +00:00
Matthew Dempsky	a6b3331236	cmd/compile/internal/gc: skip useless loads for non-SSA params Change-Id: I78ca43a0f0a6a162a2ade1352e2facb29432d4ac Reviewed-on: https://go-review.googlesource.com/37102 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-15 23:12:43 +00:00
Matthew Dempsky	862fde81fc	cmd/compile/internal/gc: document (*state).checkgoto No behavior change. Change-Id: I595c15ee976adf21bdbabdf24edf203c9e446185 Reviewed-on: https://go-review.googlesource.com/36958 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-15 22:59:55 +00:00
Ian Lance Taylor	45a5f79c24	internal/poll: define PollDescriptor on plan9 Fixes #19114. Change-Id: I352add53d6ee8bf78792564225099f8537ac6b46 Reviewed-on: https://go-review.googlesource.com/37106 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: David du Colombier <0intro@gmail.com>	2017-02-15 22:43:19 +00:00
Sarah Adams	025dfb130a	doc: update Code of Conduct wording and scope This change removes the punitive language and anonymous reporting mechanism from the Code of Conduct document. Read on for the rationale. More than a year has passed since the Go Code of Conduct was introduced. In that time, there have been a small number (<30) of reports to the Working Group. Some reports we handled well, with positive outcomes for all involved. A few reports we handled badly, resulting in hurt feelings and a bad experience for all involved. On reflection, the reports that had positive outcomes were ones where the Working Group took the role of advisor/facilitator, listening to complaints and providing suggestions and advice to the parties involved. The reports that had negative outcomes were ones where the subject of the report felt threatened by the Working Group and Code of Conduct. After some discussion among the Working Group, we saw that we are most effective as facilitators, rather than disciplinarians. The various Go spaces already have moderators; this change to the CoC acknowledges their authority and places the group in a purely advisory role. If an incident is reported to the group we may provide information to or make a suggestion to the moderators, but the Working Group need not (and should not) have any authority to take disciplinary action. In short, we want it to be clear that the Working Group are here to help resolve conflict, period. The second change made here is the removal of the anonymous reporting mechanism. To date, the quality of anonymous reports has been low, and with no way to reach out to the reporter for more information there is often very little we can do in response. Removing this one-way reporting mechanism strengthens the message that the Working Group are here to facilitate a constructive dialogue. 
Change-Id: Iee52aff5446accd0dae0c937bb3aa89709ad5fb4 Reviewed-on: https://go-review.googlesource.com/37014 Reviewed-by: Andrew Gerrand <adg@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-02-15 21:42:39 +00:00
Ian Lance Taylor	ae1d05981f	os: skip TestPipeThreads on Solaris I don't know why it is not working. Filed issue 19111 for this. Fixes build. Update #19111. Change-Id: I76f8d6aafba5951da2f3ad7d10960419cca7dd1f Reviewed-on: https://go-review.googlesource.com/37092 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-15 21:27:59 +00:00
Ian Lance Taylor	0fe62e7575	os: skip TestPipeThreads on Plan 9 It can't work since Plan 9 does not support the runtime poller. Fixes build. Change-Id: I9ec33eb66019d9364c6ff6519b61b32e59498559 Reviewed-on: https://go-review.googlesource.com/37091 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-15 21:27:12 +00:00
Russ Cox	1f77db94f8	runtime: do not call wakep from enlistWorker, to avoid possible deadlock We have seen one instance of a production job suddenly spinning to 100% CPU and becoming unresponsive. In that one instance, a SIGQUIT was sent after 328 minutes of spinning, and the stacks showed a single goroutine in "IO wait (scan)" state. Looking for things that might get stuck if a goroutine got stuck in scanning a stack, we found that injectglist does: lock(&sched.lock) var n int for n = 0; glist != nil; n++ { gp := glist glist = gp.schedlink.ptr() casgstatus(gp, _Gwaiting, _Grunnable) globrunqput(gp) } unlock(&sched.lock) and that casgstatus spins on gp.atomicstatus until the _Gscan bit goes away. Essentially, this code locks sched.lock and then while holding sched.lock, waits to lock gp.atomicstatus. The code that is doing the scan is: if castogscanstatus(gp, s, s|_Gscan) { if !gp.gcscandone { scanstack(gp, gcw) gp.gcscandone = true } restartg(gp) break loop } More analysis showed that scanstack can, in a rare case, end up calling back into code that acquires sched.lock. For example: runtime.scanstack at proc.go:866 calls runtime.gentraceback at mgcmark.go:842 calls runtime.scanstack$1 at traceback.go:378 calls runtime.scanframeworker at mgcmark.go:819 calls runtime.scanblock at mgcmark.go:904 calls runtime.greyobject at mgcmark.go:1221 calls (runtime.gcWork).put at mgcmark.go:1412 calls (runtime.gcControllerState).enlistWorker at mgcwork.go:127 calls runtime.wakep at mgc.go:632 calls runtime.startm at proc.go:1779 acquires runtime.sched.lock at proc.go:1675 This path was found with an automated deadlock-detecting tool. There are many such paths but they all go through enlistWorker -> wakep. The evidence strongly suggests that one of these paths is what caused the deadlock we observed. We're running those jobs with GOTRACEBACK=crash now to try to get more information if it happens again. 
Further refinement and analysis shows that if we drop the wakep call from enlistWorker, the remaining few deadlock cycles found by the tool are all false positives caused by not understanding the effect of calls to func variables. The enlistWorker -> wakep call was intended only as a performance optimization, it rarely executes, and if it does execute at just the wrong time it can (and plausibly did) cause the deadlock we saw. Comment it out, to avoid the potential deadlock. Fixes #19112. Unfixes #14179. Change-Id: I6f7e10b890b991c11e79fab7aeefaf70b5d5a07b Reviewed-on: https://go-review.googlesource.com/37093 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-02-15 21:22:36 +00:00
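The cycle described in this commit message is a lock-order inversion: one path holds sched.lock and waits for the _Gscan bit to clear, while the scanning path holds that bit and reaches code that wants sched.lock. Below is a minimal model of that shape, assuming two ordinary mutexes stand in for the runtime's lock and status bit. The names are hypothetical, nothing here is runtime code, and `TryLock` requires Go 1.18 or later.

```go
package main

import (
	"fmt"
	"sync"
)

// wouldDeadlock models the two resources as mutexes and reports whether
// each side would block on the resource the other side already holds.
func wouldDeadlock() bool {
	var schedLock sync.Mutex // stands in for sched.lock
	var scanBit sync.Mutex   // stands in for the _Gscan bit in gp.atomicstatus

	schedLock.Lock() // the injectglist-like path takes sched.lock first
	scanBit.Lock()   // the scanner sets the scan bit first

	// The injectglist-like path now needs the scan bit to be clear.
	injectBlocked := !scanBit.TryLock()
	// The scanner, through the enlistWorker -> wakep path, now needs sched.lock.
	scannerBlocked := !schedLock.TryLock()

	scanBit.Unlock()
	schedLock.Unlock()
	return injectBlocked && scannerBlocked
}

func main() {
	fmt.Println("cycle present:", wouldDeadlock())
}
```

Dropping the wakep call removes the scanner's need for sched.lock, which breaks the cycle from one side.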
Hana Kim	8833af3f4b	runtime/pprof: print newly added fields of runtime.MemStats in heap profile with debug mode Change-Id: I3a80d03a4aa556614626067a8fd698b3b00f4290 Reviewed-on: https://go-review.googlesource.com/36962 Reviewed-by: Austin Clements <austin@google.com>	2017-02-15 21:14:37 +00:00
Heschi Kreinick	35a95df571	cmd/compile/internal/ssa: display NamedValues in SSA html output. Change-Id: If268b42b32e6bcd6e7913bffa6e493dc78af40aa Reviewed-on: https://go-review.googlesource.com/36539 TryBot-Result: Gobot Gobot <gobot@golang.org> Run-TryBot: Heschi Kreinick <heschi@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-15 21:11:57 +00:00
Lynn Boger	2ac32b6360	cmd/go: improve stale reason for packages This adds more information to the pkg stale reason for debugging purposes. Change-Id: I7b626db4520baa1127195ae859f4da9b49304636 Reviewed-on: https://go-review.googlesource.com/36944 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-15 21:02:28 +00:00
Ian Lance Taylor	c05b06a12d	os: use poller for file I/O This changes the os package to use the runtime poller for file I/O where possible. When a system call blocks on a pollable descriptor, the goroutine will be blocked on the poller but the thread will be released to run other goroutines. When using a non-pollable descriptor, the os package will continue to use thread-blocking system calls as before. For example, on GNU/Linux, the runtime poller uses epoll. epoll does not support ordinary disk files, so they will continue to use blocking I/O as before. The poller will be used for pipes. Since this means that the poller is used for many more programs, this modifies the runtime to only block waiting for the poller if there is some goroutine that is waiting on the poller. Otherwise, there is no point, as the poller will never make any goroutine ready. This preserves the runtime's current simple deadlock detection. This seems to crash FreeBSD systems, so it is disabled on FreeBSD. This is issue 19093. Using the poller on Windows requires opening the file with FILE_FLAG_OVERLAPPED. We should only do that if we can remove that flag if the program calls the Fd method. This is issue 19098. Update #6817. Update #7903. Update #15021. Update #18507. Update #19093. Update #19098. Change-Id: Ia5197dcefa7c6fbcca97d19a6f8621b2abcbb1fe Reviewed-on: https://go-review.googlesource.com/36800 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-02-15 19:31:55 +00:00
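Pipes are the case this commit message singles out as pollable. The following is a minimal sketch in which several goroutines block reading from pipes and are then released by writes. The function name `fanOut` and the pipe count are assumptions for illustration. Whether blocked reads occupy threads depends on the platform and Go version, so the sketch only shows the program shape.

```go
package main

import (
	"fmt"
	"os"
	"sync"
	"sync/atomic"
)

// fanOut opens n pipes, parks one reading goroutine on each, writes one
// byte to every pipe, and returns how many readers received their byte.
func fanOut(n int) (int, error) {
	type pipe struct{ r, w *os.File }
	pipes := make([]pipe, 0, n)
	for i := 0; i < n; i++ {
		r, w, err := os.Pipe()
		if err != nil {
			for _, p := range pipes { // close what was opened so far
				p.r.Close()
				p.w.Close()
			}
			return 0, err
		}
		pipes = append(pipes, pipe{r, w})
	}

	var wg sync.WaitGroup
	var got int64
	for _, p := range pipes {
		wg.Add(1)
		go func(r *os.File) {
			defer wg.Done()
			defer r.Close()
			buf := make([]byte, 1)
			// Blocks until the writer side sends a byte or closes.
			if k, _ := r.Read(buf); k == 1 {
				atomic.AddInt64(&got, 1)
			}
		}(p.r)
	}

	for _, p := range pipes {
		p.w.Write([]byte{'x'})
		p.w.Close()
	}
	wg.Wait()
	return int(got), nil
}

func main() {
	n, err := fanOut(20)
	fmt.Println(n, err)
}
```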
Dave Cheney	81ec3f6a6c	internal/poll: remove unused poll.pollDesc methods Change-Id: Ic2b20c8238ff0ca5513d32e54ef2945fa4d0c3d2 Reviewed-on: https://go-review.googlesource.com/37033 Run-TryBot: Dave Cheney <dave@cheney.net> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-02-15 18:39:43 +00:00
Marcel van Lohuizen	79fab70a63	testing: fix stats bug for sub benchmarks Fixes golang/go#18815. Change-Id: Ic9d5cb640a555c58baedd597ed4ca5dd9f275c97 Reviewed-on: https://go-review.googlesource.com/36990 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-02-15 09:26:33 +00:00
Robert Griesemer	d390283ff4	cmd/compile/internal/syntax: compiler directives must start at beginning of line - ignore them, if they don't. - added tests Fixes #18393. Change-Id: I13f87b81ac6b9138ab5031bb3dd6bebc4c548156 Reviewed-on: https://go-review.googlesource.com/37020 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-15 06:49:21 +00:00
Alex Brainman	a8dc43edd1	internal/testenv: do not delete target file We did not create it. We should not delete it. Change-Id: If98454ab233ce25367e11a7c68d31b49074537dd Reviewed-on: https://go-review.googlesource.com/37030 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-15 06:03:15 +00:00
Robert Griesemer	2770c507a5	cmd/compile: fix position for "missing type in composite literal" error Fixes #18231. Change-Id: If1615da4db0e6f0516369a1dc37340d80c78f237 Reviewed-on: https://go-review.googlesource.com/37018 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-15 01:33:44 +00:00
Robert Griesemer	5267ac2732	cmd/compile/internal/syntax: establish principled position information Until now, the parser set the position for each Node to the position of the first token belonging to that node. For compatibility with the now defunct gc parser, in many places that position information was modified when the gcCompat flag was set (which it was, by default). Furthermore, in some places, position information was not set at all. This change removes the gcCompat flag and all associated code, and sets position information for all nodes in a more principled way, as proposed by mdempsky (see #16943 for details). Specifically, the position of a node may not be at the very beginning of the respective production. For instance for an Operation `a + b`, the position associated with the node is the position of the `+`. Thus, for `a + b + c` we now get different positions for the two additions. This change does not pass toolstash -cmp because position information recorded in export data and pcline tables is different. There are no other functional changes. Added test suite testing the position of all nodes. Fixes #16943. Change-Id: I3fc02bf096bc3b3d7d2fa655dfd4714a1a0eb90c Reviewed-on: https://go-review.googlesource.com/37017 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-15 01:33:03 +00:00
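The idea of attaching an operation's position to its operator can be tried with the public `go/ast` package, whose `BinaryExpr` records the operator position in `OpPos`. This is an analogy only: the commit changes the compiler-internal `cmd/compile/internal/syntax` package, not `go/ast`, and the helper `opColumns` is a hypothetical name introduced for the sketch.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// opColumns (hypothetical helper) parses a Go expression and returns the
// column of the operator of every binary expression it contains, in the
// order ast.Inspect visits them (outermost first).
func opColumns(src string) ([]int, error) {
	fset := token.NewFileSet()
	expr, err := parser.ParseExprFrom(fset, "expr.go", src, 0)
	if err != nil {
		return nil, err
	}
	var cols []int
	ast.Inspect(expr, func(n ast.Node) bool {
		if b, ok := n.(*ast.BinaryExpr); ok {
			cols = append(cols, fset.Position(b.OpPos).Column)
		}
		return true
	})
	return cols, nil
}

func main() {
	cols, err := opColumns("a + b + c")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// The two additions report different operator columns.
	fmt.Println(cols)
}
```

If the position were taken from the first token instead, both additions in `a + b + c` would start at `a`, which is the ambiguity the commit message describes.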
Daniel Martí	6910756f9b	math/big: simplify bool expression Change-Id: I280c53be455f2fe0474ad577c0f7b7908a4eccb2 Reviewed-on: https://go-review.googlesource.com/36993 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-14 23:34:25 +00:00
Russ Cox	72aa757ddd	encoding/xml: fix incorrect indirect code in chardata, comment, innerxml fields The new tests in this CL have been checked against Go 1.7 as well and all pass in Go 1.7, with the one exception noted in a comment (an intentional change to omitempty already present before this CL). CL 15684 made the intentional change to omitempty. This CL fixes bugs introduced along the way. Most of these are corner cases that are arguably not that important, but they've always worked all the way back to Go 1, and someone cared enough to file #19063. The most significant problem found while adding tests is that in the case of a nil *string field with `xml:",chardata"`, the existing code silently stops processing not just that field but the entire remainder of the struct. Even if #19063 were not worth fixing, this chardata bug would be. Fixes #19063. Change-Id: I318cf8f9945e1a4615982d9904e109fde577ebf9 Reviewed-on: https://go-review.googlesource.com/36954 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-14 23:23:40 +00:00
Bryan C. Mills	eebd8f51e8	mime: add benchmarks for TypeByExtension and ExtensionsByType These are possible use-cases for sync.Map. Updates golang/go#18177 Change-Id: I5e2a3d1249967c37d3f89a41122bf4a90522db11 Reviewed-on: https://go-review.googlesource.com/36964 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-02-14 23:02:07 +00:00
Kirill Smelkov	4477fd097f	cmd/compile/internal/ssa: combine 2 byte loads + shifts into word load + rolw 8 on AMD64 ... and same for stores. This does for binary.BigEndian.Uint16() what was already done for Uint32 and Uint64 with BSWAP in `10f75748` (CL 32222). Here is how generated code changes e.g. for the following function (omitting saying the same prologue/epilogue): func get16(b [2]byte) uint16 { return binary.BigEndian.Uint16(b[:]) } "".get16 t=1 size=21 args=0x10 locals=0x0 // before 0x0000 00000 (x.go:15) MOVBLZX "".b+9(FP), AX 0x0005 00005 (x.go:15) MOVBLZX "".b+8(FP), CX 0x000a 00010 (x.go:15) SHLL $8, CX 0x000d 00013 (x.go:15) ORL CX, AX // after 0x0000 00000 (x.go:15) MOVWLZX "".b+8(FP), AX 0x0005 00005 (x.go:15) ROLW $8, AX encoding/binary is speedup overall a bit: name old time/op new time/op delta ReadSlice1000Int32s-4 4.83µs ± 0% 4.83µs ± 0% ~ (p=0.206 n=4+5) ReadStruct-4 1.29µs ± 2% 1.28µs ± 1% -1.27% (p=0.032 n=4+5) ReadInts-4 384ns ± 1% 385ns ± 1% ~ (p=0.968 n=4+5) WriteInts-4 534ns ± 3% 526ns ± 0% -1.54% (p=0.048 n=4+5) WriteSlice1000Int32s-4 5.02µs ± 0% 5.11µs ± 3% ~ (p=0.175 n=4+5) PutUint16-4 0.59ns ± 0% 0.49ns ± 2% -16.95% (p=0.016 n=4+5) PutUint32-4 0.52ns ± 0% 0.52ns ± 0% ~ (all equal) PutUint64-4 0.53ns ± 0% 0.53ns ± 0% ~ (all equal) PutUvarint32-4 19.9ns ± 0% 19.9ns ± 1% ~ (p=0.556 n=4+5) PutUvarint64-4 54.5ns ± 1% 54.2ns ± 0% ~ (p=0.333 n=4+5) name old speed new speed delta ReadSlice1000Int32s-4 829MB/s ± 0% 828MB/s ± 0% ~ (p=0.190 n=4+5) ReadStruct-4 58.0MB/s ± 2% 58.7MB/s ± 1% +1.30% (p=0.032 n=4+5) ReadInts-4 78.0MB/s ± 1% 77.8MB/s ± 1% ~ (p=0.968 n=4+5) WriteInts-4 56.1MB/s ± 3% 57.0MB/s ± 0% ~ (p=0.063 n=4+5) WriteSlice1000Int32s-4 797MB/s ± 0% 783MB/s ± 3% ~ (p=0.190 n=4+5) PutUint16-4 3.37GB/s ± 0% 4.07GB/s ± 2% +20.83% (p=0.016 n=4+5) PutUint32-4 7.73GB/s ± 0% 7.72GB/s ± 0% ~ (p=0.556 n=4+5) PutUint64-4 15.1GB/s ± 0% 15.1GB/s ± 0% ~ (p=0.905 n=4+5) PutUvarint32-4 201MB/s ± 0% 201MB/s ± 0% ~ (p=0.905 n=4+5) PutUvarint64-4 147MB/s ± 1% 147MB/s ± 0% ~ (p=0.286 n=4+5) ( "a bit" only because most of the time is spent in reflection-like things there, not actual bytes decoding. Even for direct PutUint16 benchmark the looping adds overhead and lowers visible benefit. For code-generated encoders / decoders actual effect is more than 20% ) Adding Uint32 and Uint64 raw benchmarks too for completeness. NOTE I had to adjust load-combining rule for bswap case to match first 2 bytes loads as result of "2-bytes load+shift" -> "loadw + rorw 8" rewrite. Reason is: for loads+shift, even e.g. into uint16 var var b []byte var v uint16 v = uint16(b[1]) \| uint16(b[0])<<8 the compiler eventually generates L(ong) shift - SHLLconst [8], probably because it is more straightforward / other reasons to work on the whole register. This way 2 bytes rewriting rule is using SHLLconst (not SHLWconst) in its pattern, and then it always gets matched first, even if 2-byte rule comes syntactically after 4-byte rule in AMD64.rules because 4-bytes rule seemingly needs more applyRewrite() cycles to trigger. If 2-bytes rule gets matched for inner half of var b []byte var v uint32 v = uint32(b[3]) \| uint32(b[2])<<8 \| uint32(b[1])<<16 \| uint32(b[0])<<24 and we keep 4-byte load rule unchanged, the result will be MOVW + RORW $8 and then series of byte loads and shifts - not one MOVL + BSWAPL. There is no such problem for stores: there compiler, since it probably knows store destination is 2 bytes wide, uses SHRWconst 8 (not SHRLconst 8) and thus 2-byte store rule is not a subset of rule for 4-byte stores. Fixes #17151 (int16 was last missing piece there) Change-Id: Idc03ba965bfce2b94fef456b02ff6742194748f6 Reviewed-on: https://go-review.googlesource.com/34636 Reviewed-by: Ilya Tocar <ilya.tocar@intel.com> Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-14 22:17:08 +00:00
Bryan C. Mills	7ffdb75775	expvar: add benchmarks for steady-state Map Add calls Add a benchmark for setting a String value, which we may want to treat differently from Int or Float due to the need to support Add methods for the latter. Update tests to use only the exported API instead of making (fragile) assumptions about unexported fields. The existing Map benchmarks construct a new Map for each iteration, which focuses the benchmark results on the initial allocation costs for the Map and its entries. This change adds variants of the benchmarks which use a long-lived map in order to measure steady-state performance for Map updates on existing keys. Updates #18177 Change-Id: I62c920991d17d5898c592446af382cd5c04c528a Reviewed-on: https://go-review.googlesource.com/36959 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-02-14 22:11:35 +00:00
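The distinction drawn above between per-iteration maps and a long-lived map can be sketched with the exported `expvar` API alone. The function names `addFresh` and `addSteady` are hypothetical and are not the benchmark names used in the commit. A real measurement would wrap these loops in `testing.B` benchmarks.

```go
package main

import (
	"expvar"
	"fmt"
)

// addFresh builds a new Map on every iteration, so each Add also pays for
// allocating the map and its first entry.
func addFresh(n int) {
	for i := 0; i < n; i++ {
		m := new(expvar.Map).Init()
		m.Add("red", 1)
	}
}

// addSteady reuses one long-lived Map, so after the first call every Add
// updates an existing key. It returns the map for inspection.
func addSteady(n int) *expvar.Map {
	m := new(expvar.Map).Init()
	for i := 0; i < n; i++ {
		m.Add("red", 1)
	}
	return m
}

func main() {
	addFresh(3)
	m := addSteady(3)
	fmt.Println(m.Get("red").String())
}
```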
Michael Munday	d2fea0447f	math/big: fix s390x test build tags The tests failed to compile when using the math_big_pure_go tag on s390x. Change-Id: I2a09f53ff6562ab9bc9b886cffc0f6205bbfcfbb Reviewed-on: https://go-review.googlesource.com/36956 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-14 19:44:35 +00:00

31516 Commits