1
0
mirror of https://github.com/golang/go synced 2024-11-18 10:54:40 -07:00
Commit Graph

23195 Commits

Author SHA1 Message Date
Russ Cox
653d56075d cmd/internal/gc: inline writeBarrierEnabled check before calling writebarrierptr
I believe the benchmarks that get slower are under register pressure,
and not making the call unconditionally makes the pressure worse,
and the register allocator doesn't do a great job. But part of the point
of this sequence is to get the write barriers out of the way so I can work
on the register allocator, so that's okay.

name                                       old                     new          delta
BenchmarkBinaryTree17              17.9s × (1.00,1.01)     18.0s × (0.99,1.01)  ~
BenchmarkFannkuch11                4.43s × (1.00,1.00)     4.43s × (1.00,1.00)  ~
BenchmarkFmtFprintfEmpty           110ns × (1.00,1.06)     114ns × (0.95,1.05)  ~
BenchmarkFmtFprintfString          487ns × (0.99,1.00)     468ns × (0.99,1.01)  -4.00%
BenchmarkFmtFprintfInt             450ns × (0.99,1.00)     433ns × (1.00,1.01)  -3.88%
BenchmarkFmtFprintfIntInt          762ns × (1.00,1.00)     748ns × (0.99,1.01)  -1.84%
BenchmarkFmtFprintfPrefixedInt     584ns × (0.99,1.01)     547ns × (0.99,1.01)  -6.26%
BenchmarkFmtFprintfFloat           738ns × (1.00,1.00)     756ns × (1.00,1.01)  +2.37%
BenchmarkFmtManyArgs              2.80µs × (1.00,1.01)    2.79µs × (1.00,1.01)  ~
BenchmarkGobDecode                39.0ms × (0.99,1.00)    39.6ms × (0.99,1.00)  +1.54%
BenchmarkGobEncode                37.8ms × (0.98,1.01)    37.6ms × (1.00,1.01)  ~
BenchmarkGzip                      661ms × (0.99,1.01)     663ms × (0.99,1.02)  ~
BenchmarkGunzip                    142ms × (1.00,1.00)     142ms × (1.00,1.00)  ~
BenchmarkHTTPClientServer          132µs × (0.99,1.01)     132µs × (0.99,1.01)  ~
BenchmarkJSONEncode               56.3ms × (0.99,1.01)    56.2ms × (0.99,1.01)  ~
BenchmarkJSONDecode                138ms × (0.99,1.01)     138ms × (1.00,1.00)  ~
BenchmarkMandelbrot200            6.01ms × (1.00,1.00)    6.03ms × (1.00,1.01)  +0.23%
BenchmarkGoParse                  10.2ms × (0.87,1.05)     9.8ms × (0.93,1.10)  ~
BenchmarkRegexpMatchEasy0_32       208ns × (1.00,1.00)     207ns × (1.00,1.00)  ~
BenchmarkRegexpMatchEasy0_1K       588ns × (1.00,1.00)     581ns × (1.00,1.01)  -1.27%
BenchmarkRegexpMatchEasy1_32       182ns × (0.99,1.01)     185ns × (0.99,1.01)  +1.65%
BenchmarkRegexpMatchEasy1_1K       986ns × (1.00,1.01)     975ns × (1.00,1.01)  -1.17%
BenchmarkRegexpMatchMedium_32      323ns × (1.00,1.01)     328ns × (0.99,1.00)  +1.55%
BenchmarkRegexpMatchMedium_1K     89.9µs × (1.00,1.00)    88.6µs × (1.00,1.01)  -1.38%
BenchmarkRegexpMatchHard_32       4.72µs × (0.95,1.01)    4.69µs × (0.95,1.03)  ~
BenchmarkRegexpMatchHard_1K        133µs × (1.00,1.01)     133µs × (1.00,1.01)  ~
BenchmarkRevcomp                   900ms × (1.00,1.05)     902ms × (0.99,1.05)  ~
BenchmarkTemplate                  168ms × (0.99,1.01)     174ms × (0.99,1.01)  +3.30%
BenchmarkTimeParse                 637ns × (1.00,1.00)     639ns × (1.00,1.00)  +0.31%
BenchmarkTimeFormat                738ns × (1.00,1.00)     736ns × (1.00,1.01)  ~

Change-Id: I03ce152852edec404538f6c20eb650fac82e2aa2
Reviewed-on: https://go-review.googlesource.com/9224
Reviewed-by: Austin Clements <austin@google.com>
2015-04-28 01:38:06 +00:00
Russ Cox
32d6fbcb4f runtime: replace needwb() with writeBarrierEnabled
Reduce the write barrier check to a single load and compare
so that it can be inlined into write barrier use sites.
Makes the standard write barrier a little faster too.

name                                       old                     new          delta
BenchmarkBinaryTree17              17.9s × (0.99,1.01)     17.9s × (1.00,1.01)  ~
BenchmarkFannkuch11                4.35s × (1.00,1.00)     4.43s × (1.00,1.00)  +1.81%
BenchmarkFmtFprintfEmpty           120ns × (0.93,1.06)     110ns × (1.00,1.06)  -7.92%
BenchmarkFmtFprintfString          479ns × (0.99,1.00)     487ns × (0.99,1.00)  +1.67%
BenchmarkFmtFprintfInt             452ns × (0.99,1.02)     450ns × (0.99,1.00)  ~
BenchmarkFmtFprintfIntInt          766ns × (0.99,1.01)     762ns × (1.00,1.00)  ~
BenchmarkFmtFprintfPrefixedInt     576ns × (0.98,1.01)     584ns × (0.99,1.01)  ~
BenchmarkFmtFprintfFloat           730ns × (1.00,1.01)     738ns × (1.00,1.00)  +1.16%
BenchmarkFmtManyArgs              2.84µs × (0.99,1.00)    2.80µs × (1.00,1.01)  -1.22%
BenchmarkGobDecode                39.3ms × (0.98,1.01)    39.0ms × (0.99,1.00)  ~
BenchmarkGobEncode                39.5ms × (0.99,1.01)    37.8ms × (0.98,1.01)  -4.33%
BenchmarkGzip                      663ms × (1.00,1.01)     661ms × (0.99,1.01)  ~
BenchmarkGunzip                    143ms × (1.00,1.00)     142ms × (1.00,1.00)  ~
BenchmarkHTTPClientServer          132µs × (0.99,1.01)     132µs × (0.99,1.01)  ~
BenchmarkJSONEncode               57.4ms × (0.99,1.01)    56.3ms × (0.99,1.01)  -1.96%
BenchmarkJSONDecode                139ms × (0.99,1.00)     138ms × (0.99,1.01)  ~
BenchmarkMandelbrot200            6.03ms × (1.00,1.00)    6.01ms × (1.00,1.00)  ~
BenchmarkGoParse                  10.3ms × (0.89,1.14)    10.2ms × (0.87,1.05)  ~
BenchmarkRegexpMatchEasy0_32       209ns × (1.00,1.00)     208ns × (1.00,1.00)  ~
BenchmarkRegexpMatchEasy0_1K       591ns × (0.99,1.00)     588ns × (1.00,1.00)  ~
BenchmarkRegexpMatchEasy1_32       184ns × (0.99,1.02)     182ns × (0.99,1.01)  ~
BenchmarkRegexpMatchEasy1_1K      1.01µs × (1.00,1.00)    0.99µs × (1.00,1.01)  -2.33%
BenchmarkRegexpMatchMedium_32      330ns × (1.00,1.00)     323ns × (1.00,1.01)  -2.12%
BenchmarkRegexpMatchMedium_1K     92.6µs × (1.00,1.00)    89.9µs × (1.00,1.00)  -2.92%
BenchmarkRegexpMatchHard_32       4.80µs × (0.95,1.00)    4.72µs × (0.95,1.01)  ~
BenchmarkRegexpMatchHard_1K        136µs × (1.00,1.00)     133µs × (1.00,1.01)  -1.86%
BenchmarkRevcomp                   900ms × (0.99,1.04)     900ms × (1.00,1.05)  ~
BenchmarkTemplate                  172ms × (1.00,1.00)     168ms × (0.99,1.01)  -2.07%
BenchmarkTimeParse                 637ns × (1.00,1.00)     637ns × (1.00,1.00)  ~
BenchmarkTimeFormat                744ns × (1.00,1.01)     738ns × (1.00,1.00)  -0.67%

Change-Id: I4ecc925805da1f5ee264377f1f7574f54ee575e7
Reviewed-on: https://go-review.googlesource.com/9321
Reviewed-by: Austin Clements <austin@google.com>
2015-04-28 01:37:53 +00:00
Russ Cox
2050f57141 runtime: change unused argument in fat write barriers from pointer to scalar
The argument is unused, only present for alignment of the
following argument. The compiler today always passes a zero
but I'd rather not write anything there during the call sequence,
so mark it as a scalar so the garbage collector won't look at it.

As expected, no significant performance change.

name                                       old                     new          delta
BenchmarkBinaryTree17              17.9s × (0.99,1.00)     17.9s × (0.99,1.01)  ~
BenchmarkFannkuch11                4.35s × (1.00,1.00)     4.35s × (1.00,1.00)  ~
BenchmarkFmtFprintfEmpty           120ns × (0.94,1.05)     120ns × (0.93,1.06)  ~
BenchmarkFmtFprintfString          477ns × (1.00,1.00)     479ns × (0.99,1.00)  ~
BenchmarkFmtFprintfInt             450ns × (0.99,1.01)     452ns × (0.99,1.02)  ~
BenchmarkFmtFprintfIntInt          765ns × (0.99,1.01)     766ns × (0.99,1.01)  ~
BenchmarkFmtFprintfPrefixedInt     569ns × (0.99,1.01)     576ns × (0.98,1.01)  ~
BenchmarkFmtFprintfFloat           728ns × (1.00,1.00)     730ns × (1.00,1.01)  ~
BenchmarkFmtManyArgs              2.82µs × (0.99,1.01)    2.84µs × (0.99,1.00)  ~
BenchmarkGobDecode                39.1ms × (0.99,1.01)    39.3ms × (0.98,1.01)  ~
BenchmarkGobEncode                39.4ms × (0.99,1.01)    39.5ms × (0.99,1.01)  ~
BenchmarkGzip                      661ms × (0.99,1.01)     663ms × (1.00,1.01)  ~
BenchmarkGunzip                    143ms × (1.00,1.00)     143ms × (1.00,1.00)  ~
BenchmarkHTTPClientServer          133µs × (0.99,1.01)     132µs × (0.99,1.01)  ~
BenchmarkJSONEncode               57.3ms × (0.99,1.04)    57.4ms × (0.99,1.01)  ~
BenchmarkJSONDecode                139ms × (0.99,1.00)     139ms × (0.99,1.00)  ~
BenchmarkMandelbrot200            6.02ms × (1.00,1.00)    6.03ms × (1.00,1.00)  ~
BenchmarkGoParse                  9.72ms × (0.92,1.11)   10.31ms × (0.89,1.14)  ~
BenchmarkRegexpMatchEasy0_32       209ns × (1.00,1.01)     209ns × (1.00,1.00)  ~
BenchmarkRegexpMatchEasy0_1K       592ns × (0.99,1.00)     591ns × (0.99,1.00)  ~
BenchmarkRegexpMatchEasy1_32       183ns × (0.98,1.01)     184ns × (0.99,1.02)  ~
BenchmarkRegexpMatchEasy1_1K      1.01µs × (1.00,1.01)    1.01µs × (1.00,1.00)  ~
BenchmarkRegexpMatchMedium_32      330ns × (1.00,1.00)     330ns × (1.00,1.00)  ~
BenchmarkRegexpMatchMedium_1K     92.4µs × (1.00,1.00)    92.6µs × (1.00,1.00)  ~
BenchmarkRegexpMatchHard_32       4.77µs × (0.95,1.01)    4.80µs × (0.95,1.00)  ~
BenchmarkRegexpMatchHard_1K        136µs × (1.00,1.00)     136µs × (1.00,1.00)  ~
BenchmarkRevcomp                   906ms × (0.99,1.05)     900ms × (0.99,1.04)  ~
BenchmarkTemplate                  171ms × (0.99,1.01)     172ms × (1.00,1.00)  ~
BenchmarkTimeParse                 638ns × (1.00,1.00)     637ns × (1.00,1.00)  ~
BenchmarkTimeFormat                745ns × (0.99,1.02)     744ns × (1.00,1.01)  ~

Change-Id: I0aeac5dc7adfd75e2223e3aabfedc7818d339f9b
Reviewed-on: https://go-review.googlesource.com/9320
Reviewed-by: Austin Clements <austin@google.com>
2015-04-28 01:37:45 +00:00
Russ Cox
6c328efc4c cmd/internal/gc: accept comma-separated list of name=value for -d
This should obviously have no performance impact.
Listing numbers just as a sanity check for the benchmark
comparison program: it should (and does) find nothing
to report.

name                                       old                     new          delta
BenchmarkBinaryTree17              18.0s × (0.99,1.01)     17.9s × (0.99,1.00)  ~
BenchmarkFannkuch11                4.36s × (1.00,1.00)     4.35s × (1.00,1.00)  ~
BenchmarkFmtFprintfEmpty           120ns × (0.99,1.06)     120ns × (0.94,1.05)  ~
BenchmarkFmtFprintfString          480ns × (0.99,1.01)     477ns × (1.00,1.00)  ~
BenchmarkFmtFprintfInt             451ns × (0.99,1.01)     450ns × (0.99,1.01)  ~
BenchmarkFmtFprintfIntInt          766ns × (0.99,1.01)     765ns × (0.99,1.01)  ~
BenchmarkFmtFprintfPrefixedInt     569ns × (0.99,1.01)     569ns × (0.99,1.01)  ~
BenchmarkFmtFprintfFloat           728ns × (1.00,1.01)     728ns × (1.00,1.00)  ~
BenchmarkFmtManyArgs              2.81µs × (1.00,1.01)    2.82µs × (0.99,1.01)  ~
BenchmarkGobDecode                39.4ms × (0.99,1.01)    39.1ms × (0.99,1.01)  ~
BenchmarkGobEncode                39.4ms × (0.99,1.00)    39.4ms × (0.99,1.01)  ~
BenchmarkGzip                      660ms × (1.00,1.01)     661ms × (0.99,1.01)  ~
BenchmarkGunzip                    143ms × (1.00,1.00)     143ms × (1.00,1.00)  ~
BenchmarkHTTPClientServer          132µs × (0.99,1.01)     133µs × (0.99,1.01)  ~
BenchmarkJSONEncode               57.1ms × (0.99,1.01)    57.3ms × (0.99,1.04)  ~
BenchmarkJSONDecode                138ms × (1.00,1.01)     139ms × (0.99,1.00)  ~
BenchmarkMandelbrot200            6.02ms × (1.00,1.00)    6.02ms × (1.00,1.00)  ~
BenchmarkGoParse                  9.79ms × (0.92,1.07)    9.72ms × (0.92,1.11)  ~
BenchmarkRegexpMatchEasy0_32       210ns × (1.00,1.01)     209ns × (1.00,1.01)  ~
BenchmarkRegexpMatchEasy0_1K       593ns × (0.99,1.01)     592ns × (0.99,1.00)  ~
BenchmarkRegexpMatchEasy1_32       182ns × (0.99,1.01)     183ns × (0.98,1.01)  ~
BenchmarkRegexpMatchEasy1_1K      1.01µs × (1.00,1.01)    1.01µs × (1.00,1.01)  ~
BenchmarkRegexpMatchMedium_32      331ns × (1.00,1.00)     330ns × (1.00,1.00)  ~
BenchmarkRegexpMatchMedium_1K     92.6µs × (1.00,1.01)    92.4µs × (1.00,1.00)  ~
BenchmarkRegexpMatchHard_32       4.58µs × (0.99,1.05)    4.77µs × (0.95,1.01)  ~
BenchmarkRegexpMatchHard_1K        136µs × (1.00,1.01)     136µs × (1.00,1.00)  ~
BenchmarkRevcomp                   900ms × (0.99,1.06)     906ms × (0.99,1.05)  ~
BenchmarkTemplate                  171ms × (1.00,1.01)     171ms × (0.99,1.01)  ~
BenchmarkTimeParse                 637ns × (1.00,1.00)     638ns × (1.00,1.00)  ~
BenchmarkTimeFormat                742ns × (1.00,1.00)     745ns × (0.99,1.02)  ~

Change-Id: I59ec875715cb176bbffa709546370a6a7fc5a75d
Reviewed-on: https://go-review.googlesource.com/9309
Reviewed-by: Austin Clements <austin@google.com>
2015-04-28 01:37:34 +00:00
Russ Cox
0f0bc0f016 cmd/internal/gc: use MOV R0, R1 instead of LEA 0(R0), R1 in Agen
Minor code generation optimization I've been meaning to do
for a while and noticed while working on the emitted write
barrier code. Using MOV lets the compiler and maybe the
processor do copy propagation.

name                                       old                     new          delta
BenchmarkBinaryTree17              17.9s × (0.99,1.01)     18.0s × (0.99,1.01)  ~
BenchmarkFannkuch11                4.42s × (1.00,1.00)     4.36s × (1.00,1.00)  -1.39%
BenchmarkFmtFprintfEmpty           118ns × (0.96,1.02)     120ns × (0.99,1.06)  ~
BenchmarkFmtFprintfString          486ns × (0.99,1.01)     480ns × (0.99,1.01)  -1.34%
BenchmarkFmtFprintfInt             457ns × (0.99,1.01)     451ns × (0.99,1.01)  -1.31%
BenchmarkFmtFprintfIntInt          768ns × (1.00,1.01)     766ns × (0.99,1.01)  ~
BenchmarkFmtFprintfPrefixedInt     584ns × (0.99,1.03)     569ns × (0.99,1.01)  -2.57%
BenchmarkFmtFprintfFloat           739ns × (0.99,1.00)     728ns × (1.00,1.01)  -1.49%
BenchmarkFmtManyArgs              2.77µs × (1.00,1.00)    2.81µs × (1.00,1.01)  +1.53%
BenchmarkGobDecode                39.3ms × (0.99,1.01)    39.4ms × (0.99,1.01)  ~
BenchmarkGobEncode                39.4ms × (0.99,1.00)    39.4ms × (0.99,1.00)  ~
BenchmarkGzip                      661ms × (0.99,1.01)     660ms × (1.00,1.01)  ~
BenchmarkGunzip                    142ms × (1.00,1.00)     143ms × (1.00,1.00)  +0.20%
BenchmarkHTTPClientServer          133µs × (0.98,1.01)     132µs × (0.99,1.01)  ~
BenchmarkJSONEncode               56.5ms × (0.99,1.01)    57.1ms × (0.99,1.01)  +0.94%
BenchmarkJSONDecode                143ms × (1.00,1.00)     138ms × (1.00,1.01)  -3.22%
BenchmarkMandelbrot200            6.01ms × (1.00,1.00)    6.02ms × (1.00,1.00)  ~
BenchmarkGoParse                  9.63ms × (0.94,1.07)    9.79ms × (0.92,1.07)  ~
BenchmarkRegexpMatchEasy0_32       210ns × (1.00,1.00)     210ns × (1.00,1.01)  ~
BenchmarkRegexpMatchEasy0_1K       596ns × (0.99,1.01)     593ns × (0.99,1.01)  ~
BenchmarkRegexpMatchEasy1_32       184ns × (0.99,1.01)     182ns × (0.99,1.01)  ~
BenchmarkRegexpMatchEasy1_1K      1.01µs × (0.99,1.01)    1.01µs × (1.00,1.01)  ~
BenchmarkRegexpMatchMedium_32      327ns × (1.00,1.01)     331ns × (1.00,1.00)  +1.22%
BenchmarkRegexpMatchMedium_1K     93.0µs × (1.00,1.02)    92.6µs × (1.00,1.01)  ~
BenchmarkRegexpMatchHard_32       4.76µs × (0.95,1.01)    4.58µs × (0.99,1.05)  ~
BenchmarkRegexpMatchHard_1K        136µs × (1.00,1.01)     136µs × (1.00,1.01)  ~
BenchmarkRevcomp                   892ms × (1.00,1.01)     900ms × (0.99,1.06)  ~
BenchmarkTemplate                  175ms × (0.99,1.00)     171ms × (1.00,1.01)  -2.36%
BenchmarkTimeParse                 638ns × (1.00,1.00)     637ns × (1.00,1.00)  ~
BenchmarkTimeFormat                772ns × (1.00,1.00)     742ns × (1.00,1.00)  -3.95%

Change-Id: I6504e310cb9cf48a73d539c478b4dbcacde208b2
Reviewed-on: https://go-review.googlesource.com/9308
Reviewed-by: Austin Clements <austin@google.com>
2015-04-28 01:37:33 +00:00
Russ Cox
0ad4f8b1f7 cmd/internal/gc: emit write barriers at lower level
This is primarily preparation for inlining, not an optimization by itself,
but it still helps some.

name                                       old                     new          delta
BenchmarkBinaryTree17              18.2s × (0.99,1.01)     17.9s × (0.99,1.01)  -1.57%
BenchmarkFannkuch11                4.44s × (1.00,1.00)     4.42s × (1.00,1.00)  -0.40%
BenchmarkFmtFprintfEmpty           119ns × (0.95,1.02)     118ns × (0.96,1.02)  ~
BenchmarkFmtFprintfString          501ns × (0.99,1.02)     486ns × (0.99,1.01)  -2.89%
BenchmarkFmtFprintfInt             474ns × (0.99,1.00)     457ns × (0.99,1.01)  -3.59%
BenchmarkFmtFprintfIntInt          792ns × (1.00,1.00)     768ns × (1.00,1.01)  -3.03%
BenchmarkFmtFprintfPrefixedInt     574ns × (1.00,1.01)     584ns × (0.99,1.03)  +1.83%
BenchmarkFmtFprintfFloat           749ns × (1.00,1.00)     739ns × (0.99,1.00)  -1.34%
BenchmarkFmtManyArgs              2.94µs × (1.00,1.01)    2.77µs × (1.00,1.00)  -5.76%
BenchmarkGobDecode                39.5ms × (0.99,1.01)    39.3ms × (0.99,1.01)  ~
BenchmarkGobEncode                39.4ms × (1.00,1.01)    39.4ms × (0.99,1.00)  ~
BenchmarkGzip                      658ms × (1.00,1.01)     661ms × (0.99,1.01)  ~
BenchmarkGunzip                    142ms × (1.00,1.00)     142ms × (1.00,1.00)  +0.22%
BenchmarkHTTPClientServer          134µs × (0.99,1.01)     133µs × (0.98,1.01)  ~
BenchmarkJSONEncode               57.1ms × (0.99,1.01)    56.5ms × (0.99,1.01)  ~
BenchmarkJSONDecode                141ms × (1.00,1.00)     143ms × (1.00,1.00)  +1.09%
BenchmarkMandelbrot200            6.01ms × (1.00,1.00)    6.01ms × (1.00,1.00)  ~
BenchmarkGoParse                  10.1ms × (0.91,1.09)     9.6ms × (0.94,1.07)  ~
BenchmarkRegexpMatchEasy0_32       207ns × (1.00,1.01)     210ns × (1.00,1.00)  +1.45%
BenchmarkRegexpMatchEasy0_1K       592ns × (0.99,1.00)     596ns × (0.99,1.01)  +0.68%
BenchmarkRegexpMatchEasy1_32       184ns × (0.99,1.01)     184ns × (0.99,1.01)  ~
BenchmarkRegexpMatchEasy1_1K      1.01µs × (1.00,1.00)    1.01µs × (0.99,1.01)  ~
BenchmarkRegexpMatchMedium_32      327ns × (0.99,1.00)     327ns × (1.00,1.01)  ~
BenchmarkRegexpMatchMedium_1K     92.5µs × (1.00,1.00)    93.0µs × (1.00,1.02)  +0.48%
BenchmarkRegexpMatchHard_32       4.79µs × (0.95,1.00)    4.76µs × (0.95,1.01)  ~
BenchmarkRegexpMatchHard_1K        136µs × (1.00,1.00)     136µs × (1.00,1.01)  ~
BenchmarkRevcomp                   900ms × (0.99,1.01)     892ms × (1.00,1.01)  ~
BenchmarkTemplate                  170ms × (0.99,1.01)     175ms × (0.99,1.00)  +2.95%
BenchmarkTimeParse                 645ns × (1.00,1.00)     638ns × (1.00,1.00)  -1.16%
BenchmarkTimeFormat                740ns × (1.00,1.00)     772ns × (1.00,1.00)  +4.39%

Change-Id: I0be905e32791e0cb70ff01f169c4b309a971d981
Reviewed-on: https://go-review.googlesource.com/9159
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-04-28 01:37:05 +00:00
Josh Bleecher Snyder
673bd18805 test: gofmt run.go
Clean up after CL 5310.

Change-Id: Ib870e7b9d26eb118eefdaa3e76dcec4a4d459584
Reviewed-on: https://go-review.googlesource.com/9398
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-04-28 00:08:50 +00:00
Josh Bleecher Snyder
1fb948a029 test: set GOMAXPROCS=1 in fixedbugs/issue9110
With this fix,

GOMAXPROCS=8 ./all.bash

passes, at least on my machine.

Fixes #10216.

Change-Id: Ib5991950892a1399ec81aced0a52b435e6f83fdf
Reviewed-on: https://go-review.googlesource.com/9392
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-04-28 00:06:13 +00:00
Mikio Hara
9bef5cfb9b net: don't miss testing server teardowns when test fails early
Change-Id: I9fa678e43b4ae3970323cac474b5f86d4d933997
Reviewed-on: https://go-review.googlesource.com/9382
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-04-28 00:01:33 +00:00
Matthew Dempsky
8a413752fb test: reenable syntax tests
These were fixed a little while ago, but overlooked when reenabling
disabled tests.

Update #9968.

Change-Id: I301ef587e580c517a170ad08ff897118b58cedec
Reviewed-on: https://go-review.googlesource.com/9347
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2015-04-27 23:48:00 +00:00
Rob Pike
181e81cfe4 doc/go1.5.txt: go doc
Change-Id: I883017b67e8fa76b6f123e8f9bcec3d6f820bbb3
Reviewed-on: https://go-review.googlesource.com/9348
Reviewed-by: Rob Pike <r@golang.org>
2015-04-27 23:24:27 +00:00
Rob Pike
a5de54a870 cmd/go,cmd/doc: add "go doc"
Add the new go doc command to the go command, installed in
the tool directory.

(Still to do: tests)

Fix cmd/dist to remove old "package documentation" code that was
stopping it from including cmd/go/doc.go in the build.

Implement the doc command. Here is the help info from "go help doc":

===
usage: go doc [-u] [package|[package.]symbol[.method]]

Doc accepts at most one argument, indicating either a package, a symbol within a
package, or a method of a symbol.

	go doc
	go doc <pkg>
	go doc <sym>[.<method>]
	go doc [<pkg>].<sym>[.<method>]

Doc interprets the argument to see what it represents, determined by its syntax
and which packages and symbols are present in the source directories of GOROOT and
GOPATH.

The first item in this list that succeeds is the one whose documentation is printed.
For packages, the order of scanning is determined by the file system, however the
GOROOT tree is always scanned before GOPATH.

If there is no package specified or matched, the package in the current directory
is selected, so "go doc" shows the documentation for the current package and
"go doc Foo" shows the documentation for symbol Foo in the current package.

Doc prints the documentation comments associated with the top-level item the
argument identifies (package, type, method) followed by a one-line summary of each
of the first-level items "under" that item (package-level declarations for a
package, methods for a type, etc.)

The package paths must be either a qualified path or a proper suffix of a path
(see examples below). The go tool's usual package mechanism does not apply: package
path elements like . and ...  are not implemented by go doc.

When matching symbols, lower-case letters match either case but upper-case letters
match exactly.

Examples:
	go doc
		Show documentation for current package.
	go doc Foo
		Show documentation for Foo in the current package.
		(Foo starts with a capital letter so it cannot match a package path.)
	go doc json
		Show documentation for the encoding/json package.
	go doc json
		Shorthand for encoding/json assuming only one json package
		is present in the tree.
	go doc json.Number (or go doc json.number)
		Show documentation and method summary for json.Number.
	go doc json.Number.Int64 (or go doc json.number.int64)
		Show documentation for the Int64 method of json.Number.

Flags:
	-u
		Show documentation for unexported as well as exported
		symbols and methods.

===

Still to do:

Tests.
Disambiguation when there is both foo and Foo.
Flag for case-sensitive matching.

Change-Id: I83d409a68688a5445f54297a7e7c745f749b9e66
Reviewed-on: https://go-review.googlesource.com/9227
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-27 23:22:26 +00:00
Austin Clements
02ba71e547 runtime/race: fix failing tests
Some race tests were sensitive to the goroutine scheduling order.
When this changed in commit e870f06, these tests started to fail.

Fix TestRaceHeapParam by ensuring that the racing goroutine has
run before the test exits. Fix TestRaceRWMutexMultipleReaders by
adding a third reader to ensure that two readers wind up on the
same side of the writer (and race with each other) regardless of
the schedule. Fix TestRaceRange by ensuring that the racing
goroutine runs before the main goroutine exits the loop it races
with.

Change-Id: Iaf002f8730ea42227feaf2f3c51b9a1e57ccffdd
Reviewed-on: https://go-review.googlesource.com/9402
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-27 23:12:00 +00:00
Russ Cox
f774e6a1f8 runtime/race: stop listening to external network addresses
This makes the OS X firewall box pop up.
Not run during all.bash so hasn't been noticed before.

Change-Id: I78feb4fd3e1d3c983ae3419085048831c04de3da
Reviewed-on: https://go-review.googlesource.com/9401
Reviewed-by: Austin Clements <austin@google.com>
2015-04-27 23:11:45 +00:00
Austin Clements
7c7cd69591 runtime: fix stack use accounting
ReadMemStats accounts for stacks slightly differently than the runtime
does internally. Internally, only stacks allocated by newosproc0 are
accounted in memstats.stacks_sys and other stacks are accounted in
heap_sys. readmemstats_m shuffles the statistics so all stacks are
accounted in StackSys rather than HeapSys.

However, currently, readmemstats_m assumes StackSys will be zero when
it does this shuffle. This was true until commit 6ad33be. If it isn't
(e.g., if something called newosproc0), StackSys+HeapSys will be
different before and after this shuffle, and the Sys sum that was
computed earlier will no longer agree with the sum of its components.

Fix this by making the shuffle in readmemstats_m not assume that
StackSys is zero.

Fixes #10585.

Change-Id: If13991c8de68bd7b85e1b613d3f12b4fd6fd5813
Reviewed-on: https://go-review.googlesource.com/9366
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-27 23:09:39 +00:00
David Crawshaw
d707a6e0e2 runtime: remove unnecessary noescape to fix netbsd
I introduced this build failure in golang.org/cl/9302 but failed to
notice due to the other failures on the dashboard.

Change-Id: I84bf00f664ba572c1ca722e0136d8a2cf21613ca
Reviewed-on: https://go-review.googlesource.com/9363
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
2015-04-27 23:04:38 +00:00
Josh Bleecher Snyder
00d4a6b35d cmd/internal/gc, cmd/internal/ld: add memprofilerate flag
Also call runtime.GC before exit to ensure
that the profiler picks up all allocations.

Fixes #10537.

Change-Id: Ibfbfc88652ac0ce30a6d1ae392f919df6c1e8126
Reviewed-on: https://go-review.googlesource.com/9261
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-27 22:21:40 +00:00
Austin Clements
23ce80efeb runtime/race: fix benchmark deadlock
Currently TestRaceCrawl fails to wg.Done for every wg.Adds if the
depth ever reaches 0. This causes the test to deadlock. Under the race
detector, this deadlock is not detected, so the test eventually times
out.

This only recently became a problem. Prior to commit e870f06 the depth
would never reach 0 because the strict round-robin goroutine schedule
ensured that all of the URLs were already "seen" by depth 2. Now that
the runtime prefers scheduling the most recently started goroutine,
the test is able to reach depth 0 and trigger this deadlock.

Change-Id: I5176302a89614a344c84d587073b364833af6590
Reviewed-on: https://go-review.googlesource.com/9344
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-27 20:54:34 +00:00
Dmitry Savintsev
8cb9c21cce regexp: trivial change in comments to update code.google.com link
Replaced code.google.com/p/re2/ with github.com/google/re2/ and
updated the file names (re2-exhaustive.txt.bz2 not re2.txt.gz)
as well as the re2 make command (make log).

Change-Id: I15937b0b8a898d78d45366857ed86421c8d69960
Reviewed-on: https://go-review.googlesource.com/9372
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-04-27 20:18:25 +00:00
Russ Cox
42da270024 runtime: fix race in BenchmarkPingPongHog
The master goroutine was returning before
the child goroutine had done its final i < b.N
(the one that fails and causes it to exit the loop)
and then the benchmark harness was updating
b.N, causing a read+write race on b.N.

Change-Id: I2504270a0de30544736f6c32161337a25b505c3e
Reviewed-on: https://go-review.googlesource.com/9368
Reviewed-by: Austin Clements <austin@google.com>
2015-04-27 20:10:11 +00:00
Austin Clements
33e0f3d853 runtime: fix some out of date comments and typos
Change-Id: I061057414c722c5a0f03c709528afc8554114db6
Reviewed-on: https://go-review.googlesource.com/9367
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-27 20:08:38 +00:00
Josh Bleecher Snyder
9a0fd97ff3 runtime: remove a modulus calculation from pollorder
This is a follow-up to CL 9269, as suggested
by dvyukov.

There is probably even more that can be done
to speed up this shuffle. It will matter more
once CL 7570 (fine-grained locking in select)
is in and can be revisited then, with benchmarks.

Change-Id: Ic13a27d11cedd1e1f007951214b3bb56b1644f02
Reviewed-on: https://go-review.googlesource.com/9393
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2015-04-27 19:36:37 +00:00
Austin Clements
1b01910c06 runtime: rename gcController.findRunnable to findRunnableGCWorker
This avoids confusion with the main findrunnable in the scheduler.

Change-Id: I8cf40657557a8610a2fe5a2f74598518256ca7f0
Reviewed-on: https://go-review.googlesource.com/9305
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-27 19:26:42 +00:00
Austin Clements
bb6320535d runtime: replace STW for enabling write barriers with ragged barrier
Currently, we use a full stop-the-world around enabling write
barriers. This is to ensure that all Gs have enabled write barriers
before any blackening occurs (either in gcBgMarkWorker() or in
gcAssistAlloc()).

However, there's no need to bring the whole world to a synchronous
stop to ensure this. This change replaces the STW with a ragged
barrier that ensures each P has individually observed that write
barriers should be enabled before GC performs any blackening.

Change-Id: If2f129a6a55bd8bdd4308067af2b739f3fb41955
Reviewed-on: https://go-review.googlesource.com/8207
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-27 19:26:37 +00:00
Austin Clements
57afa76471 runtime: add ragged global barrier function
This adds forEachP, which performs a general-purpose ragged global
barrier. forEachP takes a callback and invokes it for every P at a GC
safe point.

Ps that are idle or in a syscall are considered to be at a continuous
safe point. forEachP ensures that these Ps do not change state by
forcing all syscall Ps into idle and holding the sched.lock.

To ensure that Ps do not enter syscall or idle without running the
safe-point function, this adds checks for a pending callback every
place there is currently a gcwaiting check.

We'll use forEachP to replace the STW around enabling the write
barrier and to replace the current asynchronous per-M wbuf cache with
a cooperatively managed per-P gcWork cache.

Change-Id: Ie944f8ce1fead7c79bf271d2f42fcd61a41bb3cc
Reviewed-on: https://go-review.googlesource.com/8206
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-27 19:26:33 +00:00
Josh Bleecher Snyder
81c2233b4a Revert "cmd/dist: consolidate runtime CPU tests"
This reverts commit a9e50a6b35.

Change-Id: I3c5e459f1030e36bc249910facdae12303a44151
Reviewed-on: https://go-review.googlesource.com/9394
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2015-04-27 17:56:04 +00:00
Josh Bleecher Snyder
a9e50a6b35 cmd/dist: consolidate runtime CPU tests
Instead of running:

go test -short runtime -cpu=1
go test -short runtime -cpu=2
go test -short runtime -cpu=4

Run just:

go test -short runtime -cpu=1,2,4

This is a return to the Go 1.4.2 behavior.

We lose incremental display of progress and
per-cpu timing information, but we don't have
to recompile and relink the runtime test,
which is slow.

This cuts about 10s off all.bash.

Updates #10571.

Change-Id: I6e8c7149780d47439f8bcfa888e6efc84290c60a
Reviewed-on: https://go-review.googlesource.com/9350
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2015-04-27 17:36:18 +00:00
Josh Bleecher Snyder
2692f48330 cmd/internal/ld: remove pointless allocs
Reduces allocs linking cmd/go and runtime.test
by ~13%. No functional changes.

The most easily addressed sources of allocations
after this are expandpkg, rdstring, and symbuf
string conversion.

These can be reduced by interning strings,
but that increases the overall memory footprint.

Change-Id: Ifedefc9f2a0403bcc75460d6b139e8408374e058
Reviewed-on: https://go-review.googlesource.com/9391
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2015-04-27 17:08:56 +00:00
Roger Peppe
4a3e000a48 encoding/xml: do not escape newlines
There is no need to escape newlines in char data -
it makes the XML larger and harder to read.

Change-Id: I1c1fcee1bdffc705c7428f89ca90af8085d6fb73
Reviewed-on: https://go-review.googlesource.com/9310
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-27 15:38:04 +00:00
Austin Clements
b0b1a66052 runtime: reset spinning in mspinning if work was ready()ed
This fixes a bug where the runtime ready()s a goroutine while setting
up a new M that's initially marked as spinning, causing the scheduler
to later panic when it finds work in the run queue of a P associated
with a spinning M. Specifically, the sequence of events that can lead
to this is:

1) sysmon calls handoffp to hand off a P stolen from a syscall.

2) handoffp sees no pending work on the P, so it calls startm with
   spinning set.

3) startm calls newm, which in turn calls allocm to allocate a new M.

4) allocm "borrows" the P we're handing off in order to do allocation
   and performs this allocation.

5) This allocation may assist the garbage collector, and this assist
   may detect the end of concurrent mark and ready() the main GC
   goroutine to signal this.

6) This ready()ing puts the GC goroutine on the run queue of the
   borrowed P.

7) newm starts the OS thread, which runs mstart and subsequently
   mstart1, which marks the M spinning because startm was called with
   spinning set.

8) mstart1 enters the scheduler, which panics because there's work on
   the run queue, but the M is marked spinning.

To fix this, before marking the M spinning in step 7, add a check to
see if work was been added to the P's run queue. If this is the case,
undo the spinning instead.

Fixes #10573.

Change-Id: I4670495ae00582144a55ce88c45ae71de597cfa5
Reviewed-on: https://go-review.googlesource.com/9332
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
2015-04-27 12:49:54 +00:00
Austin Clements
2a46f55b35 runtime: panic when idling a P with runnable Gs
This adds a check that we never put a P on the idle list when it has
work on its local run queue.

Change-Id: Ifcfab750de60c335148a7f513d4eef17be03b6a7
Reviewed-on: https://go-review.googlesource.com/9324
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2015-04-27 12:49:49 +00:00
Josh Bleecher Snyder
fd5540e7e5 runtime: tighten select permutation generation
This is the optimization made to math/rand in CL 21030043.

Change-Id: I231b24fa77cac1fe74ba887db76313b5efaab3e8
Reviewed-on: https://go-review.googlesource.com/9269
Reviewed-by: Minux Ma <minux@golang.org>
2015-04-27 02:36:24 +00:00
John Dethridge
3787950a92 debug/dwarf: update class_string.go to add ClassReferenceSig using stringer.
Change-Id: I677a5ee273a4d285a8adff71ffcfeac34afc887f
Reviewed-on: https://go-review.googlesource.com/9235
Reviewed-by: Austin Clements <austin@google.com>
2015-04-27 02:05:20 +00:00
Adam Langley
cba882ea9b crypto/tls: call GetCertificate if Certificates is empty.
This change causes the GetCertificate callback to be called if
Certificates is empty. Previously this configuration would result in an
error.

This allows people to have servers that depend entirely on dynamic
certificate selection, even when the client doesn't send SNI.

Fixes #9208.

Change-Id: I2f5a5551215958b88b154c64a114590300dfc461
Reviewed-on: https://go-review.googlesource.com/8792
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Adam Langley <agl@golang.org>
2015-04-26 22:00:35 +00:00
Jonathan Rudenberg
ac2bf8ad06 crypto/tls: add OCSP response to ConnectionState
The OCSP response is currently only exposed via a method on Conn,
which makes it inaccessible when using wrappers like net/http. The
ConnectionState structure is typically available even when using
wrappers and contains many of the other handshake details, so this
change exposes the stapled OCSP response in that structure.

Change-Id: If8dab49292566912c615d816321b4353e711f71f
Reviewed-on: https://go-review.googlesource.com/9361
Reviewed-by: Adam Langley <agl@golang.org>
Run-TryBot: Adam Langley <agl@golang.org>
2015-04-26 22:00:13 +00:00
David Leon Gil
d86b8d34d0 crypto/elliptic: don't unmarshal points that are off the curve
At present, Unmarshal does not check that the point it unmarshals
is actually *on* the curve. (It may be on the curve's twist.)

This can, as Daniel Bernstein has pointed out at great length,
lead to quite devastating attacks. And 3 out of the 4 curves
supported by crypto/elliptic have twists with cofactor != 1;
P-224, in particular, has a sufficiently large cofactor that it
is likely that conventional dlog attacks might be useful.

This closes #2445, filed by Watson Ladd.

To explain why this was (partially) rejected before being accepted:

In the general case, for curves with cofactor != 1, verifying subgroup
membership is required. (This is expensive and hard-to-implement.)
But, as recent discussion during the CFRG standardization process
has brought out, small-subgroup attacks are much less damaging than
a twist attack.

Change-Id: I284042eb9954ff9b7cde80b8b693b1d468c7e1e8
Reviewed-on: https://go-review.googlesource.com/2421
Reviewed-by: Adam Langley <agl@golang.org>
2015-04-26 21:11:50 +00:00
Paul van Brouwershaven
54bb4b9fd7 crypto/x509: CertificateRequest signature verification
This implements a method for x509.CertificateRequest to prevent
certain attacks and to allow a CA/RA to properly check the validity
of the binding between an end entity and a key pair, to prove that
it has possession of (i.e., is able to use) the private key
corresponding to the public key for which a certificate is requested.

RFC 2986 section 3 states:

"A certification authority fulfills the request by authenticating the
requesting entity and verifying the entity's signature, and, if the
request is valid, constructing an X.509 certificate from the
distinguished name and public key, the issuer name, and the
certification authority's choice of serial number, validity period,
and signature algorithm."

Change-Id: I37795c3b1dfdfdd455d870e499b63885eb9bda4f
Reviewed-on: https://go-review.googlesource.com/7371
Reviewed-by: Adam Langley <agl@golang.org>
2015-04-26 21:07:10 +00:00
Jonathan Rudenberg
bff1417543 crypto/tls: add support for session ticket key rotation
This change adds a new method to tls.Config, SetSessionTicketKeys, that
changes the key used to encrypt session tickets while the server is
running. Additional keys may be provided that will be used to maintain
continuity while rotating keys. If a ticket encrypted with an old key is
provided by the client, the server will resume the session and provide
the client with a ticket encrypted using the new key.

Fixes #9994

Change-Id: Idbc16b10ff39616109a51ed39a6fa208faad5b4e
Reviewed-on: https://go-review.googlesource.com/9072
Reviewed-by: Jonathan Rudenberg <jonathan@titanous.com>
Reviewed-by: Adam Langley <agl@golang.org>
2015-04-26 20:57:28 +00:00
Håvard Haugen
14a4649fe2 cmd/pprof: handle empty profile gracefully
The command "go tool pprof -top $GOROOT/bin/go /dev/null" now logs that
profile is empty instead of panicking.

Fixes #9207

Change-Id: I3d55c179277cb19ad52c8f24f1aca85db53ee08d
Reviewed-on: https://go-review.googlesource.com/2571
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-04-26 20:12:17 +00:00
Jonathan Rudenberg
02e69c4b53 crypto/tls: add support for Certificate Transparency
This change adds support for serving and receiving Signed Certificate
Timestamps as described in RFC 6962.

The server is now capable of serving SCTs listed in the Certificate
structure. The client now asks for SCTs and, if any are received,
they are exposed in the ConnectionState structure.

Fixes #10201

Change-Id: Ib3adae98cb4f173bc85cec04d2bdd3aa0fec70bb
Reviewed-on: https://go-review.googlesource.com/8988
Reviewed-by: Adam Langley <agl@golang.org>
Run-TryBot: Adam Langley <agl@golang.org>
Reviewed-by: Jonathan Rudenberg <jonathan@titanous.com>
2015-04-26 16:53:11 +00:00
Justin Nuß
2db58f8f2d encoding/csv: Preallocate records slice
Currently parseRecord will always start with a nil
slice and then resize the slice on append. For input
with a fixed number of fields per record we can preallocate
the slice to avoid having to resize the slice.

This change implements this optimization by using
FieldsPerRecord as capacity if it's > 0 and also adds a
benchmark to better show the differences.

benchmark         old ns/op     new ns/op     delta
BenchmarkRead     19741         17909         -9.28%

benchmark         old allocs     new allocs     delta
BenchmarkRead     59             41             -30.51%

benchmark         old bytes     new bytes     delta
BenchmarkRead     6276          5844          -6.88%

Change-Id: I7c2abc9c80a23571369bcfcc99a8ffc474eae7ab
Reviewed-on: https://go-review.googlesource.com/8880
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-04-26 16:28:51 +00:00
David Crawshaw
a5b693b431 runtime: signal forwarding for darwin/amd64
Follows the linux signal forwarding semantics from
http://golang.org/cl/8712, sharing the implementation of sigfwdgo.
Forwarding for 386, arm, and arm64 will follow.

Change-Id: I6bf30d563d19da39b6aec6900c7fe12d82ed4f62
Reviewed-on: https://go-review.googlesource.com/9302
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-04-26 13:46:13 +00:00
Michael Hudson-Doyle
c20ff36fe2 cmd/internal/ld: R_TLS_LE is fine on Darwin too
Sorry about this.

Fixes #10575

Change-Id: I2de23be68e7d822d182e5a0d6a00c607448d861e
Reviewed-on: https://go-review.googlesource.com/9341
Reviewed-by: Minux Ma <minux@golang.org>
2015-04-26 04:53:51 +00:00
Matt T. Proud
b6a0450bec testing/quick: align tests with reflect.Kind.
This commit is largely cosmetic in the sense that it is the remnants
of a change proposal I had prepared for testing/quick, until I
discovered that 3e9ed27 already implemented the feature I was looking
for: quick.Value() for reflect.Kind Array.  What you see is a merger
and manual cleanup; the cosmetic cleanups are as follows:

(1.) Keeping the TestCheckEqual and its associated input functions
in the same order as type kinds defined in reflect.Kind.  Since
3e9ed27 was committed, the test case began to diverge from the
constant's ordering.

(2.) The `Intptr` derivatives existed to exercise quick.Value with
reflect.Kind's `Ptr` constant.  All `Intptr` (unrelated to `uintptr`)
in the test have been migrated to ensure the parallelism of the
listings and to convey that `Intptr` is not special.

(3.) Correct a misspelling (transposition) of "alias", whereby it is
named as "Alais".

Change-Id: I441450db16b8bb1272c52b0abcda3794dcd0599d
Reviewed-on: https://go-review.googlesource.com/8804
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-26 02:40:40 +00:00
Michael Hudson-Doyle
264858c46e cmd/8l, cmd/internal/ld, cmd/internal/obj/x86: stop incorrectly using the term "inital exec"
The long comment block in obj6.go:progedit talked about the two code sequences
for accessing g as "local exec" and "initial exec", but really they are both forms
of local exec. This stuff is confusing enough without using the wrong words for
things, so rewrite it to talk about 2-instruction and 1-instruction sequences.
Unfortunately the confusion has made it into code, with the R_TLS_IE relocation
now doing double duty as meaning actual initial exec when externally linking and
boring old local exec when linking internally (half of this is my fault). So this
stops using R_TLS_IE in the local exec case. There is a chance this might break
plan9 or windows, but I don't think so. Next step is working out what the heck is
going on on ARM...

Change-Id: I09da4388210cf49dbc99fd25f5172bbe517cee57
Reviewed-on: https://go-review.googlesource.com/9273
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
2015-04-25 18:13:15 +00:00
Rick Hudson
ada8cdb9f6 runtime: Fix bug due to elided return.
A previous change to mbitmap.go dropped a return on a
path the seems not to be excersized. This was a mistake that
this CL fixes.

Change-Id: I715ee4ef08f5bf8d9f53cee84e8fb31a237e2d43
Reviewed-on: https://go-review.googlesource.com/9295
Reviewed-by: Austin Clements <austin@google.com>
2015-04-24 21:52:30 +00:00
Michael Hudson-Doyle
ccc76dba60 cmd/internal/ld: fix R_TLS handling now Xsym is not read from object file
I think this should fix the arm build. A proper fix involves making the handling
of tlsg less fragile, I'll try that tomorrow.

Update #10557

Change-Id: I9b1b666737fb40aebb6f284748509afa8483cce5
Reviewed-on: https://go-review.googlesource.com/9272
Reviewed-by: Dave Cheney <dave@cheney.net>
Run-TryBot: Dave Cheney <dave@cheney.net>
2015-04-24 20:57:49 +00:00
Austin Clements
1b4025f4bd runtime: replace per-M workbuf cache with per-P gcWork cache
Currently, each M has a cache of the most recently used *workbuf. This
is used primarily by the write barrier so it doesn't have to access
the global workbuf lists on every write barrier. It's also used by
stack scanning because it's convenient.

This cache is important for write barrier performance, but this
particular approach has several downsides. It's faster than no cache,
but far from optimal (as the benchmarks below show). It's complex:
access to the cache is sprinkled through most of the workbuf list
operations and it requires special care to transform into and back out
of the gcWork cache that's actually used for scanning and marking. It
requires atomic exchanges to take ownership of the cached workbuf and
to return it to the M's cache even though it's almost always used by
only the current M. Since it's per-M, flushing these caches is O(# of
Ms), which may be high. And it has some significant subtleties: for
example, in general the cache shouldn't be used after the
harvestwbufs() in mark termination because it could hide work from
mark termination, but stack scanning can happen after this and *will*
use the cache (but it turns out this is okay because it will always be
followed by a getfull(), which drains the cache).

This change replaces this cache with a per-P gcWork object. This
gcWork cache can be used directly by scanning and marking (as long as
preemption is disabled, which is a general requirement of gcWork).
Since it's per-P, it doesn't require synchronization, which simplifies
things and means the only atomic operations in the write barrier are
occasionally fetching new work buffers and setting a mark bit if the
object isn't already marked. This cache can be flushed in O(# of Ps),
which is generally small. It follows a simple flushing rule: the cache
can be used during any phase, but during mark termination it must be
flushed before allowing preemption. This also makes the dispose during
mutator assist no longer necessary, which eliminates the vast majority
of gcWork dispose calls and reduces contention on the global workbuf
lists. And it's a lot faster on some benchmarks:

benchmark                          old ns/op       new ns/op       delta
BenchmarkBinaryTree17              11963668673     11206112763     -6.33%
BenchmarkFannkuch11                2643217136      2649182499      +0.23%
BenchmarkFmtFprintfEmpty           70.4            70.2            -0.28%
BenchmarkFmtFprintfString          364             307             -15.66%
BenchmarkFmtFprintfInt             317             282             -11.04%
BenchmarkFmtFprintfIntInt          512             483             -5.66%
BenchmarkFmtFprintfPrefixedInt     404             380             -5.94%
BenchmarkFmtFprintfFloat           521             479             -8.06%
BenchmarkFmtManyArgs               2164            1894            -12.48%
BenchmarkGobDecode                 30366146        22429593        -26.14%
BenchmarkGobEncode                 29867472        26663152        -10.73%
BenchmarkGzip                      391236616       396779490       +1.42%
BenchmarkGunzip                    96639491        96297024        -0.35%
BenchmarkHTTPClientServer          100110          70763           -29.31%
BenchmarkJSONEncode                51866051        52511382        +1.24%
BenchmarkJSONDecode                103813138       86094963        -17.07%
BenchmarkMandelbrot200             4121834         4120886         -0.02%
BenchmarkGoParse                   16472789        5879949         -64.31%
BenchmarkRegexpMatchEasy0_32       140             140             +0.00%
BenchmarkRegexpMatchEasy0_1K       394             394             +0.00%
BenchmarkRegexpMatchEasy1_32       120             120             +0.00%
BenchmarkRegexpMatchEasy1_1K       621             614             -1.13%
BenchmarkRegexpMatchMedium_32      209             202             -3.35%
BenchmarkRegexpMatchMedium_1K      54889           55175           +0.52%
BenchmarkRegexpMatchHard_32        2682            2675            -0.26%
BenchmarkRegexpMatchHard_1K        79383           79524           +0.18%
BenchmarkRevcomp                   584116718       584595320       +0.08%
BenchmarkTemplate                  125400565       109620196       -12.58%
BenchmarkTimeParse                 386             387             +0.26%
BenchmarkTimeFormat                580             447             -22.93%

(Best out of 10 runs. The delta of averages is similar.)

This also puts us in a good position to flush these caches when
nearing the end of concurrent marking, which will let us increase the
size of the work buffers while still controlling mark termination
pause time.

Change-Id: I2dd94c8517a19297a98ec280203cccaa58792522
Reviewed-on: https://go-review.googlesource.com/9178
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-24 20:10:14 +00:00
Austin Clements
d1cae6358c runtime: fix check for pending GC work
When findRunnable considers running a fractional mark worker, it first
checks if there's any work to be done; if there isn't there's no point
in running the worker because it will just reschedule immediately.
However, currently findRunnable just checks work.full and
work.partial, whereas getfull can *also* draw work from m.currentwbuf.
As a result, findRunnable may not start a worker even though there
actually is work.

This problem manifests itself in occasional failures of the
test/init1.go test. This test is unusual because it performs a large
amount of allocation without executing any write barriers, which means
there's nothing to force the pointers in currentwbuf out to the
work.partial/full lists where findRunnable can see them.

This change fixes this problem by making findRunnable also check for a
currentwbuf. This aligns findRunnable with trygetfull's notion of
whether or not there's work.

Change-Id: Ic76d22b7b5d040bc4f58a6b5975e9217650e66c4
Reviewed-on: https://go-review.googlesource.com/9299
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-24 20:10:10 +00:00
Austin Clements
26eac917dc runtime: start dedicated mark workers even if there's no work
Currently, findRunnable only considers running a mark worker if
there's work in the work queue. In principle, this can delay the start
of the desired number of dedicated mark workers if there's no work
pending. This is unlikely to occur in practice, since there should be
work queued from the scan phase, but if it were to come up, a CPU hog
mutator could slow down or delay garbage collection.

This check makes sense for fractional mark workers, since they'll just
return to the scheduler immediately if there's no work, but we want
the scheduler to start all of the dedicated mark workers promptly,
even if there's currently no queued work. Hence, this change moves the
pending work check after the check for starting a dedicated worker.

Change-Id: I52b851cc9e41f508a0955b3f905ca80f109ea101
Reviewed-on: https://go-review.googlesource.com/9298
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-24 20:10:05 +00:00