1
0
mirror of https://github.com/golang/go synced 2024-09-30 22:38:33 -06:00
Commit Graph

7 Commits

Author SHA1 Message Date
Keith Randall
0e9f8a21f8 runtime,cmd/compile: pass strings and slices to convT2{E,I} by value
When we pass these types by reference, we usually have to allocate
temporaries on the stack, initialize them, then pass their address
to the conversion functions. It's simpler to pass these types
directly by value.

This particularly applies to conversions needed for fmt.Printf
(to interface{} for constructing a [...]interface{}).

func f(a, b, c string) {
     fmt.Printf("%s %s\n", a, b)
     fmt.Printf("%s %s\n", b, c)
}

This function's stack frame shrinks from 200 to 136 bytes, and
its code shrinks from 535 to 453 bytes.

The go binary shrinks 0.3%.

Update #24286

Aside: for this function f, we don't really need to allocate
temporaries for the convT2E function. We could use the address
of a, b, and c directly. That might get similar (or maybe better?)
improvements. I investigated a bit, but it seemed complicated
to do it safely. This change was much easier.

Change-Id: I78cbe51b501fb41e1e324ce4203f0de56a1db82d
Reviewed-on: https://go-review.googlesource.com/c/135377
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-10-14 03:46:51 +00:00
Keith Randall
9a8372f8bd cmd/compile,runtime: remove ambiguously live logic
The previous CL introduced stack objects. This CL removes the old
ambiguously live liveness analysis. After this CL we're relying
on stack objects exclusively.

Update a bunch of liveness tests to reflect the new world.

Fixes #22350

Change-Id: I739b26e015882231011ce6bc1a7f426049e59f31
Reviewed-on: https://go-review.googlesource.com/c/134156
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-03 19:54:16 +00:00
Austin Clements
2010189407 runtime: remove legacy eager write barrier
Now that the buffered write barrier is implemented for all
architectures, we can remove the old eager write barrier
implementation. This CL removes the implementation from the runtime,
support in the compiler for calling it, and updates some compiler
tests that relied on the old eager barrier support. It also makes sure
that all of the useful comments from the old write barrier
implementation still have a place to live.

Fixes #22460.

Updates #21640 since this fixes the layering concerns of the write
barrier (but not the other things in that issue).

Change-Id: I580f93c152e89607e0a72fe43370237ba97bae74
Reviewed-on: https://go-review.googlesource.com/92705
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-13 16:34:46 +00:00
Hugues Bruant
c4b65fa4cc cmd/compile: inline closures with captures
When inlining a closure with captured variables, walk up the
param chain to find the one that is defined inside the scope
into which the function is being inlined, and map occurrences
of the captures to temporary inlvars, similarly to what is
done for function parameters.

No noticeable impact on compilation speed and binary size.

Minor improvements to go1 benchmarks on darwin/amd64

name                     old time/op    new time/op    delta
BinaryTree17-4              2.59s ± 3%     2.58s ± 1%    ~     (p=0.470 n=19+19)
Fannkuch11-4                3.15s ± 2%     3.15s ± 1%    ~     (p=0.647 n=20+19)
FmtFprintfEmpty-4          43.7ns ± 3%    43.4ns ± 4%    ~     (p=0.178 n=18+20)
FmtFprintfString-4         74.0ns ± 2%    77.1ns ± 7%  +4.13%  (p=0.000 n=20+20)
FmtFprintfInt-4            77.2ns ± 3%    79.2ns ± 6%  +2.53%  (p=0.000 n=20+20)
FmtFprintfIntInt-4          112ns ± 4%     112ns ± 2%    ~     (p=0.672 n=20+19)
FmtFprintfPrefixedInt-4     136ns ± 1%     135ns ± 2%    ~     (p=0.827 n=16+20)
FmtFprintfFloat-4           232ns ± 2%     233ns ± 1%    ~     (p=0.194 n=20+20)
FmtManyArgs-4               490ns ± 2%     484ns ± 2%  -1.28%  (p=0.001 n=20+20)
GobDecode-4                6.68ms ± 2%    6.72ms ± 2%    ~     (p=0.113 n=20+19)
GobEncode-4                5.62ms ± 2%    5.71ms ± 2%  +1.64%  (p=0.000 n=20+19)
Gzip-4                      235ms ± 3%     236ms ± 2%    ~     (p=0.607 n=20+19)
Gunzip-4                   37.1ms ± 2%    36.8ms ± 3%    ~     (p=0.060 n=20+20)
HTTPClientServer-4         61.9µs ± 2%    62.7µs ± 4%  +1.24%  (p=0.007 n=18+19)
JSONEncode-4               12.5ms ± 2%    12.4ms ± 3%    ~     (p=0.192 n=20+20)
JSONDecode-4               51.6ms ± 3%    51.0ms ± 3%  -1.19%  (p=0.008 n=20+19)
Mandelbrot200-4            4.12ms ± 6%    4.06ms ± 5%    ~     (p=0.063 n=20+20)
GoParse-4                  3.12ms ± 5%    3.10ms ± 2%    ~     (p=0.402 n=19+19)
RegexpMatchEasy0_32-4      80.7ns ± 2%    75.1ns ± 9%  -6.94%  (p=0.000 n=17+20)
RegexpMatchEasy0_1K-4       197ns ± 2%     186ns ± 2%  -5.43%  (p=0.000 n=20+20)
RegexpMatchEasy1_32-4      77.5ns ± 4%    71.9ns ± 7%  -7.25%  (p=0.000 n=20+18)
RegexpMatchEasy1_1K-4       341ns ± 3%     341ns ± 3%    ~     (p=0.732 n=20+20)
RegexpMatchMedium_32-4      113ns ± 2%     112ns ± 3%    ~     (p=0.102 n=20+20)
RegexpMatchMedium_1K-4     36.6µs ± 2%    35.8µs ± 2%  -2.26%  (p=0.000 n=18+20)
RegexpMatchHard_32-4       1.75µs ± 3%    1.74µs ± 2%    ~     (p=0.473 n=20+19)
RegexpMatchHard_1K-4       52.6µs ± 2%    52.0µs ± 3%  -1.15%  (p=0.005 n=20+20)
Revcomp-4                   381ms ± 4%     377ms ± 2%    ~     (p=0.067 n=20+18)
Template-4                 57.3ms ± 2%    57.7ms ± 2%    ~     (p=0.108 n=20+20)
TimeParse-4                 291ns ± 3%     292ns ± 2%    ~     (p=0.585 n=20+20)
TimeFormat-4                314ns ± 3%     315ns ± 1%    ~     (p=0.681 n=20+20)
[Geo mean]                 47.4µs         47.1µs       -0.73%

name                     old speed      new speed      delta
GobDecode-4               115MB/s ± 2%   114MB/s ± 2%    ~     (p=0.115 n=20+19)
GobEncode-4               137MB/s ± 2%   134MB/s ± 2%  -1.63%  (p=0.000 n=20+19)
Gzip-4                   82.5MB/s ± 3%  82.4MB/s ± 2%    ~     (p=0.612 n=20+19)
Gunzip-4                  523MB/s ± 2%   528MB/s ± 3%    ~     (p=0.060 n=20+20)
JSONEncode-4              155MB/s ± 2%   156MB/s ± 3%    ~     (p=0.192 n=20+20)
JSONDecode-4             37.6MB/s ± 3%  38.1MB/s ± 3%  +1.21%  (p=0.007 n=20+19)
GoParse-4                18.6MB/s ± 4%  18.7MB/s ± 2%    ~     (p=0.405 n=19+19)
RegexpMatchEasy0_32-4     396MB/s ± 2%   426MB/s ± 8%  +7.56%  (p=0.000 n=17+20)
RegexpMatchEasy0_1K-4    5.18GB/s ± 2%  5.48GB/s ± 2%  +5.79%  (p=0.000 n=20+20)
RegexpMatchEasy1_32-4     413MB/s ± 4%   444MB/s ± 6%  +7.46%  (p=0.000 n=20+19)
RegexpMatchEasy1_1K-4    3.00GB/s ± 3%  3.00GB/s ± 3%    ~     (p=0.678 n=20+20)
RegexpMatchMedium_32-4   8.82MB/s ± 2%  8.90MB/s ± 3%  +0.99%  (p=0.044 n=20+20)
RegexpMatchMedium_1K-4   28.0MB/s ± 2%  28.6MB/s ± 2%  +2.32%  (p=0.000 n=18+20)
RegexpMatchHard_32-4     18.3MB/s ± 3%  18.4MB/s ± 2%    ~     (p=0.482 n=20+19)
RegexpMatchHard_1K-4     19.5MB/s ± 2%  19.7MB/s ± 3%  +1.18%  (p=0.004 n=20+20)
Revcomp-4                 668MB/s ± 4%   674MB/s ± 2%    ~     (p=0.066 n=20+18)
Template-4               33.8MB/s ± 2%  33.6MB/s ± 2%    ~     (p=0.104 n=20+20)
[Geo mean]                124MB/s        126MB/s       +1.54%

Updates #15561
Updates #18270

Change-Id: I980086efe28b36aa27f81577065e2a729ff03d4e
Reviewed-on: https://go-review.googlesource.com/72490
Reviewed-by: Hugues Bruant <hugues.bruant@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-11-05 04:18:05 +00:00
Austin Clements
7e343134d3 cmd/compile: compiler support for buffered write barrier
This CL implements the compiler support for calling the buffered write
barrier added by the previous CL.

Since the buffered write barrier is only implemented on amd64 right
now, this still supports the old, eager write barrier as well. There's
little overhead to supporting both and this way a few tests in
test/fixedbugs that expect to have liveness maps at write barrier
calls can easily opt-in to the old, eager barrier.

This significantly improves the performance of the write barrier:

name             old time/op  new time/op  delta
WriteBarrier-12  73.5ns ±20%  19.2ns ±27%  -73.90%  (p=0.000 n=19+18)

It also reduces the size of binaries because the write barrier call is
more compact:

name        old object-bytes  new object-bytes  delta
Template           398k ± 0%         393k ± 0%  -1.14%  (p=0.008 n=5+5)
Unicode            208k ± 0%         206k ± 0%  -1.00%  (p=0.008 n=5+5)
GoTypes           1.18M ± 0%        1.15M ± 0%  -2.00%  (p=0.008 n=5+5)
Compiler          4.05M ± 0%        3.88M ± 0%  -4.26%  (p=0.008 n=5+5)
SSA               8.25M ± 0%        8.11M ± 0%  -1.59%  (p=0.008 n=5+5)
Flate              228k ± 0%         224k ± 0%  -1.83%  (p=0.008 n=5+5)
GoParser           295k ± 0%         284k ± 0%  -3.62%  (p=0.008 n=5+5)
Reflect           1.00M ± 0%        0.99M ± 0%  -0.70%  (p=0.008 n=5+5)
Tar                339k ± 0%         333k ± 0%  -1.67%  (p=0.008 n=5+5)
XML                404k ± 0%         395k ± 0%  -2.10%  (p=0.008 n=5+5)
[Geo mean]         704k              690k       -2.00%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.05M ± 0%        1.04M ± 0%  -1.55%  (p=0.008 n=5+5)

https://perf.golang.org/search?q=upload:20171027.1

(Amusingly, this also reduces compiler allocations by 0.75%, which,
combined with the better write barrier, speeds up the compiler overall
by 2.10%. See the perf link.)

It slightly improves the performance of most of the go1 benchmarks and
improves the performance of the x/benchmarks:

name                      old time/op    new time/op    delta
BinaryTree17-12              2.40s ± 1%     2.47s ± 1%  +2.69%  (p=0.000 n=19+19)
Fannkuch11-12                2.95s ± 0%     2.95s ± 0%  +0.21%  (p=0.000 n=20+19)
FmtFprintfEmpty-12          41.8ns ± 4%    41.4ns ± 2%  -1.03%  (p=0.014 n=20+20)
FmtFprintfString-12         68.7ns ± 2%    67.5ns ± 1%  -1.75%  (p=0.000 n=20+17)
FmtFprintfInt-12            79.0ns ± 3%    77.1ns ± 1%  -2.40%  (p=0.000 n=19+17)
FmtFprintfIntInt-12          127ns ± 1%     123ns ± 3%  -3.42%  (p=0.000 n=20+20)
FmtFprintfPrefixedInt-12     152ns ± 1%     150ns ± 1%  -1.02%  (p=0.000 n=18+17)
FmtFprintfFloat-12           211ns ± 1%     209ns ± 0%  -0.99%  (p=0.000 n=20+16)
FmtManyArgs-12               500ns ± 0%     496ns ± 0%  -0.73%  (p=0.000 n=17+20)
GobDecode-12                6.44ms ± 1%    6.53ms ± 0%  +1.28%  (p=0.000 n=20+19)
GobEncode-12                5.46ms ± 0%    5.46ms ± 1%    ~     (p=0.550 n=19+20)
Gzip-12                      220ms ± 1%     216ms ± 0%  -1.75%  (p=0.000 n=19+19)
Gunzip-12                   38.8ms ± 0%    38.6ms ± 0%  -0.30%  (p=0.000 n=18+19)
HTTPClientServer-12         79.0µs ± 1%    78.2µs ± 1%  -1.01%  (p=0.000 n=20+20)
JSONEncode-12               11.9ms ± 0%    11.9ms ± 0%  -0.29%  (p=0.000 n=20+19)
JSONDecode-12               52.6ms ± 0%    52.2ms ± 0%  -0.68%  (p=0.000 n=19+20)
Mandelbrot200-12            3.69ms ± 0%    3.68ms ± 0%  -0.36%  (p=0.000 n=20+20)
GoParse-12                  3.13ms ± 1%    3.18ms ± 1%  +1.67%  (p=0.000 n=19+20)
RegexpMatchEasy0_32-12      73.2ns ± 1%    72.3ns ± 1%  -1.19%  (p=0.000 n=19+18)
RegexpMatchEasy0_1K-12       241ns ± 0%     239ns ± 0%  -0.83%  (p=0.000 n=17+16)
RegexpMatchEasy1_32-12      68.6ns ± 1%    69.0ns ± 1%  +0.47%  (p=0.015 n=18+16)
RegexpMatchEasy1_1K-12       364ns ± 0%     361ns ± 0%  -0.67%  (p=0.000 n=16+17)
RegexpMatchMedium_32-12      104ns ± 1%     103ns ± 1%  -0.79%  (p=0.001 n=20+15)
RegexpMatchMedium_1K-12     33.8µs ± 3%    34.0µs ± 2%    ~     (p=0.267 n=20+19)
RegexpMatchHard_32-12       1.64µs ± 1%    1.62µs ± 2%  -1.25%  (p=0.000 n=19+18)
RegexpMatchHard_1K-12       49.2µs ± 0%    48.7µs ± 1%  -0.93%  (p=0.000 n=19+18)
Revcomp-12                   391ms ± 5%     396ms ± 7%    ~     (p=0.154 n=19+19)
Template-12                 63.1ms ± 0%    59.5ms ± 0%  -5.76%  (p=0.000 n=18+19)
TimeParse-12                 307ns ± 0%     306ns ± 0%  -0.39%  (p=0.000 n=19+17)
TimeFormat-12                325ns ± 0%     323ns ± 0%  -0.50%  (p=0.000 n=19+19)
[Geo mean]                  47.3µs         46.9µs       -0.67%

https://perf.golang.org/search?q=upload:20171026.1

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.25ms ± 1%  2.20ms ± 1%  -2.31%  (p=0.000 n=18+18)
HTTP-12                    12.6µs ± 0%  12.6µs ± 0%  -0.72%  (p=0.000 n=18+17)
JSON-12                    11.0ms ± 0%  11.0ms ± 1%  -0.68%  (p=0.000 n=17+19)

https://perf.golang.org/search?q=upload:20171026.2

Updates #14951.
Updates #22460.

Change-Id: Id4c0932890a1d41020071bec73b8522b1367d3e7
Reviewed-on: https://go-review.googlesource.com/73712
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-10-30 18:12:46 +00:00
Matthew Dempsky
f22ef70254 cmd/compile: allow := to shadow dot-imported names
Historically, gc optimistically parsed the left-hand side of
assignments as expressions. Later, if it discovered a ":=" assignment,
it rewrote the parsed expressions as declarations.

This failed in the presence of dot imports though, because we lost
information about whether an imported object was named via a bare
identifier "Foo" or a normal qualified "pkg.Foo".

This CL fixes the issue by specially noding the left-hand side of ":="
assignments.

Fixes #22076.

Change-Id: I18190ecdb863112e7d009e1687e6112eec559921
Reviewed-on: https://go-review.googlesource.com/66810
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-10-05 18:07:37 +00:00
Josh Bleecher Snyder
61336b78c1 cmd/compile: don't update outer variables after capturevars is complete
When compiling concurrently, we walk all functions before compiling
any of them. Walking functions can cause variables to switch from
being non-addrtaken to addrtaken, e.g. to prepare for a runtime call.
Typechecking propagates addrtaken-ness of closure variables to
their outer variables, so that capturevars can decide whether to
pass the variable's value or a pointer to it.

When all functions are compiled immediately, as long as the containing
function is compiled prior to the closure, this propagation has no effect.
When compilation is deferred, though, in rare cases, this results in 
a change in the addrtaken-ness of a variable in the outer function,
which in turn changes the compiler's output.
(This is rare because in a great many cases, a temporary has been
introduced, insulating the outer variable from modification.)
But concurrent compilation must generate identical results.

To fix this, track whether capturevars has run.
If it has, there is no need to update outer variables
when closure variables change.
Capturevars always runs before any functions are walked or compiled.

The remainder of the changes in this CL are to support the test.
In particular, -d=compilelater forces the compiler to walk all
functions before compiling any of them, despite being non-concurrent.
This is useful because -live is fundamentally incompatible with
concurrent compilation, but we want -c=1 to have no behavior changes.

Fixes #20250

Change-Id: I89bcb54268a41e8588af1ac8cc37fbef856a90c2
Reviewed-on: https://go-review.googlesource.com/42853
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-05-14 00:27:25 +00:00