1
0
mirror of https://github.com/golang/go synced 2024-11-05 20:36:10 -07:00
Commit Graph

6 Commits

Author SHA1 Message Date
Dmitry Vyukov
0e80b2e082 cmd/gc: capture variables by value
Language specification says that variables are captured by reference.
And that is what gc compiler does. However, in lots of cases it is
possible to capture variables by value under the hood without
affecting visible behavior of programs. For example, consider
the following typical pattern:

	func (o *Obj) requestMany(urls []string) []Result {
		wg := new(sync.WaitGroup)
		wg.Add(len(urls))
		res := make([]Result, len(urls))
		for i := range urls {
			i := i
			go func() {
				res[i] = o.requestOne(urls[i])
				wg.Done()
			}()
		}
		wg.Wait()
		return res
	}

Currently o, wg, res, and i are captured by reference causing 3+len(urls)
allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap).
But all of them can be captured by value without changing behavior.

This change implements simple strategy for capturing by value:
if a captured variable is not addrtaken and never assigned to,
then it is captured by value (it is effectively const).
This simple strategy turned out to be very effective:
~80% of all captures in std lib are turned into value captures.
The remaining 20% are mostly in defers and non-escaping closures,
that is, they do not cause allocations anyway.

benchmark                                    old allocs     new allocs     delta
BenchmarkCompressedZipGarbage                153            126            -17.65%
BenchmarkEncodeDigitsSpeed1e4                91             69             -24.18%
BenchmarkEncodeDigitsSpeed1e5                178            129            -27.53%
BenchmarkEncodeDigitsSpeed1e6                1510           1051           -30.40%
BenchmarkEncodeDigitsDefault1e4              100            75             -25.00%
BenchmarkEncodeDigitsDefault1e5              193            139            -27.98%
BenchmarkEncodeDigitsDefault1e6              1420           985            -30.63%
BenchmarkEncodeDigitsCompress1e4             100            75             -25.00%
BenchmarkEncodeDigitsCompress1e5             193            139            -27.98%
BenchmarkEncodeDigitsCompress1e6             1420           985            -30.63%
BenchmarkEncodeTwainSpeed1e4                 109            81             -25.69%
BenchmarkEncodeTwainSpeed1e5                 211            151            -28.44%
BenchmarkEncodeTwainSpeed1e6                 1588           1097           -30.92%
BenchmarkEncodeTwainDefault1e4               103            77             -25.24%
BenchmarkEncodeTwainDefault1e5               199            143            -28.14%
BenchmarkEncodeTwainDefault1e6               1324           917            -30.74%
BenchmarkEncodeTwainCompress1e4              103            77             -25.24%
BenchmarkEncodeTwainCompress1e5              190            137            -27.89%
BenchmarkEncodeTwainCompress1e6              1327           919            -30.75%
BenchmarkConcurrentDBExec                    16223          16220          -0.02%
BenchmarkConcurrentStmtQuery                 17687          16182          -8.51%
BenchmarkConcurrentStmtExec                  5191           5186           -0.10%
BenchmarkConcurrentTxQuery                   17665          17661          -0.02%
BenchmarkConcurrentTxExec                    15154          15150          -0.03%
BenchmarkConcurrentTxStmtQuery               17661          16157          -8.52%
BenchmarkConcurrentTxStmtExec                3677           3673           -0.11%
BenchmarkConcurrentRandom                    14000          13614          -2.76%
BenchmarkManyConcurrentQueries               25             22             -12.00%
BenchmarkDecodeComplex128Slice               318            252            -20.75%
BenchmarkDecodeFloat64Slice                  318            252            -20.75%
BenchmarkDecodeInt32Slice                    318            252            -20.75%
BenchmarkDecodeStringSlice                   2318           2252           -2.85%
BenchmarkDecode                              11             8              -27.27%
BenchmarkEncodeGray                          64             56             -12.50%
BenchmarkEncodeNRGBOpaque                    64             56             -12.50%
BenchmarkEncodeNRGBA                         67             58             -13.43%
BenchmarkEncodePaletted                      68             60             -11.76%
BenchmarkEncodeRGBOpaque                     64             56             -12.50%
BenchmarkGoLookupIP                          153            139            -9.15%
BenchmarkGoLookupIPNoSuchHost                508            466            -8.27%
BenchmarkGoLookupIPWithBrokenNameServer      245            226            -7.76%
BenchmarkClientServer                        62             59             -4.84%
BenchmarkClientServerParallel4               62             59             -4.84%
BenchmarkClientServerParallel64              62             59             -4.84%
BenchmarkClientServerParallelTLS4            79             76             -3.80%
BenchmarkClientServerParallelTLS64           112            109            -2.68%
BenchmarkCreateGoroutinesCapture             10             6              -40.00%
BenchmarkAfterFunc                           1006           1005           -0.10%

Fixes #6632.

Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca
Reviewed-on: https://go-review.googlesource.com/3166
Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-29 13:07:30 +00:00
Dmitry Vyukov
4ce4d8b2c4 cmd/gc: allocate stack buffer for ORUNESTR
If result of string(i) does not escape,
allocate a [4]byte temp on stack for it.

Change-Id: If31ce9447982929d5b3b963fd0830efae4247c37
Reviewed-on: https://go-review.googlesource.com/3411
Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-28 20:37:20 +00:00
Dmitry Vyukov
e6fac08146 cmd/gc: allocate buffers for non-escaped strings on stack
Currently we always allocate string buffers in heap.
For example, in the following code we allocate a temp string
just for comparison:

	if string(byteSlice) == "abc" { ... }

This change extends escape analysis to cover []byte->string
conversions and string concatenation. If the result of operations
does not escape, compiler allocates a small buffer
on stack and passes it to slicebytetostring and concatstrings.
Then runtime uses the buffer if the result fits into it.

Size of the buffer is 32 bytes. There is no fundamental theory
behind this number. Just an observation that on std lib
tests/benchmarks frequency of string allocation is inversely
proportional to string length; and there is significant number
of allocations up to length 32.

benchmark                                    old allocs     new allocs     delta
BenchmarkFprintfBytes                        2              1              -50.00%
BenchmarkDecodeComplex128Slice               318            316            -0.63%
BenchmarkDecodeFloat64Slice                  318            316            -0.63%
BenchmarkDecodeInt32Slice                    318            316            -0.63%
BenchmarkDecodeStringSlice                   2318           2316           -0.09%
BenchmarkStripTags                           11             5              -54.55%
BenchmarkDecodeGray                          111            102            -8.11%
BenchmarkDecodeNRGBAGradient                 200            188            -6.00%
BenchmarkDecodeNRGBAOpaque                   165            152            -7.88%
BenchmarkDecodePaletted                      319            309            -3.13%
BenchmarkDecodeRGB                           166            157            -5.42%
BenchmarkDecodeInterlacing                   279            268            -3.94%
BenchmarkGoLookupIP                          153            135            -11.76%
BenchmarkGoLookupIPNoSuchHost                508            466            -8.27%
BenchmarkGoLookupIPWithBrokenNameServer      245            226            -7.76%
BenchmarkClientServerParallel4               62             61             -1.61%
BenchmarkClientServerParallel64              62             61             -1.61%
BenchmarkClientServerParallelTLS4            79             78             -1.27%
BenchmarkClientServerParallelTLS64           112            111            -0.89%

benchmark                                    old ns/op      new ns/op      delta
BenchmarkFprintfBytes                        381            311            -18.37%
BenchmarkStripTags                           2615           2351           -10.10%
BenchmarkDecodeNRGBAGradient                 3715887        3635096        -2.17%
BenchmarkDecodeNRGBAOpaque                   3047645        2928644        -3.90%
BenchmarkGoLookupIP                          153            135            -11.76%
BenchmarkGoLookupIPNoSuchHost                508            466            -8.27%

Change-Id: I9ec01da816945c3329d7be3c7794b520418c3f99
Reviewed-on: https://go-review.googlesource.com/3120
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-28 20:12:38 +00:00
Dmitry Vyukov
22c16b4b92 cmd/gc: ignore re-slicing in escape analysis
Escape analysis treats everything assigned to OIND/ODOTPTR as escaping.
As the result b escapes in the following code:

	func (b *Buffer) Foo() {
		n, m := ...
		b.buf = b.buf[n:m]
	}

This change recognizes such assignments and ignores them.

Update issue #9043.
Update issue #7921.

There are two similar cases in std lib that benefit from this optimization.
First is in archive/zip:

type readBuf []byte
func (b *readBuf) uint32() uint32 {
	v := binary.LittleEndian.Uint32(*b)
	*b = (*b)[4:]
	return v
}

Second is in time:

type data struct {
	p     []byte
	error bool
}

func (d *data) read(n int) []byte {
	if len(d.p) < n {
		d.p = nil
		d.error = true
		return nil
	}
	p := d.p[0:n]
	d.p = d.p[n:]
	return p
}

benchmark                         old ns/op     new ns/op     delta
BenchmarkCompressedZipGarbage     32431724      32217851      -0.66%

benchmark                         old allocs     new allocs     delta
BenchmarkCompressedZipGarbage     153            143            -6.54%

Change-Id: Ia6cd32744e02e36d6d8c19f402f8451101711626
Reviewed-on: https://go-review.googlesource.com/3162
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-28 17:37:55 +00:00
Dmitry Vyukov
1b87f01239 cmd/gc: improve escape analysis for &T{...}
Currently all PTRLIT element initializers escape. There is no reason for that.
This change links STRUCTLIT to PTRLIT; STRUCTLIT element initializers are
already linked to the STRUCTLIT. As the result, PTRLIT element initializers
escape when PTRLIT itself escapes.

Change-Id: I89ecd8677cbf81addcfd469cd2fd461c0e9bf7dd
Reviewed-on: https://go-review.googlesource.com/3031
Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-28 16:59:01 +00:00
Russ Cox
00d2f916ad cmd/gc: run escape analysis always (even in -N mode)
Fixes #8585.
Removes some little-used code paths.

LGTM=josharian
R=golang-codereviews, minux, josharian
CC=golang-codereviews, iant, r
https://golang.org/cl/132970043
2014-09-24 15:20:03 -04:00