1
0
mirror of https://github.com/golang/go synced 2024-11-14 13:50:23 -07:00
go/test/closure2.go

134 lines
1.9 KiB
Go
Raw Normal View History

cmd/gc: capture variables by value Language specification says that variables are captured by reference. And that is what gc compiler does. However, in lots of cases it is possible to capture variables by value under the hood without affecting visible behavior of programs. For example, consider the following typical pattern: func (o *Obj) requestMany(urls []string) []Result { wg := new(sync.WaitGroup) wg.Add(len(urls)) res := make([]Result, len(urls)) for i := range urls { i := i go func() { res[i] = o.requestOne(urls[i]) wg.Done() }() } wg.Wait() return res } Currently o, wg, res, and i are captured by reference causing 3+len(urls) allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap). But all of them can be captured by value without changing behavior. This change implements simple strategy for capturing by value: if a captured variable is not addrtaken and never assigned to, then it is captured by value (it is effectively const). This simple strategy turned out to be very effective: ~80% of all captures in std lib are turned into value captures. The remaining 20% are mostly in defers and non-escaping closures, that is, they do not cause allocations anyway. benchmark old allocs new allocs delta BenchmarkCompressedZipGarbage 153 126 -17.65% BenchmarkEncodeDigitsSpeed1e4 91 69 -24.18% BenchmarkEncodeDigitsSpeed1e5 178 129 -27.53% BenchmarkEncodeDigitsSpeed1e6 1510 1051 -30.40% BenchmarkEncodeDigitsDefault1e4 100 75 -25.00% BenchmarkEncodeDigitsDefault1e5 193 139 -27.98% BenchmarkEncodeDigitsDefault1e6 1420 985 -30.63% BenchmarkEncodeDigitsCompress1e4 100 75 -25.00% BenchmarkEncodeDigitsCompress1e5 193 139 -27.98% BenchmarkEncodeDigitsCompress1e6 1420 985 -30.63% BenchmarkEncodeTwainSpeed1e4 109 81 -25.69% BenchmarkEncodeTwainSpeed1e5 211 151 -28.44% BenchmarkEncodeTwainSpeed1e6 1588 1097 -30.92% BenchmarkEncodeTwainDefault1e4 103 77 -25.24% BenchmarkEncodeTwainDefault1e5 199 143 -28.14% BenchmarkEncodeTwainDefault1e6 1324 917 -30.74% BenchmarkEncodeTwainCompress1e4 103 77 -25.24% BenchmarkEncodeTwainCompress1e5 190 137 -27.89% BenchmarkEncodeTwainCompress1e6 1327 919 -30.75% BenchmarkConcurrentDBExec 16223 16220 -0.02% BenchmarkConcurrentStmtQuery 17687 16182 -8.51% BenchmarkConcurrentStmtExec 5191 5186 -0.10% BenchmarkConcurrentTxQuery 17665 17661 -0.02% BenchmarkConcurrentTxExec 15154 15150 -0.03% BenchmarkConcurrentTxStmtQuery 17661 16157 -8.52% BenchmarkConcurrentTxStmtExec 3677 3673 -0.11% BenchmarkConcurrentRandom 14000 13614 -2.76% BenchmarkManyConcurrentQueries 25 22 -12.00% BenchmarkDecodeComplex128Slice 318 252 -20.75% BenchmarkDecodeFloat64Slice 318 252 -20.75% BenchmarkDecodeInt32Slice 318 252 -20.75% BenchmarkDecodeStringSlice 2318 2252 -2.85% BenchmarkDecode 11 8 -27.27% BenchmarkEncodeGray 64 56 -12.50% BenchmarkEncodeNRGBOpaque 64 56 -12.50% BenchmarkEncodeNRGBA 67 58 -13.43% BenchmarkEncodePaletted 68 60 -11.76% BenchmarkEncodeRGBOpaque 64 56 -12.50% BenchmarkGoLookupIP 153 139 -9.15% BenchmarkGoLookupIPNoSuchHost 508 466 -8.27% BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76% BenchmarkClientServer 62 59 -4.84% BenchmarkClientServerParallel4 62 59 -4.84% BenchmarkClientServerParallel64 62 59 -4.84% BenchmarkClientServerParallelTLS4 79 76 -3.80% BenchmarkClientServerParallelTLS64 112 109 -2.68% BenchmarkCreateGoroutinesCapture 10 6 -40.00% BenchmarkAfterFunc 1006 1005 -0.10% Fixes #6632. Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca Reviewed-on: https://go-review.googlesource.com/3166 Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-19 12:59:58 -07:00
// run
// Copyright 2015 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Check that these do not use "by value" capturing,
// because changes are made to the value during the closure.
package main
var never bool
cmd/gc: capture variables by value Language specification says that variables are captured by reference. And that is what gc compiler does. However, in lots of cases it is possible to capture variables by value under the hood without affecting visible behavior of programs. For example, consider the following typical pattern: func (o *Obj) requestMany(urls []string) []Result { wg := new(sync.WaitGroup) wg.Add(len(urls)) res := make([]Result, len(urls)) for i := range urls { i := i go func() { res[i] = o.requestOne(urls[i]) wg.Done() }() } wg.Wait() return res } Currently o, wg, res, and i are captured by reference causing 3+len(urls) allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap). But all of them can be captured by value without changing behavior. This change implements simple strategy for capturing by value: if a captured variable is not addrtaken and never assigned to, then it is captured by value (it is effectively const). This simple strategy turned out to be very effective: ~80% of all captures in std lib are turned into value captures. The remaining 20% are mostly in defers and non-escaping closures, that is, they do not cause allocations anyway. benchmark old allocs new allocs delta BenchmarkCompressedZipGarbage 153 126 -17.65% BenchmarkEncodeDigitsSpeed1e4 91 69 -24.18% BenchmarkEncodeDigitsSpeed1e5 178 129 -27.53% BenchmarkEncodeDigitsSpeed1e6 1510 1051 -30.40% BenchmarkEncodeDigitsDefault1e4 100 75 -25.00% BenchmarkEncodeDigitsDefault1e5 193 139 -27.98% BenchmarkEncodeDigitsDefault1e6 1420 985 -30.63% BenchmarkEncodeDigitsCompress1e4 100 75 -25.00% BenchmarkEncodeDigitsCompress1e5 193 139 -27.98% BenchmarkEncodeDigitsCompress1e6 1420 985 -30.63% BenchmarkEncodeTwainSpeed1e4 109 81 -25.69% BenchmarkEncodeTwainSpeed1e5 211 151 -28.44% BenchmarkEncodeTwainSpeed1e6 1588 1097 -30.92% BenchmarkEncodeTwainDefault1e4 103 77 -25.24% BenchmarkEncodeTwainDefault1e5 199 143 -28.14% BenchmarkEncodeTwainDefault1e6 1324 917 -30.74% BenchmarkEncodeTwainCompress1e4 103 77 -25.24% BenchmarkEncodeTwainCompress1e5 190 137 -27.89% BenchmarkEncodeTwainCompress1e6 1327 919 -30.75% BenchmarkConcurrentDBExec 16223 16220 -0.02% BenchmarkConcurrentStmtQuery 17687 16182 -8.51% BenchmarkConcurrentStmtExec 5191 5186 -0.10% BenchmarkConcurrentTxQuery 17665 17661 -0.02% BenchmarkConcurrentTxExec 15154 15150 -0.03% BenchmarkConcurrentTxStmtQuery 17661 16157 -8.52% BenchmarkConcurrentTxStmtExec 3677 3673 -0.11% BenchmarkConcurrentRandom 14000 13614 -2.76% BenchmarkManyConcurrentQueries 25 22 -12.00% BenchmarkDecodeComplex128Slice 318 252 -20.75% BenchmarkDecodeFloat64Slice 318 252 -20.75% BenchmarkDecodeInt32Slice 318 252 -20.75% BenchmarkDecodeStringSlice 2318 2252 -2.85% BenchmarkDecode 11 8 -27.27% BenchmarkEncodeGray 64 56 -12.50% BenchmarkEncodeNRGBOpaque 64 56 -12.50% BenchmarkEncodeNRGBA 67 58 -13.43% BenchmarkEncodePaletted 68 60 -11.76% BenchmarkEncodeRGBOpaque 64 56 -12.50% BenchmarkGoLookupIP 153 139 -9.15% BenchmarkGoLookupIPNoSuchHost 508 466 -8.27% BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76% BenchmarkClientServer 62 59 -4.84% BenchmarkClientServerParallel4 62 59 -4.84% BenchmarkClientServerParallel64 62 59 -4.84% BenchmarkClientServerParallelTLS4 79 76 -3.80% BenchmarkClientServerParallelTLS64 112 109 -2.68% BenchmarkCreateGoroutinesCapture 10 6 -40.00% BenchmarkAfterFunc 1006 1005 -0.10% Fixes #6632. Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca Reviewed-on: https://go-review.googlesource.com/3166 Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-19 12:59:58 -07:00
func main() {
{
type X struct {
v int
}
var x X
func() {
x.v++
}()
if x.v != 1 {
panic("x.v != 1")
}
cmd/gc: capture variables by value Language specification says that variables are captured by reference. And that is what gc compiler does. However, in lots of cases it is possible to capture variables by value under the hood without affecting visible behavior of programs. For example, consider the following typical pattern: func (o *Obj) requestMany(urls []string) []Result { wg := new(sync.WaitGroup) wg.Add(len(urls)) res := make([]Result, len(urls)) for i := range urls { i := i go func() { res[i] = o.requestOne(urls[i]) wg.Done() }() } wg.Wait() return res } Currently o, wg, res, and i are captured by reference causing 3+len(urls) allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap). But all of them can be captured by value without changing behavior. This change implements simple strategy for capturing by value: if a captured variable is not addrtaken and never assigned to, then it is captured by value (it is effectively const). This simple strategy turned out to be very effective: ~80% of all captures in std lib are turned into value captures. The remaining 20% are mostly in defers and non-escaping closures, that is, they do not cause allocations anyway. benchmark old allocs new allocs delta BenchmarkCompressedZipGarbage 153 126 -17.65% BenchmarkEncodeDigitsSpeed1e4 91 69 -24.18% BenchmarkEncodeDigitsSpeed1e5 178 129 -27.53% BenchmarkEncodeDigitsSpeed1e6 1510 1051 -30.40% BenchmarkEncodeDigitsDefault1e4 100 75 -25.00% BenchmarkEncodeDigitsDefault1e5 193 139 -27.98% BenchmarkEncodeDigitsDefault1e6 1420 985 -30.63% BenchmarkEncodeDigitsCompress1e4 100 75 -25.00% BenchmarkEncodeDigitsCompress1e5 193 139 -27.98% BenchmarkEncodeDigitsCompress1e6 1420 985 -30.63% BenchmarkEncodeTwainSpeed1e4 109 81 -25.69% BenchmarkEncodeTwainSpeed1e5 211 151 -28.44% BenchmarkEncodeTwainSpeed1e6 1588 1097 -30.92% BenchmarkEncodeTwainDefault1e4 103 77 -25.24% BenchmarkEncodeTwainDefault1e5 199 143 -28.14% BenchmarkEncodeTwainDefault1e6 1324 917 -30.74% BenchmarkEncodeTwainCompress1e4 103 77 -25.24% BenchmarkEncodeTwainCompress1e5 190 137 -27.89% BenchmarkEncodeTwainCompress1e6 1327 919 -30.75% BenchmarkConcurrentDBExec 16223 16220 -0.02% BenchmarkConcurrentStmtQuery 17687 16182 -8.51% BenchmarkConcurrentStmtExec 5191 5186 -0.10% BenchmarkConcurrentTxQuery 17665 17661 -0.02% BenchmarkConcurrentTxExec 15154 15150 -0.03% BenchmarkConcurrentTxStmtQuery 17661 16157 -8.52% BenchmarkConcurrentTxStmtExec 3677 3673 -0.11% BenchmarkConcurrentRandom 14000 13614 -2.76% BenchmarkManyConcurrentQueries 25 22 -12.00% BenchmarkDecodeComplex128Slice 318 252 -20.75% BenchmarkDecodeFloat64Slice 318 252 -20.75% BenchmarkDecodeInt32Slice 318 252 -20.75% BenchmarkDecodeStringSlice 2318 2252 -2.85% BenchmarkDecode 11 8 -27.27% BenchmarkEncodeGray 64 56 -12.50% BenchmarkEncodeNRGBOpaque 64 56 -12.50% BenchmarkEncodeNRGBA 67 58 -13.43% BenchmarkEncodePaletted 68 60 -11.76% BenchmarkEncodeRGBOpaque 64 56 -12.50% BenchmarkGoLookupIP 153 139 -9.15% BenchmarkGoLookupIPNoSuchHost 508 466 -8.27% BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76% BenchmarkClientServer 62 59 -4.84% BenchmarkClientServerParallel4 62 59 -4.84% BenchmarkClientServerParallel64 62 59 -4.84% BenchmarkClientServerParallelTLS4 79 76 -3.80% BenchmarkClientServerParallelTLS64 112 109 -2.68% BenchmarkCreateGoroutinesCapture 10 6 -40.00% BenchmarkAfterFunc 1006 1005 -0.10% Fixes #6632. Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca Reviewed-on: https://go-review.googlesource.com/3166 Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-19 12:59:58 -07:00
type Y struct {
X
}
var y Y
func() {
y.v = 1
}()
if y.v != 1 {
panic("y.v != 1")
}
cmd/gc: capture variables by value Language specification says that variables are captured by reference. And that is what gc compiler does. However, in lots of cases it is possible to capture variables by value under the hood without affecting visible behavior of programs. For example, consider the following typical pattern: func (o *Obj) requestMany(urls []string) []Result { wg := new(sync.WaitGroup) wg.Add(len(urls)) res := make([]Result, len(urls)) for i := range urls { i := i go func() { res[i] = o.requestOne(urls[i]) wg.Done() }() } wg.Wait() return res } Currently o, wg, res, and i are captured by reference causing 3+len(urls) allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap). But all of them can be captured by value without changing behavior. This change implements simple strategy for capturing by value: if a captured variable is not addrtaken and never assigned to, then it is captured by value (it is effectively const). This simple strategy turned out to be very effective: ~80% of all captures in std lib are turned into value captures. The remaining 20% are mostly in defers and non-escaping closures, that is, they do not cause allocations anyway. benchmark old allocs new allocs delta BenchmarkCompressedZipGarbage 153 126 -17.65% BenchmarkEncodeDigitsSpeed1e4 91 69 -24.18% BenchmarkEncodeDigitsSpeed1e5 178 129 -27.53% BenchmarkEncodeDigitsSpeed1e6 1510 1051 -30.40% BenchmarkEncodeDigitsDefault1e4 100 75 -25.00% BenchmarkEncodeDigitsDefault1e5 193 139 -27.98% BenchmarkEncodeDigitsDefault1e6 1420 985 -30.63% BenchmarkEncodeDigitsCompress1e4 100 75 -25.00% BenchmarkEncodeDigitsCompress1e5 193 139 -27.98% BenchmarkEncodeDigitsCompress1e6 1420 985 -30.63% BenchmarkEncodeTwainSpeed1e4 109 81 -25.69% BenchmarkEncodeTwainSpeed1e5 211 151 -28.44% BenchmarkEncodeTwainSpeed1e6 1588 1097 -30.92% BenchmarkEncodeTwainDefault1e4 103 77 -25.24% BenchmarkEncodeTwainDefault1e5 199 143 -28.14% BenchmarkEncodeTwainDefault1e6 1324 917 -30.74% BenchmarkEncodeTwainCompress1e4 103 77 -25.24% BenchmarkEncodeTwainCompress1e5 190 137 -27.89% BenchmarkEncodeTwainCompress1e6 1327 919 -30.75% BenchmarkConcurrentDBExec 16223 16220 -0.02% BenchmarkConcurrentStmtQuery 17687 16182 -8.51% BenchmarkConcurrentStmtExec 5191 5186 -0.10% BenchmarkConcurrentTxQuery 17665 17661 -0.02% BenchmarkConcurrentTxExec 15154 15150 -0.03% BenchmarkConcurrentTxStmtQuery 17661 16157 -8.52% BenchmarkConcurrentTxStmtExec 3677 3673 -0.11% BenchmarkConcurrentRandom 14000 13614 -2.76% BenchmarkManyConcurrentQueries 25 22 -12.00% BenchmarkDecodeComplex128Slice 318 252 -20.75% BenchmarkDecodeFloat64Slice 318 252 -20.75% BenchmarkDecodeInt32Slice 318 252 -20.75% BenchmarkDecodeStringSlice 2318 2252 -2.85% BenchmarkDecode 11 8 -27.27% BenchmarkEncodeGray 64 56 -12.50% BenchmarkEncodeNRGBOpaque 64 56 -12.50% BenchmarkEncodeNRGBA 67 58 -13.43% BenchmarkEncodePaletted 68 60 -11.76% BenchmarkEncodeRGBOpaque 64 56 -12.50% BenchmarkGoLookupIP 153 139 -9.15% BenchmarkGoLookupIPNoSuchHost 508 466 -8.27% BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76% BenchmarkClientServer 62 59 -4.84% BenchmarkClientServerParallel4 62 59 -4.84% BenchmarkClientServerParallel64 62 59 -4.84% BenchmarkClientServerParallelTLS4 79 76 -3.80% BenchmarkClientServerParallelTLS64 112 109 -2.68% BenchmarkCreateGoroutinesCapture 10 6 -40.00% BenchmarkAfterFunc 1006 1005 -0.10% Fixes #6632. Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca Reviewed-on: https://go-review.googlesource.com/3166 Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-19 12:59:58 -07:00
}
{
type Z struct {
a [3]byte
}
var z Z
func() {
i := 0
for z.a[1] = 1; i < 10; i++ {
}
}()
if z.a[1] != 1 {
panic("z.a[1] != 1")
cmd/gc: capture variables by value Language specification says that variables are captured by reference. And that is what gc compiler does. However, in lots of cases it is possible to capture variables by value under the hood without affecting visible behavior of programs. For example, consider the following typical pattern: func (o *Obj) requestMany(urls []string) []Result { wg := new(sync.WaitGroup) wg.Add(len(urls)) res := make([]Result, len(urls)) for i := range urls { i := i go func() { res[i] = o.requestOne(urls[i]) wg.Done() }() } wg.Wait() return res } Currently o, wg, res, and i are captured by reference causing 3+len(urls) allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap). But all of them can be captured by value without changing behavior. This change implements simple strategy for capturing by value: if a captured variable is not addrtaken and never assigned to, then it is captured by value (it is effectively const). This simple strategy turned out to be very effective: ~80% of all captures in std lib are turned into value captures. The remaining 20% are mostly in defers and non-escaping closures, that is, they do not cause allocations anyway. benchmark old allocs new allocs delta BenchmarkCompressedZipGarbage 153 126 -17.65% BenchmarkEncodeDigitsSpeed1e4 91 69 -24.18% BenchmarkEncodeDigitsSpeed1e5 178 129 -27.53% BenchmarkEncodeDigitsSpeed1e6 1510 1051 -30.40% BenchmarkEncodeDigitsDefault1e4 100 75 -25.00% BenchmarkEncodeDigitsDefault1e5 193 139 -27.98% BenchmarkEncodeDigitsDefault1e6 1420 985 -30.63% BenchmarkEncodeDigitsCompress1e4 100 75 -25.00% BenchmarkEncodeDigitsCompress1e5 193 139 -27.98% BenchmarkEncodeDigitsCompress1e6 1420 985 -30.63% BenchmarkEncodeTwainSpeed1e4 109 81 -25.69% BenchmarkEncodeTwainSpeed1e5 211 151 -28.44% BenchmarkEncodeTwainSpeed1e6 1588 1097 -30.92% BenchmarkEncodeTwainDefault1e4 103 77 -25.24% BenchmarkEncodeTwainDefault1e5 199 143 -28.14% BenchmarkEncodeTwainDefault1e6 1324 917 -30.74% BenchmarkEncodeTwainCompress1e4 103 77 -25.24% BenchmarkEncodeTwainCompress1e5 190 137 -27.89% BenchmarkEncodeTwainCompress1e6 1327 919 -30.75% BenchmarkConcurrentDBExec 16223 16220 -0.02% BenchmarkConcurrentStmtQuery 17687 16182 -8.51% BenchmarkConcurrentStmtExec 5191 5186 -0.10% BenchmarkConcurrentTxQuery 17665 17661 -0.02% BenchmarkConcurrentTxExec 15154 15150 -0.03% BenchmarkConcurrentTxStmtQuery 17661 16157 -8.52% BenchmarkConcurrentTxStmtExec 3677 3673 -0.11% BenchmarkConcurrentRandom 14000 13614 -2.76% BenchmarkManyConcurrentQueries 25 22 -12.00% BenchmarkDecodeComplex128Slice 318 252 -20.75% BenchmarkDecodeFloat64Slice 318 252 -20.75% BenchmarkDecodeInt32Slice 318 252 -20.75% BenchmarkDecodeStringSlice 2318 2252 -2.85% BenchmarkDecode 11 8 -27.27% BenchmarkEncodeGray 64 56 -12.50% BenchmarkEncodeNRGBOpaque 64 56 -12.50% BenchmarkEncodeNRGBA 67 58 -13.43% BenchmarkEncodePaletted 68 60 -11.76% BenchmarkEncodeRGBOpaque 64 56 -12.50% BenchmarkGoLookupIP 153 139 -9.15% BenchmarkGoLookupIPNoSuchHost 508 466 -8.27% BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76% BenchmarkClientServer 62 59 -4.84% BenchmarkClientServerParallel4 62 59 -4.84% BenchmarkClientServerParallel64 62 59 -4.84% BenchmarkClientServerParallelTLS4 79 76 -3.80% BenchmarkClientServerParallelTLS64 112 109 -2.68% BenchmarkCreateGoroutinesCapture 10 6 -40.00% BenchmarkAfterFunc 1006 1005 -0.10% Fixes #6632. Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca Reviewed-on: https://go-review.googlesource.com/3166 Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-19 12:59:58 -07:00
}
}
{
w := 0
tmp := 0
f := func() {
if w != 1 {
panic("w != 1")
}
cmd/gc: capture variables by value Language specification says that variables are captured by reference. And that is what gc compiler does. However, in lots of cases it is possible to capture variables by value under the hood without affecting visible behavior of programs. For example, consider the following typical pattern: func (o *Obj) requestMany(urls []string) []Result { wg := new(sync.WaitGroup) wg.Add(len(urls)) res := make([]Result, len(urls)) for i := range urls { i := i go func() { res[i] = o.requestOne(urls[i]) wg.Done() }() } wg.Wait() return res } Currently o, wg, res, and i are captured by reference causing 3+len(urls) allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap). But all of them can be captured by value without changing behavior. This change implements simple strategy for capturing by value: if a captured variable is not addrtaken and never assigned to, then it is captured by value (it is effectively const). This simple strategy turned out to be very effective: ~80% of all captures in std lib are turned into value captures. The remaining 20% are mostly in defers and non-escaping closures, that is, they do not cause allocations anyway. benchmark old allocs new allocs delta BenchmarkCompressedZipGarbage 153 126 -17.65% BenchmarkEncodeDigitsSpeed1e4 91 69 -24.18% BenchmarkEncodeDigitsSpeed1e5 178 129 -27.53% BenchmarkEncodeDigitsSpeed1e6 1510 1051 -30.40% BenchmarkEncodeDigitsDefault1e4 100 75 -25.00% BenchmarkEncodeDigitsDefault1e5 193 139 -27.98% BenchmarkEncodeDigitsDefault1e6 1420 985 -30.63% BenchmarkEncodeDigitsCompress1e4 100 75 -25.00% BenchmarkEncodeDigitsCompress1e5 193 139 -27.98% BenchmarkEncodeDigitsCompress1e6 1420 985 -30.63% BenchmarkEncodeTwainSpeed1e4 109 81 -25.69% BenchmarkEncodeTwainSpeed1e5 211 151 -28.44% BenchmarkEncodeTwainSpeed1e6 1588 1097 -30.92% BenchmarkEncodeTwainDefault1e4 103 77 -25.24% BenchmarkEncodeTwainDefault1e5 199 143 -28.14% BenchmarkEncodeTwainDefault1e6 1324 917 -30.74% BenchmarkEncodeTwainCompress1e4 103 77 -25.24% BenchmarkEncodeTwainCompress1e5 190 137 -27.89% BenchmarkEncodeTwainCompress1e6 1327 919 -30.75% BenchmarkConcurrentDBExec 16223 16220 -0.02% BenchmarkConcurrentStmtQuery 17687 16182 -8.51% BenchmarkConcurrentStmtExec 5191 5186 -0.10% BenchmarkConcurrentTxQuery 17665 17661 -0.02% BenchmarkConcurrentTxExec 15154 15150 -0.03% BenchmarkConcurrentTxStmtQuery 17661 16157 -8.52% BenchmarkConcurrentTxStmtExec 3677 3673 -0.11% BenchmarkConcurrentRandom 14000 13614 -2.76% BenchmarkManyConcurrentQueries 25 22 -12.00% BenchmarkDecodeComplex128Slice 318 252 -20.75% BenchmarkDecodeFloat64Slice 318 252 -20.75% BenchmarkDecodeInt32Slice 318 252 -20.75% BenchmarkDecodeStringSlice 2318 2252 -2.85% BenchmarkDecode 11 8 -27.27% BenchmarkEncodeGray 64 56 -12.50% BenchmarkEncodeNRGBOpaque 64 56 -12.50% BenchmarkEncodeNRGBA 67 58 -13.43% BenchmarkEncodePaletted 68 60 -11.76% BenchmarkEncodeRGBOpaque 64 56 -12.50% BenchmarkGoLookupIP 153 139 -9.15% BenchmarkGoLookupIPNoSuchHost 508 466 -8.27% BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76% BenchmarkClientServer 62 59 -4.84% BenchmarkClientServerParallel4 62 59 -4.84% BenchmarkClientServerParallel64 62 59 -4.84% BenchmarkClientServerParallelTLS4 79 76 -3.80% BenchmarkClientServerParallelTLS64 112 109 -2.68% BenchmarkCreateGoroutinesCapture 10 6 -40.00% BenchmarkAfterFunc 1006 1005 -0.10% Fixes #6632. Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca Reviewed-on: https://go-review.googlesource.com/3166 Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-19 12:59:58 -07:00
}
func() {
tmp = w // force capture of w, but do not write to it yet
_ = tmp
cmd/gc: capture variables by value Language specification says that variables are captured by reference. And that is what gc compiler does. However, in lots of cases it is possible to capture variables by value under the hood without affecting visible behavior of programs. For example, consider the following typical pattern: func (o *Obj) requestMany(urls []string) []Result { wg := new(sync.WaitGroup) wg.Add(len(urls)) res := make([]Result, len(urls)) for i := range urls { i := i go func() { res[i] = o.requestOne(urls[i]) wg.Done() }() } wg.Wait() return res } Currently o, wg, res, and i are captured by reference causing 3+len(urls) allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap). But all of them can be captured by value without changing behavior. This change implements simple strategy for capturing by value: if a captured variable is not addrtaken and never assigned to, then it is captured by value (it is effectively const). This simple strategy turned out to be very effective: ~80% of all captures in std lib are turned into value captures. The remaining 20% are mostly in defers and non-escaping closures, that is, they do not cause allocations anyway. benchmark old allocs new allocs delta BenchmarkCompressedZipGarbage 153 126 -17.65% BenchmarkEncodeDigitsSpeed1e4 91 69 -24.18% BenchmarkEncodeDigitsSpeed1e5 178 129 -27.53% BenchmarkEncodeDigitsSpeed1e6 1510 1051 -30.40% BenchmarkEncodeDigitsDefault1e4 100 75 -25.00% BenchmarkEncodeDigitsDefault1e5 193 139 -27.98% BenchmarkEncodeDigitsDefault1e6 1420 985 -30.63% BenchmarkEncodeDigitsCompress1e4 100 75 -25.00% BenchmarkEncodeDigitsCompress1e5 193 139 -27.98% BenchmarkEncodeDigitsCompress1e6 1420 985 -30.63% BenchmarkEncodeTwainSpeed1e4 109 81 -25.69% BenchmarkEncodeTwainSpeed1e5 211 151 -28.44% BenchmarkEncodeTwainSpeed1e6 1588 1097 -30.92% BenchmarkEncodeTwainDefault1e4 103 77 -25.24% BenchmarkEncodeTwainDefault1e5 199 143 -28.14% BenchmarkEncodeTwainDefault1e6 1324 917 -30.74% BenchmarkEncodeTwainCompress1e4 103 77 -25.24% BenchmarkEncodeTwainCompress1e5 190 137 -27.89% BenchmarkEncodeTwainCompress1e6 1327 919 -30.75% BenchmarkConcurrentDBExec 16223 16220 -0.02% BenchmarkConcurrentStmtQuery 17687 16182 -8.51% BenchmarkConcurrentStmtExec 5191 5186 -0.10% BenchmarkConcurrentTxQuery 17665 17661 -0.02% BenchmarkConcurrentTxExec 15154 15150 -0.03% BenchmarkConcurrentTxStmtQuery 17661 16157 -8.52% BenchmarkConcurrentTxStmtExec 3677 3673 -0.11% BenchmarkConcurrentRandom 14000 13614 -2.76% BenchmarkManyConcurrentQueries 25 22 -12.00% BenchmarkDecodeComplex128Slice 318 252 -20.75% BenchmarkDecodeFloat64Slice 318 252 -20.75% BenchmarkDecodeInt32Slice 318 252 -20.75% BenchmarkDecodeStringSlice 2318 2252 -2.85% BenchmarkDecode 11 8 -27.27% BenchmarkEncodeGray 64 56 -12.50% BenchmarkEncodeNRGBOpaque 64 56 -12.50% BenchmarkEncodeNRGBA 67 58 -13.43% BenchmarkEncodePaletted 68 60 -11.76% BenchmarkEncodeRGBOpaque 64 56 -12.50% BenchmarkGoLookupIP 153 139 -9.15% BenchmarkGoLookupIPNoSuchHost 508 466 -8.27% BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76% BenchmarkClientServer 62 59 -4.84% BenchmarkClientServerParallel4 62 59 -4.84% BenchmarkClientServerParallel64 62 59 -4.84% BenchmarkClientServerParallelTLS4 79 76 -3.80% BenchmarkClientServerParallelTLS64 112 109 -2.68% BenchmarkCreateGoroutinesCapture 10 6 -40.00% BenchmarkAfterFunc 1006 1005 -0.10% Fixes #6632. Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca Reviewed-on: https://go-review.googlesource.com/3166 Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-19 12:59:58 -07:00
func() {
func() {
w++ // write in a nested closure
}()
cmd/gc: capture variables by value Language specification says that variables are captured by reference. And that is what gc compiler does. However, in lots of cases it is possible to capture variables by value under the hood without affecting visible behavior of programs. For example, consider the following typical pattern: func (o *Obj) requestMany(urls []string) []Result { wg := new(sync.WaitGroup) wg.Add(len(urls)) res := make([]Result, len(urls)) for i := range urls { i := i go func() { res[i] = o.requestOne(urls[i]) wg.Done() }() } wg.Wait() return res } Currently o, wg, res, and i are captured by reference causing 3+len(urls) allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap). But all of them can be captured by value without changing behavior. This change implements simple strategy for capturing by value: if a captured variable is not addrtaken and never assigned to, then it is captured by value (it is effectively const). This simple strategy turned out to be very effective: ~80% of all captures in std lib are turned into value captures. The remaining 20% are mostly in defers and non-escaping closures, that is, they do not cause allocations anyway. benchmark old allocs new allocs delta BenchmarkCompressedZipGarbage 153 126 -17.65% BenchmarkEncodeDigitsSpeed1e4 91 69 -24.18% BenchmarkEncodeDigitsSpeed1e5 178 129 -27.53% BenchmarkEncodeDigitsSpeed1e6 1510 1051 -30.40% BenchmarkEncodeDigitsDefault1e4 100 75 -25.00% BenchmarkEncodeDigitsDefault1e5 193 139 -27.98% BenchmarkEncodeDigitsDefault1e6 1420 985 -30.63% BenchmarkEncodeDigitsCompress1e4 100 75 -25.00% BenchmarkEncodeDigitsCompress1e5 193 139 -27.98% BenchmarkEncodeDigitsCompress1e6 1420 985 -30.63% BenchmarkEncodeTwainSpeed1e4 109 81 -25.69% BenchmarkEncodeTwainSpeed1e5 211 151 -28.44% BenchmarkEncodeTwainSpeed1e6 1588 1097 -30.92% BenchmarkEncodeTwainDefault1e4 103 77 -25.24% BenchmarkEncodeTwainDefault1e5 199 143 -28.14% BenchmarkEncodeTwainDefault1e6 1324 917 -30.74% BenchmarkEncodeTwainCompress1e4 103 77 -25.24% BenchmarkEncodeTwainCompress1e5 190 137 -27.89% BenchmarkEncodeTwainCompress1e6 1327 919 -30.75% BenchmarkConcurrentDBExec 16223 16220 -0.02% BenchmarkConcurrentStmtQuery 17687 16182 -8.51% BenchmarkConcurrentStmtExec 5191 5186 -0.10% BenchmarkConcurrentTxQuery 17665 17661 -0.02% BenchmarkConcurrentTxExec 15154 15150 -0.03% BenchmarkConcurrentTxStmtQuery 17661 16157 -8.52% BenchmarkConcurrentTxStmtExec 3677 3673 -0.11% BenchmarkConcurrentRandom 14000 13614 -2.76% BenchmarkManyConcurrentQueries 25 22 -12.00% BenchmarkDecodeComplex128Slice 318 252 -20.75% BenchmarkDecodeFloat64Slice 318 252 -20.75% BenchmarkDecodeInt32Slice 318 252 -20.75% BenchmarkDecodeStringSlice 2318 2252 -2.85% BenchmarkDecode 11 8 -27.27% BenchmarkEncodeGray 64 56 -12.50% BenchmarkEncodeNRGBOpaque 64 56 -12.50% BenchmarkEncodeNRGBA 67 58 -13.43% BenchmarkEncodePaletted 68 60 -11.76% BenchmarkEncodeRGBOpaque 64 56 -12.50% BenchmarkGoLookupIP 153 139 -9.15% BenchmarkGoLookupIPNoSuchHost 508 466 -8.27% BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76% BenchmarkClientServer 62 59 -4.84% BenchmarkClientServerParallel4 62 59 -4.84% BenchmarkClientServerParallel64 62 59 -4.84% BenchmarkClientServerParallelTLS4 79 76 -3.80% BenchmarkClientServerParallelTLS64 112 109 -2.68% BenchmarkCreateGoroutinesCapture 10 6 -40.00% BenchmarkAfterFunc 1006 1005 -0.10% Fixes #6632. Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca Reviewed-on: https://go-review.googlesource.com/3166 Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-19 12:59:58 -07:00
}()
}()
f()
}
{
var g func() int
cmd/compile: experimental loop iterator capture semantics change Adds: GOEXPERIMENT=loopvar (expected way of invoking) -d=loopvar={-1,0,1,2,11,12} (for per-package control and/or logging) -d=loopvarhash=... (for hash debugging) loopvar=11,12 are for testing, benchmarking, and debugging. If enabled,for loops of the form `for x,y := range thing`, if x and/or y are addressed or captured by a closure, are transformed by renaming x/y to a temporary and prepending an assignment to the body of the loop x := tmp_x. This changes the loop semantics by making each iteration's instance of x be distinct from the others (currently they are all aliased, and when this matters, it is almost always a bug). 3-range with captured iteration variables are also transformed, though it is a more complex transformation. "Optimized" to do a simpler transformation for 3-clause for where the increment is empty. (Prior optimization of address-taking under Return disabled, because it was incorrect; returns can have loops for children. Restored in a later CL.) Includes support for -d=loopvarhash=<binary string> intended for use with hash search and GOCOMPILEDEBUG=loopvarhash=<binary string> (use `gossahash -e loopvarhash command-that-fails`). Minor feature upgrades to hash-triggered features; clients can specify that file-position hashes use only the most-inline position, and/or that they use only the basenames of source files (not the full directory path). Most-inlined is the right choice for debugging loop-iteration change once the semantics are linked to the package across inlining; basename-only makes it tractable to write tests (which, otherwise, depend on the full pathname of the source file and thus vary). Updates #57969. Change-Id: I180a51a3f8d4173f6210c861f10de23de8a1b1db Reviewed-on: https://go-review.googlesource.com/c/go/+/411904 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-06-12 13:33:57 -06:00
var i int
for i = range [2]int{} {
if i == 0 {
g = func() int {
return i // test that we capture by ref here, i is mutated on every interaction
}
}
}
if g() != 1 {
panic("g() != 1")
}
}
{
var g func() int
q := 0
for range [2]int{} {
q++
g = func() int {
return q // test that we capture by ref here
// q++ must on a different decldepth than q declaration
}
}
if g() != 2 {
panic("g() != 2")
}
}
{
var g func() int
var a [2]int
q := 0
for a[func() int {
q++
return 0
}()] = range [2]int{} {
g = func() int {
return q // test that we capture by ref here
// q++ must on a different decldepth than q declaration
}
}
if g() != 2 {
panic("g() != 2")
}
}
{
var g func() int
q := 0
q, g = 1, func() int { return q }
if never {
g = func() int { return 2 }
}
if g() != 1 {
panic("g() != 1")
}
}
cmd/gc: capture variables by value Language specification says that variables are captured by reference. And that is what gc compiler does. However, in lots of cases it is possible to capture variables by value under the hood without affecting visible behavior of programs. For example, consider the following typical pattern: func (o *Obj) requestMany(urls []string) []Result { wg := new(sync.WaitGroup) wg.Add(len(urls)) res := make([]Result, len(urls)) for i := range urls { i := i go func() { res[i] = o.requestOne(urls[i]) wg.Done() }() } wg.Wait() return res } Currently o, wg, res, and i are captured by reference causing 3+len(urls) allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap). But all of them can be captured by value without changing behavior. This change implements simple strategy for capturing by value: if a captured variable is not addrtaken and never assigned to, then it is captured by value (it is effectively const). This simple strategy turned out to be very effective: ~80% of all captures in std lib are turned into value captures. The remaining 20% are mostly in defers and non-escaping closures, that is, they do not cause allocations anyway. benchmark old allocs new allocs delta BenchmarkCompressedZipGarbage 153 126 -17.65% BenchmarkEncodeDigitsSpeed1e4 91 69 -24.18% BenchmarkEncodeDigitsSpeed1e5 178 129 -27.53% BenchmarkEncodeDigitsSpeed1e6 1510 1051 -30.40% BenchmarkEncodeDigitsDefault1e4 100 75 -25.00% BenchmarkEncodeDigitsDefault1e5 193 139 -27.98% BenchmarkEncodeDigitsDefault1e6 1420 985 -30.63% BenchmarkEncodeDigitsCompress1e4 100 75 -25.00% BenchmarkEncodeDigitsCompress1e5 193 139 -27.98% BenchmarkEncodeDigitsCompress1e6 1420 985 -30.63% BenchmarkEncodeTwainSpeed1e4 109 81 -25.69% BenchmarkEncodeTwainSpeed1e5 211 151 -28.44% BenchmarkEncodeTwainSpeed1e6 1588 1097 -30.92% BenchmarkEncodeTwainDefault1e4 103 77 -25.24% BenchmarkEncodeTwainDefault1e5 199 143 -28.14% BenchmarkEncodeTwainDefault1e6 1324 917 -30.74% BenchmarkEncodeTwainCompress1e4 103 77 -25.24% BenchmarkEncodeTwainCompress1e5 190 137 -27.89% BenchmarkEncodeTwainCompress1e6 1327 919 -30.75% BenchmarkConcurrentDBExec 16223 16220 -0.02% BenchmarkConcurrentStmtQuery 17687 16182 -8.51% BenchmarkConcurrentStmtExec 5191 5186 -0.10% BenchmarkConcurrentTxQuery 17665 17661 -0.02% BenchmarkConcurrentTxExec 15154 15150 -0.03% BenchmarkConcurrentTxStmtQuery 17661 16157 -8.52% BenchmarkConcurrentTxStmtExec 3677 3673 -0.11% BenchmarkConcurrentRandom 14000 13614 -2.76% BenchmarkManyConcurrentQueries 25 22 -12.00% BenchmarkDecodeComplex128Slice 318 252 -20.75% BenchmarkDecodeFloat64Slice 318 252 -20.75% BenchmarkDecodeInt32Slice 318 252 -20.75% BenchmarkDecodeStringSlice 2318 2252 -2.85% BenchmarkDecode 11 8 -27.27% BenchmarkEncodeGray 64 56 -12.50% BenchmarkEncodeNRGBOpaque 64 56 -12.50% BenchmarkEncodeNRGBA 67 58 -13.43% BenchmarkEncodePaletted 68 60 -11.76% BenchmarkEncodeRGBOpaque 64 56 -12.50% BenchmarkGoLookupIP 153 139 -9.15% BenchmarkGoLookupIPNoSuchHost 508 466 -8.27% BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76% BenchmarkClientServer 62 59 -4.84% BenchmarkClientServerParallel4 62 59 -4.84% BenchmarkClientServerParallel64 62 59 -4.84% BenchmarkClientServerParallelTLS4 79 76 -3.80% BenchmarkClientServerParallelTLS64 112 109 -2.68% BenchmarkCreateGoroutinesCapture 10 6 -40.00% BenchmarkAfterFunc 1006 1005 -0.10% Fixes #6632. Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca Reviewed-on: https://go-review.googlesource.com/3166 Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-19 12:59:58 -07:00
}