Currently we always create context objects for closures that capture variables.
However, it is completely unnecessary for direct calls of closures
(whether it is func()(), defer func()() or go func()()).
This change transforms any OCALLFUNC(OCLOSURE) to normal function call.
Closed variables become function arguments.
This transformation is especially beneficial for go func(),
because we do not need to allocate context object on heap.
But it makes direct closure calls a bit faster as well (see BenchmarkClosureCall).
On implementation level it required to introduce yet another compiler pass.
However, the pass iterates only over xtop, so it should not be an issue.
Transformation consists of two parts: closure transformation and call site
transformation. We can't run these parts on different sides of escape analysis,
because tree state is inconsistent. We can do both parts during typecheck,
we don't know how to capture variables and don't have call site.
We can't do both parts during walk of OCALLFUNC, because we can walk
OCLOSURE body earlier.
So now capturevars pass only decides how to capture variables
(this info is required for escape analysis). New transformclosure
pass, that runs just before order/walk, does all transformations
of a closure. And later walk of OCALLFUNC(OCLOSURE) transforms call site.
benchmark old ns/op new ns/op delta
BenchmarkClosureCall 4.89 3.09 -36.81%
BenchmarkCreateGoroutinesCapture 1634 1294 -20.81%
benchmark old allocs new allocs delta
BenchmarkCreateGoroutinesCapture 6 2 -66.67%
benchmark old bytes new bytes delta
BenchmarkCreateGoroutinesCapture 176 48 -72.73%
Change-Id: Ic85e1706e18c3235cc45b3c0c031a9c1cdb7a40e
Reviewed-on: https://go-review.googlesource.com/4050
Reviewed-by: Russ Cox <rsc@golang.org>
Consider an interface value i of type I and concrete value c of type C.
Prior to this CL, i==c was evaluated as
I(c) == i
Evaluating I(c) can allocate.
This CL changes the evaluation of i==c to
x, ok := i.(C); ok && x == c
The new generated code is shorter and does not allocate directly.
If C is small, as it is in every instance in the stdlib,
the new code also uses less stack space
and makes one runtime call instead of two.
If C is very large, the original implementation is used.
The cutoff for "very large" is 1<<16,
following the stack vs heap cutoff used elsewhere.
This kind of comparison occurs in 38 places in the stdlib,
mostly in the net and os packages.
benchmark old ns/op new ns/op delta
BenchmarkEqEfaceConcrete 29.5 7.92 -73.15%
BenchmarkEqIfaceConcrete 32.1 7.90 -75.39%
BenchmarkNeEfaceConcrete 29.9 7.90 -73.58%
BenchmarkNeIfaceConcrete 35.9 7.90 -77.99%
Fixes#9370.
Change-Id: I7c4555950bcd6406ee5c613be1f2128da2c9a2b7
Reviewed-on: https://go-review.googlesource.com/2096
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Support the following conversions in escape analysis:
[]rune("foo")
[]byte("foo")
string([]rune{})
If the result does not escape, allocate temp buffer on stack
and pass it to runtime functions.
Change-Id: I1d075907eab8b0109ad7ad1878104b02b3d5c690
Reviewed-on: https://go-review.googlesource.com/3590
Reviewed-by: Russ Cox <rsc@golang.org>
Ordinary switch statements are rewritten
into a sequence of if statements.
Staticly dead cases were not being eliminated
because the rewrite introduced a temporary,
which hid the fact that the case was a constant.
Stop doing that.
This eliminates dead code in the standard library at:
runtime/cgocall.go:219
runtime/cgocall.go:269
debug/gosym/pclntab.go:175
debug/macho/file.go:208
math/big/nat.go:635
math/big/nat.go:850
math/big/nat.go:1058
cmd/pprof/internal/commands/commands.go:86
net/sock_bsd.go:19
cmd/go/build.go:2657
cmd/go/env.go:90
Fixes#9608.
Change-Id: Ic23a05dfbb1ad91d5f62a6506b35a13e51b33e38
Reviewed-on: https://go-review.googlesource.com/3980
Reviewed-by: Keith Randall <khr@golang.org>
Only documentation / comment changes. Update references to
point to golang.org permalinks or go.googlesource.com/go.
References in historical release notes under doc are left as is.
Change-Id: Icfc14e4998723e2c2d48f9877a91c5abef6794ea
Reviewed-on: https://go-review.googlesource.com/4060
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The main issue is that the misc/cgo/{stdio,life} tests are silently
getting skipped when invoked from run.bash.
run.go should ignore any build tags after the first blank line in
source file. It already checks for test actions only upto the first
blank line. Build tags must be specified in the same block.
See http://golang.org/cl/3675 for background.
Change-Id: Id8abf000119e3335f7250d8ef34aac7811fc9dff
Reviewed-on: https://go-review.googlesource.com/3812
Reviewed-by: Minux Ma <minux@golang.org>
issue9355 generated a file a.[568] in test/ directory and left it there.
For tests like these, it is best to chdir to a test specific directory
before generating any temporary files, since the tests are running
in parallel and might otherwise race with each other for the same files.
Change-Id: I58d96256d4d8ee3fda70d81077f19006064a7425
Reviewed-on: https://go-review.googlesource.com/3813
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Kindly detected by race builders by failing TestRaceRange.
ORANGE typecheck does not increment decldepth around body.
Change-Id: I0df5f310cb3370a904c94d9647a9cf0f15729075
Reviewed-on: https://go-review.googlesource.com/3507
Reviewed-by: Russ Cox <rsc@golang.org>
Type switch variables was not typechecked.
Previously it lead only to a minor consequence:
switch unsafe.Sizeof = x.(type) {
generated an inconsistent error message.
But capturing by value functionality now requries typechecking of all ONAMEs.
Fixes#9731
Change-Id: If037883cba53d85028fb97b1328696091b3b7ddd
Reviewed-on: https://go-review.googlesource.com/3600
Reviewed-by: Russ Cox <rsc@golang.org>
This adds a "framepointer" GOEXPERIMENT that that makes the amd64
toolchain maintain base pointer chains in the same way that gcc
-fno-omit-frame-pointer does. Go doesn't use these saved base
pointers, but this does enable external tools like Linux perf and
VTune to unwind Go stacks when collecting system-wide profiles.
This requires support in the compilers to not clobber BP, support in
liblink for generating the BP-saving function prologue and unwinding
epilogue, and support in the runtime to save BPs across preemption, to
skip saved BPs during stack unwinding and, and to adjust saved BPs
during stack moving.
As with other GOEXPERIMENTs, everything from the toolchain to the
runtime must be compiled with this experiment enabled. To do this,
run make.bash (or all.bash) with GOEXPERIMENT=framepointer.
Change-Id: I4024853beefb9539949e5ca381adfdd9cfada544
Reviewed-on: https://go-review.googlesource.com/2992
Reviewed-by: Russ Cox <rsc@golang.org>
cmd/go doesn't complain (this is an open issue), but go/types does
Change-Id: I2caec1f7aec991a9500d2c3504c29e4ab718c138
Reviewed-on: https://go-review.googlesource.com/3541
Reviewed-by: Alan Donovan <adonovan@google.com>
Language specification says that variables are captured by reference.
And that is what gc compiler does. However, in lots of cases it is
possible to capture variables by value under the hood without
affecting visible behavior of programs. For example, consider
the following typical pattern:
func (o *Obj) requestMany(urls []string) []Result {
wg := new(sync.WaitGroup)
wg.Add(len(urls))
res := make([]Result, len(urls))
for i := range urls {
i := i
go func() {
res[i] = o.requestOne(urls[i])
wg.Done()
}()
}
wg.Wait()
return res
}
Currently o, wg, res, and i are captured by reference causing 3+len(urls)
allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap).
But all of them can be captured by value without changing behavior.
This change implements simple strategy for capturing by value:
if a captured variable is not addrtaken and never assigned to,
then it is captured by value (it is effectively const).
This simple strategy turned out to be very effective:
~80% of all captures in std lib are turned into value captures.
The remaining 20% are mostly in defers and non-escaping closures,
that is, they do not cause allocations anyway.
benchmark old allocs new allocs delta
BenchmarkCompressedZipGarbage 153 126 -17.65%
BenchmarkEncodeDigitsSpeed1e4 91 69 -24.18%
BenchmarkEncodeDigitsSpeed1e5 178 129 -27.53%
BenchmarkEncodeDigitsSpeed1e6 1510 1051 -30.40%
BenchmarkEncodeDigitsDefault1e4 100 75 -25.00%
BenchmarkEncodeDigitsDefault1e5 193 139 -27.98%
BenchmarkEncodeDigitsDefault1e6 1420 985 -30.63%
BenchmarkEncodeDigitsCompress1e4 100 75 -25.00%
BenchmarkEncodeDigitsCompress1e5 193 139 -27.98%
BenchmarkEncodeDigitsCompress1e6 1420 985 -30.63%
BenchmarkEncodeTwainSpeed1e4 109 81 -25.69%
BenchmarkEncodeTwainSpeed1e5 211 151 -28.44%
BenchmarkEncodeTwainSpeed1e6 1588 1097 -30.92%
BenchmarkEncodeTwainDefault1e4 103 77 -25.24%
BenchmarkEncodeTwainDefault1e5 199 143 -28.14%
BenchmarkEncodeTwainDefault1e6 1324 917 -30.74%
BenchmarkEncodeTwainCompress1e4 103 77 -25.24%
BenchmarkEncodeTwainCompress1e5 190 137 -27.89%
BenchmarkEncodeTwainCompress1e6 1327 919 -30.75%
BenchmarkConcurrentDBExec 16223 16220 -0.02%
BenchmarkConcurrentStmtQuery 17687 16182 -8.51%
BenchmarkConcurrentStmtExec 5191 5186 -0.10%
BenchmarkConcurrentTxQuery 17665 17661 -0.02%
BenchmarkConcurrentTxExec 15154 15150 -0.03%
BenchmarkConcurrentTxStmtQuery 17661 16157 -8.52%
BenchmarkConcurrentTxStmtExec 3677 3673 -0.11%
BenchmarkConcurrentRandom 14000 13614 -2.76%
BenchmarkManyConcurrentQueries 25 22 -12.00%
BenchmarkDecodeComplex128Slice 318 252 -20.75%
BenchmarkDecodeFloat64Slice 318 252 -20.75%
BenchmarkDecodeInt32Slice 318 252 -20.75%
BenchmarkDecodeStringSlice 2318 2252 -2.85%
BenchmarkDecode 11 8 -27.27%
BenchmarkEncodeGray 64 56 -12.50%
BenchmarkEncodeNRGBOpaque 64 56 -12.50%
BenchmarkEncodeNRGBA 67 58 -13.43%
BenchmarkEncodePaletted 68 60 -11.76%
BenchmarkEncodeRGBOpaque 64 56 -12.50%
BenchmarkGoLookupIP 153 139 -9.15%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76%
BenchmarkClientServer 62 59 -4.84%
BenchmarkClientServerParallel4 62 59 -4.84%
BenchmarkClientServerParallel64 62 59 -4.84%
BenchmarkClientServerParallelTLS4 79 76 -3.80%
BenchmarkClientServerParallelTLS64 112 109 -2.68%
BenchmarkCreateGoroutinesCapture 10 6 -40.00%
BenchmarkAfterFunc 1006 1005 -0.10%
Fixes#6632.
Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca
Reviewed-on: https://go-review.googlesource.com/3166
Reviewed-by: Russ Cox <rsc@golang.org>
If result of string(i) does not escape,
allocate a [4]byte temp on stack for it.
Change-Id: If31ce9447982929d5b3b963fd0830efae4247c37
Reviewed-on: https://go-review.googlesource.com/3411
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we always allocate string buffers in heap.
For example, in the following code we allocate a temp string
just for comparison:
if string(byteSlice) == "abc" { ... }
This change extends escape analysis to cover []byte->string
conversions and string concatenation. If the result of operations
does not escape, compiler allocates a small buffer
on stack and passes it to slicebytetostring and concatstrings.
Then runtime uses the buffer if the result fits into it.
Size of the buffer is 32 bytes. There is no fundamental theory
behind this number. Just an observation that on std lib
tests/benchmarks frequency of string allocation is inversely
proportional to string length; and there is significant number
of allocations up to length 32.
benchmark old allocs new allocs delta
BenchmarkFprintfBytes 2 1 -50.00%
BenchmarkDecodeComplex128Slice 318 316 -0.63%
BenchmarkDecodeFloat64Slice 318 316 -0.63%
BenchmarkDecodeInt32Slice 318 316 -0.63%
BenchmarkDecodeStringSlice 2318 2316 -0.09%
BenchmarkStripTags 11 5 -54.55%
BenchmarkDecodeGray 111 102 -8.11%
BenchmarkDecodeNRGBAGradient 200 188 -6.00%
BenchmarkDecodeNRGBAOpaque 165 152 -7.88%
BenchmarkDecodePaletted 319 309 -3.13%
BenchmarkDecodeRGB 166 157 -5.42%
BenchmarkDecodeInterlacing 279 268 -3.94%
BenchmarkGoLookupIP 153 135 -11.76%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76%
BenchmarkClientServerParallel4 62 61 -1.61%
BenchmarkClientServerParallel64 62 61 -1.61%
BenchmarkClientServerParallelTLS4 79 78 -1.27%
BenchmarkClientServerParallelTLS64 112 111 -0.89%
benchmark old ns/op new ns/op delta
BenchmarkFprintfBytes 381 311 -18.37%
BenchmarkStripTags 2615 2351 -10.10%
BenchmarkDecodeNRGBAGradient 3715887 3635096 -2.17%
BenchmarkDecodeNRGBAOpaque 3047645 2928644 -3.90%
BenchmarkGoLookupIP 153 135 -11.76%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
Change-Id: I9ec01da816945c3329d7be3c7794b520418c3f99
Reviewed-on: https://go-review.googlesource.com/3120
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Escape analysis treats everything assigned to OIND/ODOTPTR as escaping.
As the result b escapes in the following code:
func (b *Buffer) Foo() {
n, m := ...
b.buf = b.buf[n:m]
}
This change recognizes such assignments and ignores them.
Update issue #9043.
Update issue #7921.
There are two similar cases in std lib that benefit from this optimization.
First is in archive/zip:
type readBuf []byte
func (b *readBuf) uint32() uint32 {
v := binary.LittleEndian.Uint32(*b)
*b = (*b)[4:]
return v
}
Second is in time:
type data struct {
p []byte
error bool
}
func (d *data) read(n int) []byte {
if len(d.p) < n {
d.p = nil
d.error = true
return nil
}
p := d.p[0:n]
d.p = d.p[n:]
return p
}
benchmark old ns/op new ns/op delta
BenchmarkCompressedZipGarbage 32431724 32217851 -0.66%
benchmark old allocs new allocs delta
BenchmarkCompressedZipGarbage 153 143 -6.54%
Change-Id: Ia6cd32744e02e36d6d8c19f402f8451101711626
Reviewed-on: https://go-review.googlesource.com/3162
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently all PTRLIT element initializers escape. There is no reason for that.
This change links STRUCTLIT to PTRLIT; STRUCTLIT element initializers are
already linked to the STRUCTLIT. As the result, PTRLIT element initializers
escape when PTRLIT itself escapes.
Change-Id: I89ecd8677cbf81addcfd469cd2fd461c0e9bf7dd
Reviewed-on: https://go-review.googlesource.com/3031
Reviewed-by: Russ Cox <rsc@golang.org>
The compiler has a phase ordering problem. Escape analysis runs
before wrapper generation. When a generated wrapper calls a method
defined in a different package, if that call is inlined, there will be
no escape information for the variables defined in the inlined call.
Those variables will be placed on the stack, which fails if they
actually do escape.
There are probably various complex ways to fix this. This is a simple
way to avoid it: when a generated wrapper calls a method defined in a
different package, treat all local variables as escaping.
Fixes#9537.
Change-Id: I530f39346de16ad173371c6c3f69cc189351a4e9
Reviewed-on: https://go-review.googlesource.com/3092
Reviewed-by: Russ Cox <rsc@golang.org>
We were failing ^uint16(0xffff) == 0, as we computed 0xffff0000 instead.
I could only trigger a failure for the above case, the other two tests
^uint16(0xfffe) == 1 and -uint16(0xffff) == 1 didn't seem to fail
previously. Somehow they get MOVHUs inserted for other reasons (used
by CMP instead of TST?). I fixed OMINUS anyway, better safe than
sorry.
Fixes#9604
Change-Id: I4c2d5bdc667742873ac029fdbe3db0cf12893c27
Reviewed-on: https://go-review.googlesource.com/2940
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
The various files are confusingly named and their operation
not easy to see. Add a comment to cmplxdivide.c, one of the few
C files that will endure in the repository, to explain how to build
and run the test.
Change-Id: I1fd5c564a14217e1b9815b09bc24cc43c54c096f
Reviewed-on: https://go-review.googlesource.com/2850
Reviewed-by: Russ Cox <rsc@golang.org>
Recognize loops of the form
for i := range a {
a[i] = zero
}
in which the evaluation of a is free from side effects.
Replace these loops with calls to memclr.
This occurs in the stdlib in 18 places.
The motivating example is clearing a byte slice:
benchmark old ns/op new ns/op delta
BenchmarkGoMemclr5 3.31 3.26 -1.51%
BenchmarkGoMemclr16 13.7 3.28 -76.06%
BenchmarkGoMemclr64 50.8 4.14 -91.85%
BenchmarkGoMemclr256 157 6.02 -96.17%
Update #5373.
Change-Id: I99d3e6f5f268e8c6499b7e661df46403e5eb83e4
Reviewed-on: https://go-review.googlesource.com/2520
Reviewed-by: Keith Randall <khr@golang.org>
run GC in its own background goroutine making the
caller runnable if resources are available. This is
critical in single goroutine applications.
Allow goroutines that allocate a lot to help out
the GC and in doing so throttle their own allocation.
Adjust test so that it only detects that a GC is run
during init calls and not whether the GC is memory
efficient. Memory efficiency work will happen later
in 1.5.
Change-Id: I4306f5e377bb47c69bda1aedba66164f12b20c2b
Reviewed-on: https://go-review.googlesource.com/2349
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This test was taking a long time, reduce its zealousness.
Change-Id: Ib824247b84b0039a9ec690f72336bef3738d4c44
Reviewed-on: https://go-review.googlesource.com/2502
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Minux Ma <minux@golang.org>
The compiler converts 'val, ok = m[key]' to
tmp, ok = <runtime call>
val = *tmp
For lookups of the form '_, ok = m[key]',
the second statement is unnecessary.
By not generating it we save a nil check.
Change-Id: I21346cc195cb3c62e041af8b18770c0940358695
Reviewed-on: https://go-review.googlesource.com/1975
Reviewed-by: Russ Cox <rsc@golang.org>
If the user provided a key but no value via -ldflag -X,
another linker flag was used as the value.
Placing the user's flags at the end avoids this problem.
It also provides the user the opportunity to
override existing linker flags.
Fixes#8810.
Change-Id: I96f4190713dc9a9c29142e56658446fba7fb6bc8
Reviewed-on: https://go-review.googlesource.com/2242
Reviewed-by: Minux Ma <minux@golang.org>
These tests were enabled as part of change 1774.
They depend on the errchk tool, which is a Perl
script. However, Perl is not available on Plan 9.
Change-Id: I82707aae16013acc9a3800d39b0084588b852b53
Reviewed-on: https://go-review.googlesource.com/2031
Reviewed-by: Minux Ma <minux@golang.org>
Broken by e7173dfdfd
Fix by simply disabling the relevant tests.
* bug248 and bug345 require errchk, but we can't
rely on perl being available.
* bug369 is disabled anyway.
Change-Id: Idf73ebccb066943e3fe17c2f662b37238ec74dfe
Reviewed-on: https://go-review.googlesource.com/2052
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Gccgo creates a struct to hold the arguments for the deferred
function. In this example the struct holds a type defined in a
different package. The bug was that gccgo tried to create an equality
function for this struct, and it implemented that function by calling
the equality function for the type defined in the other package.
Since that type is not exported, the reference to the equality
function failed at link time. Normally it is impossible for a struct
to directly contain a member that is an unexported type from another
package, but in this specific case it was possible. Fixed in gccgo
with https://codereview.appspot.com/183500043 .
Change-Id: I8ec3a33631225b9ac2a4ac060cb4d10b4635e60b
Reviewed-on: https://go-review.googlesource.com/1690
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
* bug248, bug345, bug369, and bug429 were ported from bash commands to run scripts. bug369 remains disabled.
* bug395 is a test for issue 1909, which is still open. It is marked as skip now and will be usable with compile with run.go when issue 1909 is fixed.
Fixes#4139
Updates #1909
Change-Id: Ibb5fbfb5cf72ddc285829245318eeacd3fb5a636
Reviewed-on: https://go-review.googlesource.com/1774
Reviewed-by: Russ Cox <rsc@golang.org>
When we do y = &x for global variables x and y, y gets initialized
at link time. Do the same for y = &x.f if x is a struct and y=&x[5]
if x is an array.
fixes#9217fixes#9355
Change-Id: Iea3c0ce2ce1b309e2b760e345608fd95460b5713
Reviewed-on: https://go-review.googlesource.com/1691
Reviewed-by: Minux Ma <minux@golang.org>
This test was added in CL 151000043.
It got lost in CL 144630044.
Change-Id: I318ab11be8e3e7489fc1395457c029c8bdb2aa41
Reviewed-on: https://go-review.googlesource.com/1773
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Gccgo incorrectly executed functions multiple times when they appeared
in a composite literal that required a conversion between different
interface types.
Change-Id: I7b40e76ed23fa8440ffa03b262041265c109adf7
Reviewed-on: https://go-review.googlesource.com/1710
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Gccgo failed to create the type descriptor for the type used to
allocate the nil value passed to append as the second argument when
append is called with only one argument. Calling append with only one
argument is unusual but obviously should not cause a compiler crash.
Change-Id: I530821847dfd68f0302de6ca6a84dfbc79653935
Reviewed-on: https://go-review.googlesource.com/1692
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
It shouldn't semacquire() inside an acquirem(), the runtime
thinks that means deadlock. It actually isn't a deadlock, but it
looks like it because acquirem() does m.locks++.
Candidate for inclusion in 1.4.1. runtime.Stack with all=true
is pretty unuseable in GOMAXPROCS>1 environment.
fixes#9321
Change-Id: Iac6b664217d24763b9878c20e49229a1ecffc805
Reviewed-on: https://go-review.googlesource.com/1600
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Make sure dequeueing from a channel queue does not exhibit quadratic time behavior.
Change-Id: Ifb7c709b026f74c7e783610d4914dd92909a441b
Reviewed-on: https://go-review.googlesource.com/1212
Reviewed-by: Russ Cox <rsc@golang.org>