The code generated for x = append(x, v) is roughly:
t := x
if len(t)+1 > cap(t) {
t = grow(t)
}
t[len(t)] = v
len(t)++
x = t
We used to generate this code as Go pseudocode during walk.
Generate it instead as actual instructions during gen.
Doing so lets us apply a few optimizations. The most important
is that when, as in the above example, the source slice and the
destination slice are the same, the code can instead do:
t := x
if len(t)+1 > cap(t) {
t = grow(t)
x = {base(t), len(t)+1, cap(t)}
} else {
len(x)++
}
t[len(t)] = v
That is, in the fast path that does not reallocate the array,
only the updated length needs to be written back to x,
not the array pointer and not the capacity. This is more like
what you'd write by hand in C. It's faster in general, since
the fast path elides two of the three stores, but it's especially
faster when the form of x is such that the base pointer write
would turn into a write barrier. No write, no barrier.
name old mean new mean delta
BinaryTree17 5.68s × (0.97,1.04) 5.81s × (0.98,1.03) +2.35% (p=0.023)
Fannkuch11 4.41s × (0.98,1.03) 4.35s × (1.00,1.00) ~ (p=0.090)
FmtFprintfEmpty 92.7ns × (0.91,1.16) 86.0ns × (0.94,1.11) -7.31% (p=0.038)
FmtFprintfString 281ns × (0.96,1.08) 276ns × (0.98,1.04) ~ (p=0.219)
FmtFprintfInt 288ns × (0.97,1.06) 274ns × (0.98,1.06) -4.94% (p=0.002)
FmtFprintfIntInt 493ns × (0.97,1.04) 506ns × (0.99,1.01) +2.65% (p=0.009)
FmtFprintfPrefixedInt 423ns × (0.97,1.04) 391ns × (0.99,1.01) -7.52% (p=0.000)
FmtFprintfFloat 598ns × (0.99,1.01) 566ns × (0.99,1.01) -5.27% (p=0.000)
FmtManyArgs 1.89µs × (0.98,1.05) 1.91µs × (0.99,1.01) ~ (p=0.231)
GobDecode 14.8ms × (0.98,1.03) 15.3ms × (0.99,1.02) +3.01% (p=0.000)
GobEncode 12.3ms × (0.98,1.01) 11.5ms × (0.97,1.03) -5.93% (p=0.000)
Gzip 656ms × (0.99,1.05) 645ms × (0.99,1.01) ~ (p=0.055)
Gunzip 142ms × (1.00,1.00) 142ms × (1.00,1.00) -0.32% (p=0.034)
HTTPClientServer 91.2µs × (0.97,1.04) 90.5µs × (0.97,1.04) ~ (p=0.468)
JSONEncode 32.6ms × (0.97,1.08) 32.0ms × (0.98,1.03) ~ (p=0.190)
JSONDecode 114ms × (0.97,1.05) 114ms × (0.99,1.01) ~ (p=0.887)
Mandelbrot200 6.11ms × (0.98,1.04) 6.04ms × (1.00,1.01) ~ (p=0.167)
GoParse 6.66ms × (0.97,1.04) 6.47ms × (0.97,1.05) -2.81% (p=0.014)
RegexpMatchEasy0_32 159ns × (0.99,1.00) 171ns × (0.93,1.07) +7.19% (p=0.002)
RegexpMatchEasy0_1K 538ns × (1.00,1.01) 550ns × (0.98,1.01) +2.30% (p=0.000)
RegexpMatchEasy1_32 138ns × (1.00,1.00) 135ns × (0.99,1.02) -1.60% (p=0.000)
RegexpMatchEasy1_1K 869ns × (0.99,1.01) 879ns × (1.00,1.01) +1.08% (p=0.000)
RegexpMatchMedium_32 252ns × (0.99,1.01) 243ns × (1.00,1.00) -3.71% (p=0.000)
RegexpMatchMedium_1K 72.7µs × (1.00,1.00) 70.3µs × (1.00,1.00) -3.34% (p=0.000)
RegexpMatchHard_32 3.85µs × (1.00,1.00) 3.82µs × (1.00,1.01) -0.81% (p=0.000)
RegexpMatchHard_1K 118µs × (1.00,1.00) 117µs × (1.00,1.00) -0.56% (p=0.000)
Revcomp 920ms × (0.97,1.07) 917ms × (0.97,1.04) ~ (p=0.808)
Template 129ms × (0.98,1.03) 114ms × (0.99,1.01) -12.06% (p=0.000)
TimeParse 619ns × (0.99,1.01) 622ns × (0.99,1.01) ~ (p=0.062)
TimeFormat 661ns × (0.98,1.04) 665ns × (0.99,1.01) ~ (p=0.524)
See next CL for combination with a similar optimization for slice.
The benchmarks that are slower in this CL are still faster overall
with the combination of the two.
Change-Id: I2a7421658091b2488c64741b4db15ab6c3b4cb7e
Reviewed-on: https://go-review.googlesource.com/9812
Reviewed-by: David Chase <drchase@google.com>
Today's earlier fix can stay, but it's a band-aid over the real problem,
which is that bad code was slipping through the type checker
into the back end (and luckily causing a type error there).
I discovered this because my new append does not use the same
temporaries and failed the test as written.
Fixes#9521.
Change-Id: I7e33e2ea15743406e15c6f3fdf73e1edecda69bd
Reviewed-on: https://go-review.googlesource.com/9921
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Instead of errors like:
./blank2.go:15: cannot use ~b1 (type []int) as type int in assignment
we now have:
./blank2.go:15: cannot use _ (type []int) as type int in assignment
Less confusing for users.
Fixes#9521
Change-Id: Ieab9859040e8e0df95deeaee7eeb408d3be61c0f
Reviewed-on: https://go-review.googlesource.com/9902
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Try to provide hints for common areas, either *interface
were interface would have been better, and note incorrect
capitalization (but don't be more ambitious than that, at
least not today).
Added code and test for cases
ptrInterface.ExistingMethod
ptrInterface.unexportedMethod
ptrInterface.MissingMethod
ptrInterface.withwRongcASEdMethod
interface.withwRongcASEdMethod
ptrStruct.withwRongcASEdMethod
struct.withwRongcASEdMethod
also included tests for related errors to check for
unintentional changes and consistent wording.
Somewhat simplified from previous versions to avoid second-
guessing user errors, yet also biased to point out most-likely
root cause.
Fixes#10700
Change-Id: I16693e93cc8d8ca195e7742a222d640c262105b4
Reviewed-on: https://go-review.googlesource.com/9731
Reviewed-by: Russ Cox <rsc@golang.org>
Noopt builds get a larger stack guard. This test must take that into account.
Change-Id: I1b5cbafdbbfee8c369ae1bebd0b900524ebf0d7d
Reviewed-on: https://go-review.googlesource.com/9610
Reviewed-by: Russ Cox <rsc@golang.org>
This includes the following information in the per-function summary:
outK = paramJ encoded in outK bits for paramJ
outK = *paramJ encoded in outK bits for paramJ
heap = paramJ EscHeap
heap = *paramJ EscContentEscapes
Note that (currently) if the address of a parameter is taken and
returned, necessarily a heap allocation occurred to contain that
reference, and the heap can never refer to stack, therefore the
parameter and everything downstream from it escapes to the heap.
The per-function summary information now has a tuneable number of bits
(2 is probably noticeably better than 1, 3 is likely overkill, but it
is now easy to check and the -m debugging output includes information
that allows you to figure out if more would be better.)
A new test was added to check pointer flow through struct-typed and
*struct-typed parameters and returns; some of these are sensitive to
the number of summary bits, and ought to yield better results with a
more competent escape analysis algorithm. Another new test checks
(some) correctness with array parameters, results, and operations.
The old analysis inferred a piece of plan9 runtime was non-escaping by
counteracting overconservative analysis with buggy analysis; with the
bug fixed, the result was too conservative (and it's not easy to fix
in this framework) so the source code was tweaked to get the desired
result. A test was added against the discovered bug.
The escape analysis was further improved splitting the "level" into
3 parts, one tracking the conventional "level" and the other two
computing the highest-level-suffix-from-copy, which is used to
generally model the cancelling effect of indirection applied to
address-of.
With the improved escape analysis enabled, it was necessary to
modify one of the runtime tests because it now attempts to allocate
too much on the (small, fixed-size) G0 (system) stack and this
failed the test.
Compiling src/std after touching src/runtime/*.go with -m logging
turned on shows 420 fewer heap allocation sites (10538 vs 10968).
Profiling allocations in src/html/template with
for i in {1..5} ;
do go tool 6g -memprofile=mastx.${i}.prof -memprofilerate=1 *.go;
go tool pprof -alloc_objects -text mastx.${i}.prof ;
done
showed a 15% reduction in allocations performed by the compiler.
Update #3753
Update #4720Fixes#10466
Change-Id: I0fd97d5f5ac527b45f49e2218d158a6e89951432
Reviewed-on: https://go-review.googlesource.com/8202
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Clean up after CL 5310.
Change-Id: Ib870e7b9d26eb118eefdaa3e76dcec4a4d459584
Reviewed-on: https://go-review.googlesource.com/9398
Reviewed-by: Ian Lance Taylor <iant@golang.org>
With this fix,
GOMAXPROCS=8 ./all.bash
passes, at least on my machine.
Fixes#10216.
Change-Id: Ib5991950892a1399ec81aced0a52b435e6f83fdf
Reviewed-on: https://go-review.googlesource.com/9392
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
These were fixed a little while ago, but overlooked when reenabling
disabled tests.
Update #9968.
Change-Id: I301ef587e580c517a170ad08ff897118b58cedec
Reviewed-on: https://go-review.googlesource.com/9347
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
We can expand the test cases as we discover problems.
This is some basic tests plus all the things I got wrong
in some recent work.
Change-Id: Id875fcfaf74eb087ae42b441fe47a34c5b8ccb39
Reviewed-on: https://go-review.googlesource.com/9158
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
In https://golang.org/cl/7797 I attempted to use myimportpath to set the value
of the go.importpath.$foo. symbol for the module being compiled, but I messed
it up and only set the name (which the linker rewrites anyway). This lead to
the importpath for the module being compiled being "". This was hard to notice,
because all modules that import another define the importpath for their
imported modules correctly -- but main is not imported, and this meant that the
reflect module saw all fields of all types defined in the main module as
exported.
The fix is to do what I meant to do the first time, add a test and change the
go tool to compile main packages with -p main and not -p
command-line-arguments.
Fixes#10332
Change-Id: I5fc6e9b1dc2b26f058641e382f9a56a526eca291
Reviewed-on: https://go-review.googlesource.com/8481
Reviewed-by: Russ Cox <rsc@golang.org>
The flag updates error annotations in test files from actual compiler output.
This is useful when doing compiler changes that add/remove/change lots of errors,
or when adding lots of new tests.
Also I noticed at least 2 cases where annotation were sub-optimal:
1. The annotation was "leaking param p" when the actual error is
"leaking param p to result ~r1".
2. The annotation was "leaking param m" when the actual errors
are "leaking param m" and "leaking param mv1".
For now it works only for errorcheck mode.
Also, apply the update to escape and liveness tests.
Some files have gccgo-specific errors of the form "gc error|gccgo error",
so it is risky to run update on all files. Gccgo-specific error
does not necessary contain '|', it can be just truncated.
Change-Id: Iaaae767f859dcb8321a8cb4970b2b70969e8a345
Reviewed-on: https://go-review.googlesource.com/5310
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This CL extends cmd/yacc to expose a yyErrorVerbose variable that
changes the error messages from just "syntax error" to "syntax error:
unexpected ${tokname}".
It also moves the yyToknames table generation to after rules have been
processed so that entries can be generated for tokens that aren't
mentioned in the preamble (e.g., '.' in the case of go.y).
Lastly, it restores gc's old code for applying yytfix to yyToknames,
except that substituting "LLITERAL" with litbuf happens in Yyerror.
Fixes#9968.
Change-Id: Icec188d11fdabc1dae31b8a471c35b5c7f6deec7
Reviewed-on: https://go-review.googlesource.com/8432
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
These registers are not available for programs to use. Prior to this
change, the compiler would crash attempting to use ZR as a general
purpose register. Other programs would compile but on execution would
overwrite the G register and cause havoc.
Fixes linux/arm64 build.
Fixes#10304Fixes#10320
Change-Id: I5cf51d3b77cfe3db7dd6377324950cafb02f8d8b
Reviewed-on: https://go-review.googlesource.com/8456
Reviewed-by: Minux Ma <minux@golang.org>
The original implementation used 16 int "words" but only 29 bits per word
for a total of 16*29 = 464 bits, with a space consumption of 16*64 = 1024
bits on a 64 bit machine. Switching to 512 bits increases precision while
still using (in the worst case) half the amount of memory per mp value on
a 64 bit machine.
Also: Decreased permitted number of least-significant mantissa bits which
may be incorrect when considering if a precise floating-point constant is
an integer from 29 to 16 bits.
Change-Id: Iee9287056f0e9aa4f06ceac0724ff4674f710c53
Reviewed-on: https://go-review.googlesource.com/8429
Reviewed-by: Russ Cox <rsc@golang.org>
All multi-precision arithmetic is now based on math/big.
- passes all.bash
- added test cases for fixed bugs
Fixes#7740.
Fixes#6866.
Change-Id: I67268b91766970ced3b928260053ccdce8753d58
Reviewed-on: https://go-review.googlesource.com/7912
Reviewed-by: Russ Cox <rsc@golang.org>
This restores go.errors from before 3af0d79 along with a fixed up
version of the bisonerrors AWK script, translated to Go.
However, this means Yyerror needs access to the yacc parser's state,
which is currently private. To workaround that, add a "state"
accessor method like the Lookahead method added in c7fa3c6.
Update issue #9968.
Change-Id: Ib868789e92fdb7d135442120a392457923e50121
Reviewed-on: https://go-review.googlesource.com/7270
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
On arm64, CMP $foo, R is encoded as from=$foo, reg=R, not as from=$foo,
to=R. The progtable entry for ACMP incorrectly described the latter
form. Because of this, the registerizer was not accounting the registers
used in CMP instructions and was incorrectly re-assigning those registers.
This was an old problem, but it only became apparent after b115c35
(cmd/internal/gc: move cgen, regalloc, et al to portable code). Previous
to this commit, the compiler used a slightly larger register set for the
temps than it used for register variables. Since it had plenty registers
dedicated to temps, the registers used in CMP instruction never clashed
with registers assigned to register variables.
Fixes#10253
Change-Id: Iedf4bd882bd59440dff310ac0f81e0f53d80d7ed
Reviewed-on: https://go-review.googlesource.com/8387
Reviewed-by: Aram Hăvărneanu <aram@mgk.ro>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Extend escape analysis to convT2E and conT2I. If the interface value
does not escape supply runtime with a stack buffer for the object copy.
This is a straight port from .c to .go of Dmitry's patch
Change-Id: Ic315dd50d144d94dd3324227099c116be5ca70b6
Reviewed-on: https://go-review.googlesource.com/8201
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
The false positives (var incorrectly escapes) are marked with BAD.
Change-Id: If64fabb6ea96de44a1177d9ab12e2ccc579fe0c4
Reviewed-on: https://go-review.googlesource.com/5294
Reviewed-by: Keith Randall <khr@golang.org>
10 false positives (var incorrectly escapes to heap) are marked with BAD.
Change-Id: I773b13a18ff55aaa499a2a28a979118422cc5322
Reviewed-on: https://go-review.googlesource.com/5293
Reviewed-by: Keith Randall <khr@golang.org>
Fix build after http://golang.org/cl/5297
The compiler was changed to not print implicit map capacity in error messages.
Change-Id: I852f668680c3c69c5eecc7964e46202a97014d6a
Reviewed-on: https://go-review.googlesource.com/8212
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
The false positive (var incorrectly escapes to heap) is marked with BAD.
Change-Id: I11877fa8e976094b31a221abd88ae32d351c85ee
Reviewed-on: https://go-review.googlesource.com/5292
Reviewed-by: Keith Randall <khr@golang.org>
These can be implemented with just a compare and a move instruction.
Do so, avoiding the overhead of a call into the runtime.
These assertions are a significant cost in Go code that uses interface{}
as a safe alternative to C's void* (or unsafe.Pointer), such as the
current version of the Go compiler.
*T here includes pointer to T but also any Go type represented as
a single pointer (chan, func, map). It does not include [1]*T or struct{*int}.
That requires more work in other parts of the compiler; there is a TODO.
Change-Id: I7ff681c20d2c3eb6ad11dd7b3a37b1f3dda23965
Reviewed-on: https://go-review.googlesource.com/7862
Reviewed-by: Rob Pike <r@golang.org>
Fix recover4.go to work on 64kb systems.
Change-Id: I211cb048de1268a8bbac77c6f3a1e0b8c8277594
Reviewed-on: https://go-review.googlesource.com/7673
Reviewed-by: Minux Ma <minux@golang.org>
Updates #10180
Temporarily disable this test on ppc64 systems as all our builders use 64k page size.
We need a portable way to get the page size of the host so we can correctly size the mmap hole.
Change-Id: Ibd36ebe2f54cf75a44667e2070c385f0daaca481
Reviewed-on: https://go-review.googlesource.com/7652
Reviewed-by: Andrew Gerrand <adg@golang.org>
This came up in private mail.
It works today and I want to make sure it stays working.
Change-Id: I13ebdc2dfadb3c72d7f179be89883137320c05d0
Reviewed-on: https://go-review.googlesource.com/7390
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Rob Pike <r@golang.org>
Fixes#10135.
Change-Id: Ic4c5ab15bcb7b9c3fcc685a788d3b59c60c26e1e
Signed-off-by: Shenghou Ma <minux@golang.org>
Reviewed-on: https://go-review.googlesource.com/7400
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Change-Id: Ia5115b15a79e1b2b53036646f1ed4b08225b220f
Reviewed-on: https://go-review.googlesource.com/7051
Run-TryBot: Chris Manghane <cmang@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Inlining refuses to inline bodies containing an actual function call, so that
if that call or a child uses runtime.Caller it cannot observe
the inlining.
However, inlining was also refusing to inline bodies that contained
function calls that were themselves inlined away. For example:
func f() int {
return f1()
}
func f1() int {
return f2()
}
func f2() int {
return 2
}
The f2 call in f1 would be inlined, but the f1 call in f would not,
because f1's call to f2 blocked the inlining, despite itself eventually
being inlined away.
Account properly for this kind of transitive inlining and enable.
Also bump the inlining budget a bit, so that the runtime's
heapBits.next is inlined.
This reduces the time for '6g *.go' in html/template by around 12% (!).
(For what it's worth, closing Chrome reduces the time by about 17%.)
Change-Id: If1aa673bf3e583082dcfb5f223e67355c984bfc1
Reviewed-on: https://go-review.googlesource.com/5952
Reviewed-by: Austin Clements <austin@google.com>
These don't work with the new compiler, because the
new compiler doesn't have the custom syntax errors
that I built for the old compiler. It will, just not yet.
(Issue #9968.)
Change-Id: I658f7dab2c7f855340a501f9ae4479c097b28cd3
Reviewed-on: https://go-review.googlesource.com/5632
Reviewed-by: Rob Pike <r@golang.org>
They use too much memory in the current Go compiler draft.
This should fix some builders.
Reenabling is #9933.
Change-Id: Ib5ef348b2c55d2012ffed765f2a6df99dec171f4
Reviewed-on: https://go-review.googlesource.com/5302
Reviewed-by: Russ Cox <rsc@golang.org>
Per the comment at top, this test is about whether the GC runs during
init, but it was testing more than that, and testing how much the GC
collected in a certain amount of time.
Instead, loosen this test to just see whether it ran at all and not
how well it did.
Fixes#9848
Change-Id: I31da7dd769140d7b49aa6c149a543fae6076aa5e
Reviewed-on: https://go-review.googlesource.com/4820
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we always create context objects for closures that capture variables.
However, it is completely unnecessary for direct calls of closures
(whether it is func()(), defer func()() or go func()()).
This change transforms any OCALLFUNC(OCLOSURE) to normal function call.
Closed variables become function arguments.
This transformation is especially beneficial for go func(),
because we do not need to allocate context object on heap.
But it makes direct closure calls a bit faster as well (see BenchmarkClosureCall).
On implementation level it required to introduce yet another compiler pass.
However, the pass iterates only over xtop, so it should not be an issue.
Transformation consists of two parts: closure transformation and call site
transformation. We can't run these parts on different sides of escape analysis,
because tree state is inconsistent. We can do both parts during typecheck,
we don't know how to capture variables and don't have call site.
We can't do both parts during walk of OCALLFUNC, because we can walk
OCLOSURE body earlier.
So now capturevars pass only decides how to capture variables
(this info is required for escape analysis). New transformclosure
pass, that runs just before order/walk, does all transformations
of a closure. And later walk of OCALLFUNC(OCLOSURE) transforms call site.
benchmark old ns/op new ns/op delta
BenchmarkClosureCall 4.89 3.09 -36.81%
BenchmarkCreateGoroutinesCapture 1634 1294 -20.81%
benchmark old allocs new allocs delta
BenchmarkCreateGoroutinesCapture 6 2 -66.67%
benchmark old bytes new bytes delta
BenchmarkCreateGoroutinesCapture 176 48 -72.73%
Change-Id: Ic85e1706e18c3235cc45b3c0c031a9c1cdb7a40e
Reviewed-on: https://go-review.googlesource.com/4050
Reviewed-by: Russ Cox <rsc@golang.org>
Consider an interface value i of type I and concrete value c of type C.
Prior to this CL, i==c was evaluated as
I(c) == i
Evaluating I(c) can allocate.
This CL changes the evaluation of i==c to
x, ok := i.(C); ok && x == c
The new generated code is shorter and does not allocate directly.
If C is small, as it is in every instance in the stdlib,
the new code also uses less stack space
and makes one runtime call instead of two.
If C is very large, the original implementation is used.
The cutoff for "very large" is 1<<16,
following the stack vs heap cutoff used elsewhere.
This kind of comparison occurs in 38 places in the stdlib,
mostly in the net and os packages.
benchmark old ns/op new ns/op delta
BenchmarkEqEfaceConcrete 29.5 7.92 -73.15%
BenchmarkEqIfaceConcrete 32.1 7.90 -75.39%
BenchmarkNeEfaceConcrete 29.9 7.90 -73.58%
BenchmarkNeIfaceConcrete 35.9 7.90 -77.99%
Fixes#9370.
Change-Id: I7c4555950bcd6406ee5c613be1f2128da2c9a2b7
Reviewed-on: https://go-review.googlesource.com/2096
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Support the following conversions in escape analysis:
[]rune("foo")
[]byte("foo")
string([]rune{})
If the result does not escape, allocate temp buffer on stack
and pass it to runtime functions.
Change-Id: I1d075907eab8b0109ad7ad1878104b02b3d5c690
Reviewed-on: https://go-review.googlesource.com/3590
Reviewed-by: Russ Cox <rsc@golang.org>
Ordinary switch statements are rewritten
into a sequence of if statements.
Staticly dead cases were not being eliminated
because the rewrite introduced a temporary,
which hid the fact that the case was a constant.
Stop doing that.
This eliminates dead code in the standard library at:
runtime/cgocall.go:219
runtime/cgocall.go:269
debug/gosym/pclntab.go:175
debug/macho/file.go:208
math/big/nat.go:635
math/big/nat.go:850
math/big/nat.go:1058
cmd/pprof/internal/commands/commands.go:86
net/sock_bsd.go:19
cmd/go/build.go:2657
cmd/go/env.go:90
Fixes#9608.
Change-Id: Ic23a05dfbb1ad91d5f62a6506b35a13e51b33e38
Reviewed-on: https://go-review.googlesource.com/3980
Reviewed-by: Keith Randall <khr@golang.org>
Only documentation / comment changes. Update references to
point to golang.org permalinks or go.googlesource.com/go.
References in historical release notes under doc are left as is.
Change-Id: Icfc14e4998723e2c2d48f9877a91c5abef6794ea
Reviewed-on: https://go-review.googlesource.com/4060
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The main issue is that the misc/cgo/{stdio,life} tests are silently
getting skipped when invoked from run.bash.
run.go should ignore any build tags after the first blank line in
source file. It already checks for test actions only upto the first
blank line. Build tags must be specified in the same block.
See http://golang.org/cl/3675 for background.
Change-Id: Id8abf000119e3335f7250d8ef34aac7811fc9dff
Reviewed-on: https://go-review.googlesource.com/3812
Reviewed-by: Minux Ma <minux@golang.org>
issue9355 generated a file a.[568] in test/ directory and left it there.
For tests like these, it is best to chdir to a test specific directory
before generating any temporary files, since the tests are running
in parallel and might otherwise race with each other for the same files.
Change-Id: I58d96256d4d8ee3fda70d81077f19006064a7425
Reviewed-on: https://go-review.googlesource.com/3813
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Kindly detected by race builders by failing TestRaceRange.
ORANGE typecheck does not increment decldepth around body.
Change-Id: I0df5f310cb3370a904c94d9647a9cf0f15729075
Reviewed-on: https://go-review.googlesource.com/3507
Reviewed-by: Russ Cox <rsc@golang.org>
Type switch variables was not typechecked.
Previously it lead only to a minor consequence:
switch unsafe.Sizeof = x.(type) {
generated an inconsistent error message.
But capturing by value functionality now requries typechecking of all ONAMEs.
Fixes#9731
Change-Id: If037883cba53d85028fb97b1328696091b3b7ddd
Reviewed-on: https://go-review.googlesource.com/3600
Reviewed-by: Russ Cox <rsc@golang.org>
This adds a "framepointer" GOEXPERIMENT that that makes the amd64
toolchain maintain base pointer chains in the same way that gcc
-fno-omit-frame-pointer does. Go doesn't use these saved base
pointers, but this does enable external tools like Linux perf and
VTune to unwind Go stacks when collecting system-wide profiles.
This requires support in the compilers to not clobber BP, support in
liblink for generating the BP-saving function prologue and unwinding
epilogue, and support in the runtime to save BPs across preemption, to
skip saved BPs during stack unwinding and, and to adjust saved BPs
during stack moving.
As with other GOEXPERIMENTs, everything from the toolchain to the
runtime must be compiled with this experiment enabled. To do this,
run make.bash (or all.bash) with GOEXPERIMENT=framepointer.
Change-Id: I4024853beefb9539949e5ca381adfdd9cfada544
Reviewed-on: https://go-review.googlesource.com/2992
Reviewed-by: Russ Cox <rsc@golang.org>
cmd/go doesn't complain (this is an open issue), but go/types does
Change-Id: I2caec1f7aec991a9500d2c3504c29e4ab718c138
Reviewed-on: https://go-review.googlesource.com/3541
Reviewed-by: Alan Donovan <adonovan@google.com>
Language specification says that variables are captured by reference.
And that is what gc compiler does. However, in lots of cases it is
possible to capture variables by value under the hood without
affecting visible behavior of programs. For example, consider
the following typical pattern:
func (o *Obj) requestMany(urls []string) []Result {
wg := new(sync.WaitGroup)
wg.Add(len(urls))
res := make([]Result, len(urls))
for i := range urls {
i := i
go func() {
res[i] = o.requestOne(urls[i])
wg.Done()
}()
}
wg.Wait()
return res
}
Currently o, wg, res, and i are captured by reference causing 3+len(urls)
allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap).
But all of them can be captured by value without changing behavior.
This change implements simple strategy for capturing by value:
if a captured variable is not addrtaken and never assigned to,
then it is captured by value (it is effectively const).
This simple strategy turned out to be very effective:
~80% of all captures in std lib are turned into value captures.
The remaining 20% are mostly in defers and non-escaping closures,
that is, they do not cause allocations anyway.
benchmark old allocs new allocs delta
BenchmarkCompressedZipGarbage 153 126 -17.65%
BenchmarkEncodeDigitsSpeed1e4 91 69 -24.18%
BenchmarkEncodeDigitsSpeed1e5 178 129 -27.53%
BenchmarkEncodeDigitsSpeed1e6 1510 1051 -30.40%
BenchmarkEncodeDigitsDefault1e4 100 75 -25.00%
BenchmarkEncodeDigitsDefault1e5 193 139 -27.98%
BenchmarkEncodeDigitsDefault1e6 1420 985 -30.63%
BenchmarkEncodeDigitsCompress1e4 100 75 -25.00%
BenchmarkEncodeDigitsCompress1e5 193 139 -27.98%
BenchmarkEncodeDigitsCompress1e6 1420 985 -30.63%
BenchmarkEncodeTwainSpeed1e4 109 81 -25.69%
BenchmarkEncodeTwainSpeed1e5 211 151 -28.44%
BenchmarkEncodeTwainSpeed1e6 1588 1097 -30.92%
BenchmarkEncodeTwainDefault1e4 103 77 -25.24%
BenchmarkEncodeTwainDefault1e5 199 143 -28.14%
BenchmarkEncodeTwainDefault1e6 1324 917 -30.74%
BenchmarkEncodeTwainCompress1e4 103 77 -25.24%
BenchmarkEncodeTwainCompress1e5 190 137 -27.89%
BenchmarkEncodeTwainCompress1e6 1327 919 -30.75%
BenchmarkConcurrentDBExec 16223 16220 -0.02%
BenchmarkConcurrentStmtQuery 17687 16182 -8.51%
BenchmarkConcurrentStmtExec 5191 5186 -0.10%
BenchmarkConcurrentTxQuery 17665 17661 -0.02%
BenchmarkConcurrentTxExec 15154 15150 -0.03%
BenchmarkConcurrentTxStmtQuery 17661 16157 -8.52%
BenchmarkConcurrentTxStmtExec 3677 3673 -0.11%
BenchmarkConcurrentRandom 14000 13614 -2.76%
BenchmarkManyConcurrentQueries 25 22 -12.00%
BenchmarkDecodeComplex128Slice 318 252 -20.75%
BenchmarkDecodeFloat64Slice 318 252 -20.75%
BenchmarkDecodeInt32Slice 318 252 -20.75%
BenchmarkDecodeStringSlice 2318 2252 -2.85%
BenchmarkDecode 11 8 -27.27%
BenchmarkEncodeGray 64 56 -12.50%
BenchmarkEncodeNRGBOpaque 64 56 -12.50%
BenchmarkEncodeNRGBA 67 58 -13.43%
BenchmarkEncodePaletted 68 60 -11.76%
BenchmarkEncodeRGBOpaque 64 56 -12.50%
BenchmarkGoLookupIP 153 139 -9.15%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76%
BenchmarkClientServer 62 59 -4.84%
BenchmarkClientServerParallel4 62 59 -4.84%
BenchmarkClientServerParallel64 62 59 -4.84%
BenchmarkClientServerParallelTLS4 79 76 -3.80%
BenchmarkClientServerParallelTLS64 112 109 -2.68%
BenchmarkCreateGoroutinesCapture 10 6 -40.00%
BenchmarkAfterFunc 1006 1005 -0.10%
Fixes#6632.
Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca
Reviewed-on: https://go-review.googlesource.com/3166
Reviewed-by: Russ Cox <rsc@golang.org>
If result of string(i) does not escape,
allocate a [4]byte temp on stack for it.
Change-Id: If31ce9447982929d5b3b963fd0830efae4247c37
Reviewed-on: https://go-review.googlesource.com/3411
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we always allocate string buffers in heap.
For example, in the following code we allocate a temp string
just for comparison:
if string(byteSlice) == "abc" { ... }
This change extends escape analysis to cover []byte->string
conversions and string concatenation. If the result of operations
does not escape, compiler allocates a small buffer
on stack and passes it to slicebytetostring and concatstrings.
Then runtime uses the buffer if the result fits into it.
Size of the buffer is 32 bytes. There is no fundamental theory
behind this number. Just an observation that on std lib
tests/benchmarks frequency of string allocation is inversely
proportional to string length; and there is significant number
of allocations up to length 32.
benchmark old allocs new allocs delta
BenchmarkFprintfBytes 2 1 -50.00%
BenchmarkDecodeComplex128Slice 318 316 -0.63%
BenchmarkDecodeFloat64Slice 318 316 -0.63%
BenchmarkDecodeInt32Slice 318 316 -0.63%
BenchmarkDecodeStringSlice 2318 2316 -0.09%
BenchmarkStripTags 11 5 -54.55%
BenchmarkDecodeGray 111 102 -8.11%
BenchmarkDecodeNRGBAGradient 200 188 -6.00%
BenchmarkDecodeNRGBAOpaque 165 152 -7.88%
BenchmarkDecodePaletted 319 309 -3.13%
BenchmarkDecodeRGB 166 157 -5.42%
BenchmarkDecodeInterlacing 279 268 -3.94%
BenchmarkGoLookupIP 153 135 -11.76%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76%
BenchmarkClientServerParallel4 62 61 -1.61%
BenchmarkClientServerParallel64 62 61 -1.61%
BenchmarkClientServerParallelTLS4 79 78 -1.27%
BenchmarkClientServerParallelTLS64 112 111 -0.89%
benchmark old ns/op new ns/op delta
BenchmarkFprintfBytes 381 311 -18.37%
BenchmarkStripTags 2615 2351 -10.10%
BenchmarkDecodeNRGBAGradient 3715887 3635096 -2.17%
BenchmarkDecodeNRGBAOpaque 3047645 2928644 -3.90%
BenchmarkGoLookupIP 153 135 -11.76%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
Change-Id: I9ec01da816945c3329d7be3c7794b520418c3f99
Reviewed-on: https://go-review.googlesource.com/3120
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Escape analysis treats everything assigned to OIND/ODOTPTR as escaping.
As the result b escapes in the following code:
func (b *Buffer) Foo() {
n, m := ...
b.buf = b.buf[n:m]
}
This change recognizes such assignments and ignores them.
Update issue #9043.
Update issue #7921.
There are two similar cases in std lib that benefit from this optimization.
First is in archive/zip:
type readBuf []byte
func (b *readBuf) uint32() uint32 {
v := binary.LittleEndian.Uint32(*b)
*b = (*b)[4:]
return v
}
Second is in time:
type data struct {
p []byte
error bool
}
func (d *data) read(n int) []byte {
if len(d.p) < n {
d.p = nil
d.error = true
return nil
}
p := d.p[0:n]
d.p = d.p[n:]
return p
}
benchmark old ns/op new ns/op delta
BenchmarkCompressedZipGarbage 32431724 32217851 -0.66%
benchmark old allocs new allocs delta
BenchmarkCompressedZipGarbage 153 143 -6.54%
Change-Id: Ia6cd32744e02e36d6d8c19f402f8451101711626
Reviewed-on: https://go-review.googlesource.com/3162
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently all PTRLIT element initializers escape. There is no reason for that.
This change links STRUCTLIT to PTRLIT; STRUCTLIT element initializers are
already linked to the STRUCTLIT. As the result, PTRLIT element initializers
escape when PTRLIT itself escapes.
Change-Id: I89ecd8677cbf81addcfd469cd2fd461c0e9bf7dd
Reviewed-on: https://go-review.googlesource.com/3031
Reviewed-by: Russ Cox <rsc@golang.org>
The compiler has a phase ordering problem. Escape analysis runs
before wrapper generation. When a generated wrapper calls a method
defined in a different package, if that call is inlined, there will be
no escape information for the variables defined in the inlined call.
Those variables will be placed on the stack, which fails if they
actually do escape.
There are probably various complex ways to fix this. This is a simple
way to avoid it: when a generated wrapper calls a method defined in a
different package, treat all local variables as escaping.
Fixes#9537.
Change-Id: I530f39346de16ad173371c6c3f69cc189351a4e9
Reviewed-on: https://go-review.googlesource.com/3092
Reviewed-by: Russ Cox <rsc@golang.org>
We were failing ^uint16(0xffff) == 0, as we computed 0xffff0000 instead.
I could only trigger a failure for the above case, the other two tests
^uint16(0xfffe) == 1 and -uint16(0xffff) == 1 didn't seem to fail
previously. Somehow they get MOVHUs inserted for other reasons (used
by CMP instead of TST?). I fixed OMINUS anyway, better safe than
sorry.
Fixes#9604
Change-Id: I4c2d5bdc667742873ac029fdbe3db0cf12893c27
Reviewed-on: https://go-review.googlesource.com/2940
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
The various files are confusingly named and their operation
not easy to see. Add a comment to cmplxdivide.c, one of the few
C files that will endure in the repository, to explain how to build
and run the test.
Change-Id: I1fd5c564a14217e1b9815b09bc24cc43c54c096f
Reviewed-on: https://go-review.googlesource.com/2850
Reviewed-by: Russ Cox <rsc@golang.org>
Recognize loops of the form
for i := range a {
a[i] = zero
}
in which the evaluation of a is free from side effects.
Replace these loops with calls to memclr.
This occurs in the stdlib in 18 places.
The motivating example is clearing a byte slice:
benchmark old ns/op new ns/op delta
BenchmarkGoMemclr5 3.31 3.26 -1.51%
BenchmarkGoMemclr16 13.7 3.28 -76.06%
BenchmarkGoMemclr64 50.8 4.14 -91.85%
BenchmarkGoMemclr256 157 6.02 -96.17%
Update #5373.
Change-Id: I99d3e6f5f268e8c6499b7e661df46403e5eb83e4
Reviewed-on: https://go-review.googlesource.com/2520
Reviewed-by: Keith Randall <khr@golang.org>
run GC in its own background goroutine making the
caller runnable if resources are available. This is
critical in single goroutine applications.
Allow goroutines that allocate a lot to help out
the GC and in doing so throttle their own allocation.
Adjust test so that it only detects that a GC is run
during init calls and not whether the GC is memory
efficient. Memory efficiency work will happen later
in 1.5.
Change-Id: I4306f5e377bb47c69bda1aedba66164f12b20c2b
Reviewed-on: https://go-review.googlesource.com/2349
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This test was taking a long time, reduce its zealousness.
Change-Id: Ib824247b84b0039a9ec690f72336bef3738d4c44
Reviewed-on: https://go-review.googlesource.com/2502
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Minux Ma <minux@golang.org>
The compiler converts 'val, ok = m[key]' to
tmp, ok = <runtime call>
val = *tmp
For lookups of the form '_, ok = m[key]',
the second statement is unnecessary.
By not generating it we save a nil check.
Change-Id: I21346cc195cb3c62e041af8b18770c0940358695
Reviewed-on: https://go-review.googlesource.com/1975
Reviewed-by: Russ Cox <rsc@golang.org>
If the user provided a key but no value via -ldflag -X,
another linker flag was used as the value.
Placing the user's flags at the end avoids this problem.
It also provides the user the opportunity to
override existing linker flags.
Fixes#8810.
Change-Id: I96f4190713dc9a9c29142e56658446fba7fb6bc8
Reviewed-on: https://go-review.googlesource.com/2242
Reviewed-by: Minux Ma <minux@golang.org>
These tests were enabled as part of change 1774.
They depend on the errchk tool, which is a Perl
script. However, Perl is not available on Plan 9.
Change-Id: I82707aae16013acc9a3800d39b0084588b852b53
Reviewed-on: https://go-review.googlesource.com/2031
Reviewed-by: Minux Ma <minux@golang.org>
Broken by e7173dfdfd
Fix by simply disabling the relevant tests.
* bug248 and bug345 require errchk, but we can't
rely on perl being available.
* bug369 is disabled anyway.
Change-Id: Idf73ebccb066943e3fe17c2f662b37238ec74dfe
Reviewed-on: https://go-review.googlesource.com/2052
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Gccgo creates a struct to hold the arguments for the deferred
function. In this example the struct holds a type defined in a
different package. The bug was that gccgo tried to create an equality
function for this struct, and it implemented that function by calling
the equality function for the type defined in the other package.
Since that type is not exported, the reference to the equality
function failed at link time. Normally it is impossible for a struct
to directly contain a member that is an unexported type from another
package, but in this specific case it was possible. Fixed in gccgo
with https://codereview.appspot.com/183500043 .
Change-Id: I8ec3a33631225b9ac2a4ac060cb4d10b4635e60b
Reviewed-on: https://go-review.googlesource.com/1690
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
* bug248, bug345, bug369, and bug429 were ported from bash commands to run scripts. bug369 remains disabled.
* bug395 is a test for issue 1909, which is still open. It is marked as skip now and will be usable with compile with run.go when issue 1909 is fixed.
Fixes#4139
Updates #1909
Change-Id: Ibb5fbfb5cf72ddc285829245318eeacd3fb5a636
Reviewed-on: https://go-review.googlesource.com/1774
Reviewed-by: Russ Cox <rsc@golang.org>
When we do y = &x for global variables x and y, y gets initialized
at link time. Do the same for y = &x.f if x is a struct and y=&x[5]
if x is an array.
fixes#9217fixes#9355
Change-Id: Iea3c0ce2ce1b309e2b760e345608fd95460b5713
Reviewed-on: https://go-review.googlesource.com/1691
Reviewed-by: Minux Ma <minux@golang.org>
This test was added in CL 151000043.
It got lost in CL 144630044.
Change-Id: I318ab11be8e3e7489fc1395457c029c8bdb2aa41
Reviewed-on: https://go-review.googlesource.com/1773
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Gccgo incorrectly executed functions multiple times when they appeared
in a composite literal that required a conversion between different
interface types.
Change-Id: I7b40e76ed23fa8440ffa03b262041265c109adf7
Reviewed-on: https://go-review.googlesource.com/1710
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Gccgo failed to create the type descriptor for the type used to
allocate the nil value passed to append as the second argument when
append is called with only one argument. Calling append with only one
argument is unusual but obviously should not cause a compiler crash.
Change-Id: I530821847dfd68f0302de6ca6a84dfbc79653935
Reviewed-on: https://go-review.googlesource.com/1692
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
It shouldn't semacquire() inside an acquirem(), the runtime
thinks that means deadlock. It actually isn't a deadlock, but it
looks like it because acquirem() does m.locks++.
Candidate for inclusion in 1.4.1. runtime.Stack with all=true
is pretty unuseable in GOMAXPROCS>1 environment.
fixes#9321
Change-Id: Iac6b664217d24763b9878c20e49229a1ecffc805
Reviewed-on: https://go-review.googlesource.com/1600
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Make sure dequeueing from a channel queue does not exhibit quadratic time behavior.
Change-Id: Ifb7c709b026f74c7e783610d4914dd92909a441b
Reviewed-on: https://go-review.googlesource.com/1212
Reviewed-by: Russ Cox <rsc@golang.org>
Usage:
fibo <n> compute fibonacci(n), n must be >= 0
fibo -bench benchmark fibonacci computation (takes about 1 min)
Additional flags:
-half add values using two half-digit additions
-opt optimize memory allocation through reuse
-short only print the first 10 digits of very large fibonacci numbers
This change was reviewed in detail as https://codereview.appspot.com/168480043 .
Change-Id: I7c86d49c5508532ea6206d00f424cf2117d2fe41
Reviewed-on: https://go-review.googlesource.com/1211
Reviewed-by: Russ Cox <rsc@golang.org>
When we start work on Gerrit, ppc64 and garbage collection
work will continue in the master branch, not the dev branches.
(We may still use dev branches for other things later, but
these are ready to be merged, and doing it now, before moving
to Git means we don't have to have dev branches working
in the Gerrit workflow on day one.)
TBR=rlh
CC=golang-codereviews
https://golang.org/cl/183140043
640 bytes ought to be enough for anybody.
We'll bring this back down before Go 1.5. That's issue 9214.
TBR=rlh
CC=golang-codereviews
https://golang.org/cl/188730043
The SudoG used to sit on the stack, so it was cheap to allocated
and didn't need to be cleaned up when finished.
For the conversion to Go, we had to move sudog off the stack
for a few reasons, so we added a cache of recently used sudogs
to keep allocation cheap. But we didn't add any of the necessary
cleanup before adding a SudoG to the new cache, and so the cached
SudoGs had stale pointers inside them that have caused all sorts
of awful, hard to debug problems.
CL 155760043 made sure SudoG.elem is cleaned up.
CL 150520043 made sure SudoG.selectdone is cleaned up.
This CL makes sure SudoG.next, SudoG.prev, and SudoG.waitlink
are cleaned up. I should have done this when I did the other two
fields; instead I wasted a week tracking down a leak they caused.
A dangling SudoG.waitlink can point into a sudogcache list that
has been "forgotten" in order to let the GC collect it, but that
dangling .waitlink keeps the list from being collected.
And then the list holding the SudoG with the dangling waitlink
can find itself in the same situation, and so on. We end up
with lists of lists of unusable SudoGs that are still linked into
the object graph and never collected (given the right mix of
non-trivial selects and non-channel synchronization).
More details in golang.org/issue/9110.
Fixes#9110.
LGTM=r
R=r
CC=dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/177870043
This is to reduce the delta between dev.cc and dev.garbage to just garbage collector changes.
These are the files that had merge conflicts and have been edited by hand:
malloc.go
mem_linux.go
mgc.go
os1_linux.go
proc1.go
panic1.go
runtime1.go
LGTM=austin
R=austin
CC=golang-codereviews
https://golang.org/cl/174180043
Now the only difference between dev.cc and dev.garbage
is the runtime conversion on the one side and the
garbage collection on the other. They both have the
same set of changes from default and dev.power64.
LGTM=austin
R=austin
CC=golang-codereviews
https://golang.org/cl/172570043
The remaining run-only tests will be migrated to run.go in another CL.
This CL will break the build due to issues 8746 and 8806.
Update #4139
Update #8746
Update #8806
LGTM=rsc
R=rsc, bradfitz, iant
CC=golang-codereviews
https://golang.org/cl/144630044
Now each C printf, Go print, or Go println is guaranteed
not to be interleaved with other calls of those functions.
This should help when debugging concurrent failures.
LGTM=rlh
R=rlh
CC=golang-codereviews
https://golang.org/cl/169120043
One failing case this removes is:
var bytes = []byte("hello, world")
var copy_bytes = bytes
We could handle this in the compiler, but it requires special
case for a variable that is initialized to the value of a
variable that is initialized to a string literal converted to
[]byte. This seems an unlikely case--it never occurs in the
standrd library--and it seems unnecessary to write the code to
handle it.
If we do want to support this case, one approach is
https://golang.org/cl/171840043.
The other failing cases are of the form
var bx bool
var copy_bx = bx
The compiler used to initialize copy_bx to false. However,
that led to issue 7665, since bx may be initialized in non-Go
code. The compiler no longer assumes that bx must be false,
so copy_bx can not be statically initialized.
We can fix these with https://golang.org/cl/169040043
if we also pass -complete to the compiler as part of this
test. This is OK but it's too late in the release cycle.
Fixes#8746.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/165400043
On power64x, this one line in live.go reports that t is live
because of missing optimization passes. This isn't what this
test is trying to test, so shuffle bad40 so that it still
accomplishes the intent of the test without also depending on
optimization.
LGTM=rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/167110043
The remaining failures in this test are because of incomplete
optimization support on power64x. Tracked in issue 9058.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/168130043
All three cases of clearfat were wrong on power64x.
The cases that handle 1032 bytes and up and 32 bytes and up
both use MOVDU (one directly generated in a loop and the other
via duffzero), which leaves the pointer register pointing at
the *last written* address. The generated code was not
accounting for this, so the byte fill loop was re-zeroing the
last zeroed dword, rather than the bytes following the last
zeroed dword. Fix this by simply adding an additional 8 byte
offset to the byte zeroing loop.
The case that handled under 32 bytes was also wrong. It
didn't update the pointer register at all, so the byte zeroing
loop was simply re-zeroing the beginning of region. Again,
the fix is to add an offset to the byte zeroing loop to
account for this.
LGTM=dave, bradfitz
R=rsc, dave, bradfitz
CC=golang-codereviews
https://golang.org/cl/168870043
Originally traceback was only used for printing the stack
when an unexpected signal came in. In that case, the
initial PC is taken from the signal and should be used
unaltered. For the callers, the PC is the return address,
which might be on the line after the call; we subtract 1
to get to the CALL instruction.
Traceback is now used for a variety of things, and for
almost all of those the initial PC is a return address,
whether from getcallerpc, or gp->sched.pc, or gp->syscallpc.
In those cases, we need to subtract 1 from this initial PC,
but the traceback code had a hard rule "never subtract 1
from the initial PC", left over from the signal handling days.
Change gentraceback to take a flag that specifies whether
we are tracing a trap.
Change traceback to default to "starting with a return PC",
which is the overwhelmingly common case.
Add tracebacktrap, like traceback but starting with a trap PC.
Use tracebacktrap in signal handlers.
Fixes#7690.
LGTM=iant, r
R=r, iant
CC=golang-codereviews
https://golang.org/cl/167810044
The test just doubled a certain number of times
and then gave up. On a mostly fast but occasionally
slow machine this may never make the test run
long enough to see the linear growth.
Change test to keep doubling until the first round
takes at least a full second, to reduce the effect of
occasional scheduling or other jitter.
The failure we saw had a time for the first round
of around 100ms.
Note that this test still passes once it sees a linear
effect, even with a very small total time.
The timeout here only applies to how long the execution
must be to support a reported failure.
LGTM=khr
R=khr
CC=golang-codereviews, rlh
https://golang.org/cl/164070043
This brings dev.power64 up-to-date with the current tip of
default. go_bootstrap is still panicking with a bad defer
when initializing the runtime (even on amd64).
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/152570049
This also removes pkg/runtime/traceback_lr.c, which was ported
to Go in an earlier commit and then moved to
runtime/traceback.go.
Reviewer: rsc@golang.org
rsc: LGTM
test16 used to fail with gccgo. The withoutRecoverRecursive
test would have failed in some possible implementations.
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/151630043
This brings cmd/gc in line with the spec on this question.
It might break existing code, but that code was not conformant
with the spec.
Credit to Rémy for finding the broken code.
Fixes#6366.
LGTM=r
R=golang-codereviews, r
CC=adonovan, golang-codereviews, gri
https://golang.org/cl/129550043
https://golang.org/cl/152700045/ made it possible for struct literals assigned to globals to use <N> as the RHS. Normally, this is to zero out variables on first use. Because globals are already zero (or their linker initialized value), we just ignored this.
Now that <N> can occur from non-initialization code, we need to emit this code. We don't use <N> for initialization of globals any more, so this shouldn't cause any excessive zeroing.
Fixes#8961.
LGTM=rsc
R=golang-codereviews, rsc
CC=bradfitz, golang-codereviews
https://golang.org/cl/154540044
This fixes the bug in which the linker reports "missing Go
type information" when a -X option refers to a symbol that is
not used.
Fixes#8821.
LGTM=rsc
R=rsc, r
CC=golang-codereviews
https://golang.org/cl/151000043
If there is a leading ·, assume there is a Go prototype and
attach the Go prototype information to the function.
If the function is not called from Go and does not need a
Go prototype, it can be made file-local instead (using name<>(SB)).
This fixes the current BSD build failures, by giving functions like
sync/atomic.StoreUint32 argument stack map information.
Fixes#8753.
LGTM=khr, iant
R=golang-codereviews, iant, khr, bradfitz
CC=golang-codereviews, r, rlh
https://golang.org/cl/142150043
iterdelete's run time varies; occasionally we get unlucky. To reduce spurious failures, average away some of the variation.
On my machine, 8 of 5000 runs (0.15%) failed before this CL. After this CL, there were no failures after 35,000 runs.
I confirmed that this adjusted test still fails before CL 141270043.
LGTM=khr
R=khr
CC=bradfitz, golang-codereviews
https://golang.org/cl/140610043
During anylit run, nodes such as SLICEARR(statictmp, [:])
may be generated and are expected to be found unchanged by
gen_as_init.
In some walks (in particular walkselect), the statement
may be walked again and lowered to its usual form, leading to a
crash.
Fixes#8017.
Fixes#8024.
Fixes#8058.
LGTM=rsc
R=golang-codereviews, dvyukov, gobot, rsc
CC=golang-codereviews
https://golang.org/cl/112080043
Previously it might happen before calling dowidth and
result in a compiler crash.
Fixes#8060.
LGTM=dvyukov, rsc
R=golang-codereviews, dvyukov, gobot, rsc
CC=golang-codereviews
https://golang.org/cl/110980044
A write *p = x that needs a write barrier (not all do)
now turns into runtime.writebarrierptr(p, x)
or one of the other variants.
The write barrier implementations are trivial.
The goal here is to emit the calls in the correct places
and to incur the cost of those function calls in the Go 1.4 cycle.
Performance on the Go 1 benchmark suite below.
Remember, the goal is to slow things down (and be correct).
We will look into optimizations in separate CLs, as part of
the process of comparing Go 1.3 against tip in order to make
sure Go 1.4 runs at least as fast as Go 1.3.
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 3118336716 3452876110 +10.73%
BenchmarkFannkuch11 3184497677 3211552284 +0.85%
BenchmarkFmtFprintfEmpty 89.9 107 +19.02%
BenchmarkFmtFprintfString 236 287 +21.61%
BenchmarkFmtFprintfInt 246 278 +13.01%
BenchmarkFmtFprintfIntInt 395 458 +15.95%
BenchmarkFmtFprintfPrefixedInt 343 378 +10.20%
BenchmarkFmtFprintfFloat 477 525 +10.06%
BenchmarkFmtManyArgs 1446 1707 +18.05%
BenchmarkGobDecode 14398047 14685958 +2.00%
BenchmarkGobEncode 12557718 12947104 +3.10%
BenchmarkGzip 453462345 472413285 +4.18%
BenchmarkGunzip 114226016 115127398 +0.79%
BenchmarkHTTPClientServer 114689 112122 -2.24%
BenchmarkJSONEncode 24914536 26135942 +4.90%
BenchmarkJSONDecode 86832877 103620289 +19.33%
BenchmarkMandelbrot200 4833452 4898780 +1.35%
BenchmarkGoParse 4317976 4835474 +11.98%
BenchmarkRegexpMatchEasy0_32 150 166 +10.67%
BenchmarkRegexpMatchEasy0_1K 393 402 +2.29%
BenchmarkRegexpMatchEasy1_32 125 142 +13.60%
BenchmarkRegexpMatchEasy1_1K 1010 1236 +22.38%
BenchmarkRegexpMatchMedium_32 232 301 +29.74%
BenchmarkRegexpMatchMedium_1K 76963 102721 +33.47%
BenchmarkRegexpMatchHard_32 3833 5463 +42.53%
BenchmarkRegexpMatchHard_1K 119668 161614 +35.05%
BenchmarkRevcomp 763449047 706768534 -7.42%
BenchmarkTemplate 124954724 134834549 +7.91%
BenchmarkTimeParse 517 511 -1.16%
BenchmarkTimeFormat 501 514 +2.59%
benchmark old MB/s new MB/s speedup
BenchmarkGobDecode 53.31 52.26 0.98x
BenchmarkGobEncode 61.12 59.28 0.97x
BenchmarkGzip 42.79 41.08 0.96x
BenchmarkGunzip 169.88 168.55 0.99x
BenchmarkJSONEncode 77.89 74.25 0.95x
BenchmarkJSONDecode 22.35 18.73 0.84x
BenchmarkGoParse 13.41 11.98 0.89x
BenchmarkRegexpMatchEasy0_32 213.30 191.72 0.90x
BenchmarkRegexpMatchEasy0_1K 2603.92 2542.74 0.98x
BenchmarkRegexpMatchEasy1_32 254.00 224.93 0.89x
BenchmarkRegexpMatchEasy1_1K 1013.53 827.98 0.82x
BenchmarkRegexpMatchMedium_32 4.30 3.31 0.77x
BenchmarkRegexpMatchMedium_1K 13.30 9.97 0.75x
BenchmarkRegexpMatchHard_32 8.35 5.86 0.70x
BenchmarkRegexpMatchHard_1K 8.56 6.34 0.74x
BenchmarkRevcomp 332.92 359.62 1.08x
BenchmarkTemplate 15.53 14.39 0.93x
LGTM=rlh
R=rlh
CC=dvyukov, golang-codereviews, iant, khr, r
https://golang.org/cl/136380043
This CL adjusts code referring to src/pkg to refer to src.
Immediately after submitting this CL, I will submit
a change doing 'hg mv src/pkg/* src'.
That change will be too large to review with Rietveld
but will contain only the 'hg mv'.
This CL will break the build.
The followup 'hg mv' will fix it.
For more about the move, see golang.org/s/go14nopkg.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/134570043
Increase NOSPLIT reservation from 192 to 384 bytes.
The problem is that the non-Unix systems (Solaris and Windows)
just can't make system calls in a small amount of space,
and then worse they do things that are complex enough
to warrant calling runtime.throw on failure.
We don't have time to rewrite the code to use less stack.
I'm not happy about this, but it's still a small amount.
The good news is that we're doing this to get to only
using copying stacks for stack growth. Once that is true,
we can drop the default stack size from 8k to 4k, which
should more than make up for the bytes we're losing here.
LGTM=r
R=iant, r, bradfitz, aram.h
CC=golang-codereviews
https://golang.org/cl/140350043
The gp->panicwrap adjustment is just fatally flawed.
Now that there is a Panic.argp field, update that instead.
That can be done on entry only, so that unwinding doesn't
need to worry about undoing anything. The wrappers
emit a few more instructions in the prologue but everything
else in the system gets much simpler.
It also fixes (without trying) a broken test I never checked in.
Fixes#7491.
LGTM=khr
R=khr
CC=dvyukov, golang-codereviews, iant, r
https://golang.org/cl/135490044
newstackcall creates a new stack segment, and we want to
be able to throw away all that code.
LGTM=khr
R=khr, iant
CC=dvyukov, golang-codereviews, r
https://golang.org/cl/139270043
created panic1.go just so diffs were available.
After this CL is in, I'd like to move panic.go -> defer.go
and panic1.go -> panic.go.
LGTM=rsc
R=rsc, khr
CC=golang-codereviews
https://golang.org/cl/133530045
In CL 131450043, which raised it to 160,
I'd raise it to 192 if necessary.
Apparently it is necessary on windows/amd64.
One note for those concerned about the growth:
in the old segmented stack world, we wasted this much
space at the bottom of every stack segment.
In the new contiguous stack world, each goroutine has
only one stack segment, so we only waste this much space
once per goroutine. So even raising the limit further might
still be a net savings.
Fixes windows/amd64 build.
TBR=r
CC=golang-codereviews
https://golang.org/cl/132480043
The Go calling convention uses more stack space than C.
On 64-bit systems we've been right up against the limit
(128 bytes, so only 16 words) and doing awful things to
our source code to work around it. Instead of continuing
to do awful things, raise the limit to 160 bytes.
I am prepared to raise the limit to 192 bytes if necessary,
but I think this will be enough.
Should fix current link-time stack overflow errors on
- nacl/arm
- netbsd/amd64
- openbsd/amd64
- solaris/amd64
- windows/amd64
TBR=r
CC=golang-codereviews, iant
https://golang.org/cl/131450043
Before, a slice with cap=0 or a string with len=0 might have its
base pointer pointing beyond the actual slice/string data into
the next block. The collector had to ignore slices and strings with
cap=0 in order to avoid misinterpreting the base pointer.
Now, a slice with cap=0 or a string with len=0 still has a base
pointer pointing into the actual slice/string data, no matter what.
The collector can now always scan the pointer, which means
strings and slices are no longer special.
Fixes#8404.
LGTM=khr, josharian
R=josharian, khr, dvyukov
CC=golang-codereviews
https://golang.org/cl/112570044
Normally, an expression of the form x.f or *y can be reordered
with function calls and communications.
Select is stricter than normal: each channel expression is evaluated
in source order. If you have case <-x.f and case <-foo(), then if the
evaluation of x.f causes a panic, foo must not have been called.
(This is in contrast to an expression like x.f + foo().)
Enforce this stricter ordering.
Fixes#8336.
LGTM=dvyukov
R=golang-codereviews, dvyukov
CC=golang-codereviews, r
https://golang.org/cl/126570043
We need to change the interface value representation for
concurrent garbage collection, so that there is no ambiguity
about whether the data word holds a pointer or scalar.
This CL does NOT make any representation changes.
Instead, it removes representation assumptions from
various pieces of code throughout the tree.
The isdirectiface function in cmd/gc/subr.c is now
the only place that decides that policy.
The policy propagates out from there in the reflect
metadata, as a new flag in the internal kind value.
A follow-up CL will change the representation by
changing the isdirectiface function. If that CL causes
problems, it will be easy to roll back.
Update #8405.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, r
https://golang.org/cl/129090043
Credit to Rémy for finding and writing test case.
Fixes#8325.
LGTM=r
R=golang-codereviews, r
CC=dave, golang-codereviews, iant, remyoudompheng
https://golang.org/cl/124950043
This change introduces gomallocgc, a Go clone of mallocgc.
Only a few uses have been moved over, so there are still
lots of uses from C. Many of these C uses will be moved
over to Go (e.g. in slice.goc), but probably not all.
What should remain of C's mallocgc is an open question.
LGTM=rsc, dvyukov
R=rsc, khr, dave, bradfitz, dvyukov
CC=golang-codereviews
https://golang.org/cl/108840046
Following CL 68150047, the goos and goarch
variables are not currently set when the GOOS
and GOARCH environment variables are not set.
This made the content of the build tag to be
ignored in this case.
This CL sets goos and goarch to runtime.GOOS
and runtime.GOARCH when the GOOS and GOARCH
environments variables are not set.
LGTM=aram, bradfitz
R=golang-codereviews, aram, gobot, rsc, dave, bradfitz
CC=golang-codereviews, rsc
https://golang.org/cl/112490043
I'm improving gccgo's detection of variables that are only set
but not used, and it triggers additional errors on this code.
The new gccgo errors are correct; gc seems to suppress them
due to the other, expected, errors. This change uses the
variables so that no compiler will complain.
gccgo change is https://golang.org/cl/119920043 .
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/116050043
benchmark old ns/op new ns/op delta
BenchmarkSelectUncontended 220 165 -25.00%
BenchmarkSelectContended 209 161 -22.97%
BenchmarkSelectProdCons 1042 904 -13.24%
But more importantly this change will allow
to get rid of free function in runtime.
Fixes#6494.
LGTM=rsc, khr
R=golang-codereviews, rsc, dominik.honnef, khr
CC=golang-codereviews, remyoudompheng
https://golang.org/cl/107670043
There is no reason to generate different code for cap and len.
Fixes#8025.
Fixes#8026.
LGTM=rsc
R=rsc, iant, khr
CC=golang-codereviews
https://golang.org/cl/93570044
Fixes#8074.
The issue was not reproduceable by revision
go version devel +e0ad7e329637 Thu Jun 19 22:19:56 2014 -0700 linux/arm
But include the original test case in case the issue reopens itself.
LGTM=dvyukov
R=golang-codereviews, dvyukov
CC=golang-codereviews
https://golang.org/cl/107290043
No functional changes.
Generating shorter functions improves compilation time. On my laptop, this test's running time goes from 5.5s to 1.5s; the wall clock time to run all tests goes down 1s. On Raspberry Pi, this CL cuts 50s off the wall clock time to run all tests.
Fixes#7503.
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/72590045
There is a hierarchy of location defined by loop depth:
-1 = the heap
0 = function results
1 = local variables (and parameters)
2 = local variable declared inside a loop
3 = local variable declared inside a loop inside a loop
etc
In general if an address from loopdepth n is assigned to
something in loop depth m < n, that indicates an extended
lifetime of some form that requires a heap allocation.
Function results can be local variables too, though, and so
they don't actually fit into the hierarchy very well.
Treat the address of a function result as level 1 so that
if it is written back into a result, the address is treated
as escaping.
Fixes#8185.
LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/108870044
The analysis for &x was using the loop depth on x set
during x's declaration. A type switch creates a list of
implicit declarations that were not getting initialized
with loop depths.
Fixes#8176.
LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/108860043
A runtime.Goexit during a panic-invoked deferred call
left the panic stack intact even though all the stack frames
are gone when the goroutine is torn down.
The next goroutine to reuse that struct will have a
bogus panic stack and can cause the traceback routines
to walk into garbage.
Most likely to happen during tests, because t.Fatal might
be called during a deferred func and uses runtime.Goexit.
This "not enough cleared in Goexit" failure mode has
happened to us multiple times now. Clear all the pointers
that don't make sense to keep, not just gp->panic.
Fixes#8158.
LGTM=iant, dvyukov
R=iant, dvyukov
CC=golang-codereviews
https://golang.org/cl/102220043
I am not sure what the rounding here was
trying to do, but it was skipping the first
pointer on native client.
The code above the rounding already checks
that xoffset is widthptr-aligned, so the rnd
was a no-op everywhere but on Native Client.
And on Native Client it was wrong.
Perhaps it was supposed to be rounding down,
not up, but zerorange handles the extra 32 bits
correctly, so the rnd does not seem to be necessary
at all.
This wouldn't be worth doing for Go 1.3 except
that it can affect code on the playground.
Fixes#8155.
LGTM=r, iant
R=golang-codereviews, r, iant
CC=dvyukov, golang-codereviews, khr
https://golang.org/cl/108740047
I introduced this bug when I changed the escape
analysis to run in phases based on call graph
dependency order, in order to be more precise about
inputs escaping back to outputs (functions returning
their arguments).
Given
func f(z **int) *int { return *z }
we were tagging the function as 'z does not escape
and is not returned', which is all true, but not
enough information.
If used as:
var x int
p := &x
q := &p
leak(f(q))
then the compiler might try to keep x, p, and q all
on the stack, since (according to the recorded
information) nothing interesting ends up being
passed to leak.
In fact since f returns *q = p, &x is passed to leak
and x needs to be heap allocated.
To trigger the bug, you need a chain that the
compiler wants to keep on the stack (like x, p, q
above), and you need a function that returns an
indirect of its argument, and you need to pass the
head of the chain to that function. This doesn't
come up very often: this bug has been present since
June 2012 (between Go 1 and Go 1.1) and we haven't
seen it until now. It helps that most functions that
return indirects are getters that are simple enough
to be inlined, avoiding the bug.
Earlier versions of Go also had the benefit that if
&x really wasn't used beyond x's lifetime, nothing
broke if you put &x in a heap-allocated structure
accidentally. With the new stack copying, though,
heap-allocated structures containing &x are not
updated when the stack is copied and x moves,
leading to crashes in Go 1.3 that were not crashes
in Go 1.2 or Go 1.1.
The fix is in two parts.
First, in the analysis of a function, recognize when
a value obtained via indirect of a parameter ends up
being returned. Mark those parameters as having
content escape back to the return results (but we
don't bother to write down which result).
Second, when using the analysis to analyze, say,
f(q), mark parameters with content escaping as
having any indirections escape to the heap. (We
don't bother trying to match the content to the
return value.)
The fix could be less precise (simpler).
In the first part we might mark all content-escaping
parameters as plain escaping, and then the second
part could be dropped. Or we might assume that when
calling f(q) all the things pointed at by q escape
always (for any f and q).
The fix could also be more precise (more complex).
We might record the specific mapping from parameter
to result along with the number of indirects from the
parameter to the thing being returned as the result,
and then at the call sites we could set up exactly the
right graph for the called function. That would make
notleaks(f(q)) be able to keep x on the stack, because
the reuslt of f(q) isn't passed to anything that leaks it.
The less precise the fix, the more stack allocations
become heap allocations.
This fix is exactly as precise as it needs to be so that
none of the current stack allocations in the standard
library turn into heap allocations.
Fixes#8120.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, khr, r
https://golang.org/cl/102040046
The 'address taken' bit in a function variable was not
propagating into the inlined copies, causing incorrect
liveness information.
LGTM=dsymonds, bradfitz
R=golang-codereviews, bradfitz
CC=dsymonds, golang-codereviews, iant, khr, r
https://golang.org/cl/96670046
The 1-byte write was silently clearing a byte on the stack.
If there was another function call with more arguments
in the same stack frame, no harm done.
Otherwise, if the variable at that location was already zero,
no harm done.
Otherwise, problems.
Fixes#8139.
LGTM=dsymonds
R=golang-codereviews, dsymonds
CC=golang-codereviews, iant, r
https://golang.org/cl/100940043
We were requiring that the defer stack and the panic stack
be completely processed, thinking that if any were left over
the stack scan and the defer stack/panic stack must be out
of sync. It turns out that the panic stack may well have
leftover entries in some situations, and that's okay.
Fixes#8132.
LGTM=minux, r
R=golang-codereviews, minux, r
CC=golang-codereviews, iant, khr
https://golang.org/cl/100900044
The 'continuation pc' is where the frame will continue
execution, if anywhere. For a frame that stopped execution
due to a CALL instruction, the continuation pc is immediately
after the CALL. But for a frame that stopped execution due to
a fault, the continuation pc is the pc after the most recent CALL
to deferproc in that frame, or else 0. That is where execution
will continue, if anywhere.
The liveness information is only recorded for CALL instructions.
This change makes sure that we never look for liveness information
except for CALL instructions.
Using a valid PC fixes crashes when a garbage collection or
stack copying tries to process a stack frame that has faulted.
Record continuation pc in heapdump (format change).
Fixes#8048.
LGTM=iant, khr
R=khr, iant, dvyukov
CC=golang-codereviews, r
https://golang.org/cl/100870044
This CL forces the optimizer to preserve some memory stores
that would be redundant except that a stack scan due to garbage
collection or stack copying might look at them during a function call.
As such, it forces additional memory writes and therefore slows
down the execution of some programs, especially garbage-heavy
programs that are already limited by memory bandwidth.
The slowdown can be as much as 7% for end-to-end benchmarks.
These numbers are from running go1.test -test.benchtime=5s three times,
taking the best (lowest) ns/op for each benchmark. I am excluding
benchmarks with time/op < 10us to focus on macro effects.
All benchmarks are on amd64.
Comparing tip (a27f34c771cb) against this CL on an Intel Core i5 MacBook Pro:
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 3876500413 3856337341 -0.52%
BenchmarkFannkuch11 2965104777 2991182127 +0.88%
BenchmarkGobDecode 8563026 8788340 +2.63%
BenchmarkGobEncode 5050608 5267394 +4.29%
BenchmarkGzip 431191816 434168065 +0.69%
BenchmarkGunzip 107873523 110563792 +2.49%
BenchmarkHTTPClientServer 85036 86131 +1.29%
BenchmarkJSONEncode 22143764 22501647 +1.62%
BenchmarkJSONDecode 79646916 85658808 +7.55%
BenchmarkMandelbrot200 4720421 4700108 -0.43%
BenchmarkGoParse 4651575 4712247 +1.30%
BenchmarkRegexpMatchMedium_1K 71986 73490 +2.09%
BenchmarkRegexpMatchHard_1K 111018 117495 +5.83%
BenchmarkRevcomp 648798723 659352759 +1.63%
BenchmarkTemplate 112673009 112819078 +0.13%
Comparing tip (a27f34c771cb) against this CL on an Intel Xeon E5520:
BenchmarkBinaryTree17 5461110720 5393104469 -1.25%
BenchmarkFannkuch11 4314677151 4327177615 +0.29%
BenchmarkGobDecode 11065853 11235272 +1.53%
BenchmarkGobEncode 6500065 6959837 +7.07%
BenchmarkGzip 647478596 671769097 +3.75%
BenchmarkGunzip 139348579 141096376 +1.25%
BenchmarkHTTPClientServer 69376 73610 +6.10%
BenchmarkJSONEncode 30172320 31796106 +5.38%
BenchmarkJSONDecode 113704905 114239137 +0.47%
BenchmarkMandelbrot200 6032730 6003077 -0.49%
BenchmarkGoParse 6775251 6405995 -5.45%
BenchmarkRegexpMatchMedium_1K 111832 113895 +1.84%
BenchmarkRegexpMatchHard_1K 161112 168420 +4.54%
BenchmarkRevcomp 876363406 892319935 +1.82%
BenchmarkTemplate 146273096 148998339 +1.86%
Just to get a sense of where we are compared to the previous release,
here are the same benchmarks comparing Go 1.2 to this CL.
Comparing Go 1.2 against this CL on an Intel Core i5 MacBook Pro:
BenchmarkBinaryTree17 4370077662 3856337341 -11.76%
BenchmarkFannkuch11 3347052657 2991182127 -10.63%
BenchmarkGobDecode 8791384 8788340 -0.03%
BenchmarkGobEncode 4968759 5267394 +6.01%
BenchmarkGzip 437815669 434168065 -0.83%
BenchmarkGunzip 94604099 110563792 +16.87%
BenchmarkHTTPClientServer 87798 86131 -1.90%
BenchmarkJSONEncode 22818243 22501647 -1.39%
BenchmarkJSONDecode 97182444 85658808 -11.86%
BenchmarkMandelbrot200 4733516 4700108 -0.71%
BenchmarkGoParse 5054384 4712247 -6.77%
BenchmarkRegexpMatchMedium_1K 67612 73490 +8.69%
BenchmarkRegexpMatchHard_1K 107321 117495 +9.48%
BenchmarkRevcomp 733270055 659352759 -10.08%
BenchmarkTemplate 109304977 112819078 +3.21%
Comparing Go 1.2 against this CL on an Intel Xeon E5520:
BenchmarkBinaryTree17 5986953594 5393104469 -9.92%
BenchmarkFannkuch11 4861139174 4327177615 -10.98%
BenchmarkGobDecode 11830997 11235272 -5.04%
BenchmarkGobEncode 6608722 6959837 +5.31%
BenchmarkGzip 661875826 671769097 +1.49%
BenchmarkGunzip 138630019 141096376 +1.78%
BenchmarkHTTPClientServer 71534 73610 +2.90%
BenchmarkJSONEncode 30393609 31796106 +4.61%
BenchmarkJSONDecode 139645860 114239137 -18.19%
BenchmarkMandelbrot200 5988660 6003077 +0.24%
BenchmarkGoParse 6974092 6405995 -8.15%
BenchmarkRegexpMatchMedium_1K 111331 113895 +2.30%
BenchmarkRegexpMatchHard_1K 165961 168420 +1.48%
BenchmarkRevcomp 995049292 892319935 -10.32%
BenchmarkTemplate 145623363 148998339 +2.32%
Fixes#8036.
LGTM=khr
R=golang-codereviews, josharian, khr
CC=golang-codereviews, iant, r
https://golang.org/cl/99660044
[Same as CL 102820043 except applied changes to 6g/gsubr.c
also to 5g/gsubr.c and 8g/gsubr.c. The problem I had last night
trying to do that was that 8g's copy of nodarg has different
(but equivalent) control flow and I was pasting the new code
into the wrong place.]
Description from CL 102820043:
The 'nodarg' function is used to obtain a Node*
representing a function argument or result.
It returned a brand new Node*, but that violates
the guarantee in most places in the compiler that
two Node*s refer to the same variable if and only if
they are the same Node* pointer. Reestablish that
invariant by making nodarg return a preexisting
named variable if present.
Having fixed that, avoid any copy during x=x in
componentgen, because the VARDEF we emit
before the copy marks the lhs x as dead incorrectly.
The change in walk.c avoids modifying the result
of nodarg. This was the only place in the compiler
that did so.
Fixes#8097.
LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, iant, khr, r
https://golang.org/cl/103750043
Breaks 386 and arm builds.
The obvious reason is that this CL only edited 6g/gsubr.c
and failed to edit 5g/gsubr.c and 8g/gsubr.c.
However, the obvious CL applying the same edit to those
files (CL 101900043) causes mysterious build failures
in various of the standard package tests, usually involving
reflect. Something deep and subtle is broken but only on
the 32-bit systems.
Undo this CL for now.
««« original CL description
cmd/gc: fix x=x crash
The 'nodarg' function is used to obtain a Node*
representing a function argument or result.
It returned a brand new Node*, but that violates
the guarantee in most places in the compiler that
two Node*s refer to the same variable if and only if
they are the same Node* pointer. Reestablish that
invariant by making nodarg return a preexisting
named variable if present.
Having fixed that, avoid any copy during x=x in
componentgen, because the VARDEF we emit
before the copy marks the lhs x as dead incorrectly.
The change in walk.c avoids modifying the result
of nodarg. This was the only place in the compiler
that did so.
Fixes#8097.
LGTM=r, khr
R=golang-codereviews, r, khr
CC=golang-codereviews, iant
https://golang.org/cl/102820043
»»»
TBR=r
CC=golang-codereviews, khr
https://golang.org/cl/95660043
The 'nodarg' function is used to obtain a Node*
representing a function argument or result.
It returned a brand new Node*, but that violates
the guarantee in most places in the compiler that
two Node*s refer to the same variable if and only if
they are the same Node* pointer. Reestablish that
invariant by making nodarg return a preexisting
named variable if present.
Having fixed that, avoid any copy during x=x in
componentgen, because the VARDEF we emit
before the copy marks the lhs x as dead incorrectly.
The change in walk.c avoids modifying the result
of nodarg. This was the only place in the compiler
that did so.
Fixes#8097.
LGTM=r, khr
R=golang-codereviews, r, khr
CC=golang-codereviews, iant
https://golang.org/cl/102820043
This matters for NaCl, which seems to swamp my 4-core MacBook Pro otherwise.
It's not a correctness problem, just a usability problem.
LGTM=bradfitz
R=bradfitz
CC=golang-codereviews
https://golang.org/cl/98600046
CL 51010045 fixed the first one of these:
cmd/gc: return canonical Node* from temp
For historical reasons, temp was returning a copy
of the created Node*, not the original Node*.
This meant that if analysis recorded information in the
returned node (for example, n->addrtaken = 1), the
analysis would not show up on the original Node*, the
one kept in fn->dcl and consulted during liveness
bitmap creation.
Correct this, and watch for it when setting addrtaken.
Fixes#7083.
R=khr, dave, minux.ma
CC=golang-codereviews
https://golang.org/cl/51010045
CL 53200043 fixed the second:
cmd/gc: fix race build
Missed this case in CL 51010045.
TBR=khr
CC=golang-codereviews
https://golang.org/cl/53200043
This CL fixes the third. There are only three nod(OXXX, ...)
calls in sinit.c, so maybe we're done. Embarassing that it
took three CLs to find all three.
Fixes#8028.
LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, iant
https://golang.org/cl/100800046
In the first very rough draft of the reordering code
that was introduced in the Go 1.3 cycle, the pre-allocated
temporary for a ... argument was held in n->right.
It moved to n->alloc but the code avoiding n->right
was left behind in order.c. In copy(x, <-c), the receive
is in n->right and must be processed. Delete the special
case code, removing the bug.
Fixes#8039.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/100820044
The code cannot have worked before, because it was
trying to use the old value in a range check for the new
type, which might have a different representation
(hence the 'internal compiler error').
Fixes#8073.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/98630045
Add nacl.bash, the NaCl version of all.bash.
It's a separate script because it builds a variant of package syscall
with a large zip file embedded in it, containing all the input files
needed for tests.
Disable various tests new since the last round, mostly the ones using os/exec.
Fixes#7945.
LGTM=dave
R=golang-codereviews, remyoudompheng, dave, bradfitz
CC=golang-codereviews
https://golang.org/cl/100590044
I don't know when the bug was fixed, but empirically it was.
Make sure it stays fixed by adding a test.
Fixes#7884.
LGTM=adg
R=golang-codereviews, adg
CC=golang-codereviews
https://golang.org/cl/93500043
The float32 const conversion used to round to float64
and then use the hardware to round to float32.
Even though there was a range check before this
conversion, the double rounding introduced inaccuracy:
the round to float64 might round the value further away
from the float32 range, reaching a float64 value that
could not actually be rounded to float32. The hardware
appears to give us 0 in that case, but it is probably undefined.
Double rounding also meant that the wrong value might
be used for certain border cases.
Do the rounding the float32 ourselves, just as we already
did the rounding to float64. This makes the conversion
precise and also makes the conversion match the range check.
Finally, add some code to print very large (bigger than float64)
floating point constants in decimal floating point notation instead
of falling back to the precise but human-unreadable binary floating
point notation.
Fixes#8015.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, r
https://golang.org/cl/100580044
The temporary-introducing pass was not recursing
into the argumnt of a receive operation.
Fixes#8011.
LGTM=r
R=golang-codereviews, r
CC=golang-codereviews, iant, khr
https://golang.org/cl/91540043
The introduction of temporaries in order.c was not
quite right for two corner cases:
1) The rewrite that pushed new variables on the lhs of
a receive into the body of the case was dropping the
declaration of the variables. If the variables escape,
the declaration is what allocates them.
Caught by escape analysis sanity check.
In fact the declarations should move into the body
always, so that we only allocate if the corresponding
case is selected. Do that. (This is an optimization that
was already present in Go 1.2. The new order code just
made it stop working.)
Fixes#7997.
2) The optimization to turn a single-recv select into
an ordinary receive assumed it could take the address
of the destination; not so if the destination is _.
Fixes#7998.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/100480043
The GC program describing a data structure sometimes trusts the
pointer base type and other times does not (if not, the garbage collector
must fall back on per-allocation type information stored in the heap).
Make the scanning of a pointer in an interface do the same.
This fixes a crash in a particular use of reflect.SliceHeader.
Fixes#8004.
LGTM=khr
R=golang-codereviews, khr
CC=0xe2.0x9a.0x9b, golang-codereviews, iant, r
https://golang.org/cl/100470045
Globals, function arguments, and results are special cases in
registerization.
Globals must be flushed aggressively, because nearly any
operation can cause a panic, and the recovery code must see
the latest values. Globals also must be loaded aggressively,
because nearly any store through a pointer might be updating a
global: the compiler cannot see all the "address of"
operations on globals, especially exported globals. To
accomplish this, mark all globals as having their address
taken, which effectively disables registerization.
If a function contains a defer statement, the function results
must be flushed aggressively, because nearly any operation can
cause a panic, and the deferred code may call recover, causing
the original function to return the current values of its
function results. To accomplish this, mark all function
results as having their address taken if the function contains
any defer statements. This causes not just aggressive flushing
but also aggressive loading. The aggressive loading is
overkill but the best we can do in the current code.
Function arguments must be considered live at all safe points
in a function, because garbage collection always preserves
them: they must be up-to-date in order to be preserved
correctly. Accomplish this by marking them live at all call
sites. An earlier attempt at this marked function arguments as
having their address taken, which disabled registerization
completely, making programs slower. This CL's solution allows
registerization while preserving safety. The benchmark speedup
is caused by being able to registerize again (the earlier CL
lost the same amount).
benchmark old ns/op new ns/op delta
BenchmarkEqualPort32 61.4 56.0 -8.79%
benchmark old MB/s new MB/s speedup
BenchmarkEqualPort32 521.56 570.97 1.09x
Fixes#1304. (again)
Fixes#7944. (again)
Fixes#7984.
Fixes#7995.
LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, iant, r
https://golang.org/cl/97500044
The inputs to a function are marked live at all times in the
liveness bitmaps, so that the garbage collector will not free
the things they point at and reuse the pointers, so that the
pointers shown in stack traces are guaranteed not to have
been recycled.
Unfortunately, no one told the register optimizer that the
inputs need to be preserved at all call sites. If a function
is done with a particular input value, the optimizer will stop
preserving it across calls. For single-word values this just
means that the value recorded might be stale. For multi-word
values like slices, the value recorded could be only partially stale:
it can happen that, say, the cap was updated but not the len,
or that the len was updated but not the base pointer.
Either of these possibilities (and others) would make the
garbage collector misinterpret memory, leading to memory
corruption.
This came up in a real program, in which the garbage collector's
'slice len ≤ slice cap' check caught the inconsistency.
Fixes#7944.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, khr
https://golang.org/cl/100370045
This is joint work with Daniel Morsing.
In order for the register allocator to alias two variables, they must have the same width, stack offset, and etype. Code generation was altering a variable's etype in a few places. This prevented the variable from being moved to a register, which in turn prevented peephole optimization. This failure to alias was very common, with almost 23,000 instances just running make.bash.
This phenomenon was not visible in the register allocation debug output because the variables that failed to alias had the same name. The debugging-only change to bits.c fixes this by printing the variable number with its name.
This CL fixes the source of all etype mismatches for 6g, all but one case for 8g, and depressingly few cases for 5g. (I believe that extending CL 6819083 to 5g is a prerequisite.) Fixing the remaining cases in 8g and 5g is work for the future.
The etype mismatch fixes are:
* [gc] Slicing changed the type of the base pointer into a uintptr in order to perform arithmetic on it. Instead, support addition directly on pointers.
* [*g] OSPTR was giving type uintptr to slice base pointers; undo that. This arose, for example, while compiling copy(dst, src).
* [8g] 64 bit float conversion was assigning int64 type during codegen, overwriting the existing uint64 type.
Note that some etype mismatches are appropriate, such as a struct with a single field or an array with a single element.
With these fixes, the number of registerizations that occur while running make.bash for 6g increases ~10%. Hello world binary size shrinks ~1.5%. Running all benchmarks in the standard library show performance improvements ranging from nominal to substantive (>10%); a full comparison using 6g on my laptop is available at https://gist.github.com/josharian/8f9b5beb46667c272064. The microbenchmarks must be taken with a grain of salt; see issue 7920. The few benchmarks that show real regressions are likely due to issue 7920. I manually examined the generated code for the top few regressions and none had any assembly output changes. The few benchmarks that show extraordinary improvements are likely also due to issue 7920.
Performance results from 8g appear similar to 6g.
5g shows no performance improvements. This is not surprising, given the discussion above.
Update #7316
LGTM=rsc
R=rsc, daniel.morsing, bradfitz
CC=dave, golang-codereviews
https://golang.org/cl/91850043
Before we used line 1 of the first source file.
This should be clearer.
Fixes#4388.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/92250044
If the ... element type contained no pointers,
then the escape analysis did not track the ... itself.
This manifested in an escaping ...byte being treated
as non-escaping.
Fixes#7934.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/100310043
The register allocator decides which variables should be placed into registers by charging for each load/store and crediting for each use, and then selecting an allocation with minimal cost. NOPs will be eliminated, however, so using a variable in a NOP should not generate credit.
Issue 7867 arises from attempted registerization of multi-word variables because they are used in NOPs. By not crediting for that use, they will no longer be considered for registerization.
This fix could theoretically lead to better register allocation, but NOPs are rare relative to other instructions.
Fixes#7867.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/94810044
Variables declared with 'var' have no sym->def.
Fixes#7794.
LGTM=rsc
R=golang-codereviews, bradfitz, rsc
CC=golang-codereviews
https://golang.org/cl/88360043
The new code is adapted from the Go 1.2 nosplit code,
but it does not have the bug reported in issue 7623:
g% go run nosplit.go
g% go1.2 run nosplit.go
BUG
rejected incorrectly:
main 0 call f; f 120
linker output:
# _/tmp/go-test-nosplit021064539
main.main: nosplit stack overflow
120 guaranteed after split check in main.main
112 on entry to main.f
-8 after main.f uses 120
g%
Fixes#6931.
Fixes#7623.
LGTM=iant
R=golang-codereviews, iant, ality
CC=golang-codereviews, r
https://golang.org/cl/88190043
Trying to make GODEBUG=gcdead=1 work with liveness
and in particular ambiguously live variables.
1. In the liveness computation, mark all ambiguously live
variables as live for the entire function, except the entry.
They are zeroed directly after entry, and we need them not
to be poisoned thereafter.
2. In the liveness computation, compute liveness (and deadness)
for all parameters, not just pointer-containing parameters.
Otherwise gcdead poisons untracked scalar parameters and results.
3. Fix liveness debugging print for -live=2 to use correct bitmaps.
(Was not updated for compaction during compaction CL.)
4. Correct varkill during map literal initialization.
Was killing the map itself instead of the inserted value temp.
5. Disable aggressive varkill cleanup for call arguments if
the call appears in a defer or go statement.
6. In the garbage collector, avoid bug scanning empty
strings. An empty string is two zeros. The multiword
code only looked at the first zero and then interpreted
the next two bits in the bitmap as an ordinary word bitmap.
For a string the bits are 11 00, so if a live string was zero
length with a 0 base pointer, the poisoning code treated
the length as an ordinary word with code 00, meaning it
needed poisoning, turning the string into a poison-length
string with base pointer 0. By the same logic I believe that
a live nil slice (bits 11 01 00) will have its cap poisoned.
Always scan full multiword struct.
7. In the runtime, treat both poison words (PoisonGC and
PoisonStack) as invalid pointers that warrant crashes.
Manual testing as follows:
- Create a script called gcdead on your PATH containing:
#!/bin/bash
GODEBUG=gcdead=1 GOGC=10 GOTRACEBACK=2 exec "$@"
- Now you can build a test and then run 'gcdead ./foo.test'.
- More importantly, you can run 'go test -short -exec gcdead std'
to run all the tests.
Fixes#7676.
While here, enable the precise scanning of slices, since that was
disabled due to bugs like these. That now works, both with and
without gcdead.
Fixes#7549.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83410044
1. Use n->alloc, not n->left, to hold the allocated temp being
passed from orderstmt/orderexpr to walk.
2. Treat method values the same as closures.
3. Use killed temporary for composite literal passed to
non-escaping function argument.
4. Clean temporaries promptly in if and for statements.
5. Clean temporaries promptly in select statements.
As part of this, move all the temporary-generating logic
out of select.c into order.c, so that the temporaries can
be reclaimed.
With the new temporaries, can re-enable the 1-entry
select optimization. Fixes issue 7672.
While we're here, fix a 1-line bug in select processing
turned up by the new liveness test (but unrelated; select.c:72).
Fixes#7686.
6. Clean temporaries (but not particularly promptly) in switch
and range statements.
7. Clean temporary used during convT2E/convT2I.
8. Clean temporaries promptly during && and || expressions.
---
CL 81940043 reduced the number of ambiguously live temps
in the godoc binary from 860 to 711.
CL 83090046 reduced the number from 711 to 121.
This CL reduces the number from 121 to 23.
15 the 23 that remain are in fact ambiguously live.
The final 8 could be fixed but are not trivial and
not common enough to warrant work at this point
in the release cycle.
These numbers only count ambiguously live temps,
not ambiguously live user-declared variables.
There are 18 such variables in the godoc binary after this CL,
so a total of 41 ambiguously live temps or user-declared
variables.
The net effect is that zeroing anything on entry to a function
should now be a rare event, whereas earlier it was the
common case.
This is good enough for Go 1.3, and probably good
enough for future releases too.
Fixes#7345.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83000048
1. In functions with heap-allocated result variables or with
defer statements, the return sequence requires more than
just a single RET instruction. There is an optimization that
arranges for all returns to jump to a single copy of the return
epilogue in this case. Unfortunately, that optimization is
fundamentally incompatible with PC-based liveness information:
it takes PCs at many different points in the function and makes
them all land at one PC, making the combined liveness information
at that target PC a mess. Disable this optimization, so that each
return site gets its own copy of the 'call deferreturn' and the
copying of result variables back from the heap.
This removes quite a few spurious 'ambiguously live' variables.
2. Let orderexpr allocate temporaries that are passed by address
to a function call and then die on return, so that we can arrange
an appropriate VARKILL.
2a. Do this for ... slices.
2b. Do this for closure structs.
2c. Do this for runtime.concatstring, which is the implementation
of large string additions. Change representation of OADDSTR to
an explicit list in typecheck to avoid reconstructing list in both
walk and order.
3. Let orderexpr allocate the temporary variable copies used for
range loops, so that they can be killed when the loop is over.
Similarly, let it allocate the temporary holding the map iterator.
CL 81940043 reduced the number of ambiguously live temps
in the godoc binary from 860 to 711.
This CL reduces the number to 121. Still more to do, but another
good checkpoint.
Update #7345
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83090046
The new channel and map runtime routines take pointers
to values, typically temporaries. Without help, the compiler
cannot tell when those temporaries stop being needed,
because it isn't sure what happened to the pointer.
Arrange to insert explicit VARKILL instructions for these
temporaries so that the liveness analysis can avoid seeing
them as "ambiguously live".
The change is made in order.c, which was already in charge of
introducing temporaries to preserve the order-of-evaluation
guarantees. Now its job has expanded to include introducing
temporaries as needed by runtime routines, and then also
inserting the VARKILL annotations for all these temporaries,
so that their lifetimes can be shortened.
In order to do its job for the map runtime routines, order.c arranges
that all map lookups or map assignments have the form:
x = m[k]
x, y = m[k]
m[k] = x
where x, y, and k are simple variables (often temporaries).
Likewise, receiving from a channel is now always:
x = <-c
In order to provide the map guarantee, order.c is responsible for
rewriting x op= y into x = x op y, so that m[k] += z becomes
t = m[k]
t2 = t + z
m[k] = t2
While here, fix a few bugs in order.c's traversal: it was failing to
walk into select and switch case bodies, so order of evaluation
guarantees were not preserved in those situations.
Added tests to test/reorder2.go.
Fixes#7671.
In gc/popt's temporary-merging optimization, allow merging
of temporaries with their address taken as long as the liveness
ranges do not intersect. (There is a good chance of that now
that we have VARKILL annotations to limit the liveness range.)
Explicitly killing temporaries cuts the number of ambiguously
live temporaries that must be zeroed in the godoc binary from
860 to 711, or -17%. There is more work to be done, but this
is a good checkpoint.
Update #7345
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/81940043