Now the only difference between dev.cc and dev.garbage
is the runtime conversion on the one side and the
garbage collection on the other. They both have the
same set of changes from default and dev.power64.
LGTM=austin
R=austin
CC=golang-codereviews
https://golang.org/cl/172570043
Now each C printf, Go print, or Go println is guaranteed
not to be interleaved with other calls of those functions.
This should help when debugging concurrent failures.
LGTM=rlh
R=rlh
CC=golang-codereviews
https://golang.org/cl/169120043
On power64x, this one line in live.go reports that t is live
because of missing optimization passes. This isn't what this
test is trying to test, so shuffle bad40 so that it still
accomplishes the intent of the test without also depending on
optimization.
LGTM=rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/167110043
A write *p = x that needs a write barrier (not all do)
now turns into runtime.writebarrierptr(p, x)
or one of the other variants.
The write barrier implementations are trivial.
The goal here is to emit the calls in the correct places
and to incur the cost of those function calls in the Go 1.4 cycle.
Performance on the Go 1 benchmark suite below.
Remember, the goal is to slow things down (and be correct).
We will look into optimizations in separate CLs, as part of
the process of comparing Go 1.3 against tip in order to make
sure Go 1.4 runs at least as fast as Go 1.3.
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 3118336716 3452876110 +10.73%
BenchmarkFannkuch11 3184497677 3211552284 +0.85%
BenchmarkFmtFprintfEmpty 89.9 107 +19.02%
BenchmarkFmtFprintfString 236 287 +21.61%
BenchmarkFmtFprintfInt 246 278 +13.01%
BenchmarkFmtFprintfIntInt 395 458 +15.95%
BenchmarkFmtFprintfPrefixedInt 343 378 +10.20%
BenchmarkFmtFprintfFloat 477 525 +10.06%
BenchmarkFmtManyArgs 1446 1707 +18.05%
BenchmarkGobDecode 14398047 14685958 +2.00%
BenchmarkGobEncode 12557718 12947104 +3.10%
BenchmarkGzip 453462345 472413285 +4.18%
BenchmarkGunzip 114226016 115127398 +0.79%
BenchmarkHTTPClientServer 114689 112122 -2.24%
BenchmarkJSONEncode 24914536 26135942 +4.90%
BenchmarkJSONDecode 86832877 103620289 +19.33%
BenchmarkMandelbrot200 4833452 4898780 +1.35%
BenchmarkGoParse 4317976 4835474 +11.98%
BenchmarkRegexpMatchEasy0_32 150 166 +10.67%
BenchmarkRegexpMatchEasy0_1K 393 402 +2.29%
BenchmarkRegexpMatchEasy1_32 125 142 +13.60%
BenchmarkRegexpMatchEasy1_1K 1010 1236 +22.38%
BenchmarkRegexpMatchMedium_32 232 301 +29.74%
BenchmarkRegexpMatchMedium_1K 76963 102721 +33.47%
BenchmarkRegexpMatchHard_32 3833 5463 +42.53%
BenchmarkRegexpMatchHard_1K 119668 161614 +35.05%
BenchmarkRevcomp 763449047 706768534 -7.42%
BenchmarkTemplate 124954724 134834549 +7.91%
BenchmarkTimeParse 517 511 -1.16%
BenchmarkTimeFormat 501 514 +2.59%
benchmark old MB/s new MB/s speedup
BenchmarkGobDecode 53.31 52.26 0.98x
BenchmarkGobEncode 61.12 59.28 0.97x
BenchmarkGzip 42.79 41.08 0.96x
BenchmarkGunzip 169.88 168.55 0.99x
BenchmarkJSONEncode 77.89 74.25 0.95x
BenchmarkJSONDecode 22.35 18.73 0.84x
BenchmarkGoParse 13.41 11.98 0.89x
BenchmarkRegexpMatchEasy0_32 213.30 191.72 0.90x
BenchmarkRegexpMatchEasy0_1K 2603.92 2542.74 0.98x
BenchmarkRegexpMatchEasy1_32 254.00 224.93 0.89x
BenchmarkRegexpMatchEasy1_1K 1013.53 827.98 0.82x
BenchmarkRegexpMatchMedium_32 4.30 3.31 0.77x
BenchmarkRegexpMatchMedium_1K 13.30 9.97 0.75x
BenchmarkRegexpMatchHard_32 8.35 5.86 0.70x
BenchmarkRegexpMatchHard_1K 8.56 6.34 0.74x
BenchmarkRevcomp 332.92 359.62 1.08x
BenchmarkTemplate 15.53 14.39 0.93x
LGTM=rlh
R=rlh
CC=dvyukov, golang-codereviews, iant, khr, r
https://golang.org/cl/136380043
created panic1.go just so diffs were available.
After this CL is in, I'd like to move panic.go -> defer.go
and panic1.go -> panic.go.
LGTM=rsc
R=rsc, khr
CC=golang-codereviews
https://golang.org/cl/133530045
We need to change the interface value representation for
concurrent garbage collection, so that there is no ambiguity
about whether the data word holds a pointer or scalar.
This CL does NOT make any representation changes.
Instead, it removes representation assumptions from
various pieces of code throughout the tree.
The isdirectiface function in cmd/gc/subr.c is now
the only place that decides that policy.
The policy propagates out from there in the reflect
metadata, as a new flag in the internal kind value.
A follow-up CL will change the representation by
changing the isdirectiface function. If that CL causes
problems, it will be easy to roll back.
Update #8405.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, r
https://golang.org/cl/129090043
This change introduces gomallocgc, a Go clone of mallocgc.
Only a few uses have been moved over, so there are still
lots of uses from C. Many of these C uses will be moved
over to Go (e.g. in slice.goc), but probably not all.
What should remain of C's mallocgc is an open question.
LGTM=rsc, dvyukov
R=rsc, khr, dave, bradfitz, dvyukov
CC=golang-codereviews
https://golang.org/cl/108840046
benchmark old ns/op new ns/op delta
BenchmarkSelectUncontended 220 165 -25.00%
BenchmarkSelectContended 209 161 -22.97%
BenchmarkSelectProdCons 1042 904 -13.24%
But more importantly this change will allow
to get rid of free function in runtime.
Fixes#6494.
LGTM=rsc, khr
R=golang-codereviews, rsc, dominik.honnef, khr
CC=golang-codereviews, remyoudompheng
https://golang.org/cl/107670043
The 'address taken' bit in a function variable was not
propagating into the inlined copies, causing incorrect
liveness information.
LGTM=dsymonds, bradfitz
R=golang-codereviews, bradfitz
CC=dsymonds, golang-codereviews, iant, khr, r
https://golang.org/cl/96670046
[Same as CL 102820043 except applied changes to 6g/gsubr.c
also to 5g/gsubr.c and 8g/gsubr.c. The problem I had last night
trying to do that was that 8g's copy of nodarg has different
(but equivalent) control flow and I was pasting the new code
into the wrong place.]
Description from CL 102820043:
The 'nodarg' function is used to obtain a Node*
representing a function argument or result.
It returned a brand new Node*, but that violates
the guarantee in most places in the compiler that
two Node*s refer to the same variable if and only if
they are the same Node* pointer. Reestablish that
invariant by making nodarg return a preexisting
named variable if present.
Having fixed that, avoid any copy during x=x in
componentgen, because the VARDEF we emit
before the copy marks the lhs x as dead incorrectly.
The change in walk.c avoids modifying the result
of nodarg. This was the only place in the compiler
that did so.
Fixes#8097.
LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, iant, khr, r
https://golang.org/cl/103750043
Breaks 386 and arm builds.
The obvious reason is that this CL only edited 6g/gsubr.c
and failed to edit 5g/gsubr.c and 8g/gsubr.c.
However, the obvious CL applying the same edit to those
files (CL 101900043) causes mysterious build failures
in various of the standard package tests, usually involving
reflect. Something deep and subtle is broken but only on
the 32-bit systems.
Undo this CL for now.
««« original CL description
cmd/gc: fix x=x crash
The 'nodarg' function is used to obtain a Node*
representing a function argument or result.
It returned a brand new Node*, but that violates
the guarantee in most places in the compiler that
two Node*s refer to the same variable if and only if
they are the same Node* pointer. Reestablish that
invariant by making nodarg return a preexisting
named variable if present.
Having fixed that, avoid any copy during x=x in
componentgen, because the VARDEF we emit
before the copy marks the lhs x as dead incorrectly.
The change in walk.c avoids modifying the result
of nodarg. This was the only place in the compiler
that did so.
Fixes#8097.
LGTM=r, khr
R=golang-codereviews, r, khr
CC=golang-codereviews, iant
https://golang.org/cl/102820043
»»»
TBR=r
CC=golang-codereviews, khr
https://golang.org/cl/95660043
The 'nodarg' function is used to obtain a Node*
representing a function argument or result.
It returned a brand new Node*, but that violates
the guarantee in most places in the compiler that
two Node*s refer to the same variable if and only if
they are the same Node* pointer. Reestablish that
invariant by making nodarg return a preexisting
named variable if present.
Having fixed that, avoid any copy during x=x in
componentgen, because the VARDEF we emit
before the copy marks the lhs x as dead incorrectly.
The change in walk.c avoids modifying the result
of nodarg. This was the only place in the compiler
that did so.
Fixes#8097.
LGTM=r, khr
R=golang-codereviews, r, khr
CC=golang-codereviews, iant
https://golang.org/cl/102820043
Trying to make GODEBUG=gcdead=1 work with liveness
and in particular ambiguously live variables.
1. In the liveness computation, mark all ambiguously live
variables as live for the entire function, except the entry.
They are zeroed directly after entry, and we need them not
to be poisoned thereafter.
2. In the liveness computation, compute liveness (and deadness)
for all parameters, not just pointer-containing parameters.
Otherwise gcdead poisons untracked scalar parameters and results.
3. Fix liveness debugging print for -live=2 to use correct bitmaps.
(Was not updated for compaction during compaction CL.)
4. Correct varkill during map literal initialization.
Was killing the map itself instead of the inserted value temp.
5. Disable aggressive varkill cleanup for call arguments if
the call appears in a defer or go statement.
6. In the garbage collector, avoid bug scanning empty
strings. An empty string is two zeros. The multiword
code only looked at the first zero and then interpreted
the next two bits in the bitmap as an ordinary word bitmap.
For a string the bits are 11 00, so if a live string was zero
length with a 0 base pointer, the poisoning code treated
the length as an ordinary word with code 00, meaning it
needed poisoning, turning the string into a poison-length
string with base pointer 0. By the same logic I believe that
a live nil slice (bits 11 01 00) will have its cap poisoned.
Always scan full multiword struct.
7. In the runtime, treat both poison words (PoisonGC and
PoisonStack) as invalid pointers that warrant crashes.
Manual testing as follows:
- Create a script called gcdead on your PATH containing:
#!/bin/bash
GODEBUG=gcdead=1 GOGC=10 GOTRACEBACK=2 exec "$@"
- Now you can build a test and then run 'gcdead ./foo.test'.
- More importantly, you can run 'go test -short -exec gcdead std'
to run all the tests.
Fixes#7676.
While here, enable the precise scanning of slices, since that was
disabled due to bugs like these. That now works, both with and
without gcdead.
Fixes#7549.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83410044
1. Use n->alloc, not n->left, to hold the allocated temp being
passed from orderstmt/orderexpr to walk.
2. Treat method values the same as closures.
3. Use killed temporary for composite literal passed to
non-escaping function argument.
4. Clean temporaries promptly in if and for statements.
5. Clean temporaries promptly in select statements.
As part of this, move all the temporary-generating logic
out of select.c into order.c, so that the temporaries can
be reclaimed.
With the new temporaries, can re-enable the 1-entry
select optimization. Fixes issue 7672.
While we're here, fix a 1-line bug in select processing
turned up by the new liveness test (but unrelated; select.c:72).
Fixes#7686.
6. Clean temporaries (but not particularly promptly) in switch
and range statements.
7. Clean temporary used during convT2E/convT2I.
8. Clean temporaries promptly during && and || expressions.
---
CL 81940043 reduced the number of ambiguously live temps
in the godoc binary from 860 to 711.
CL 83090046 reduced the number from 711 to 121.
This CL reduces the number from 121 to 23.
15 the 23 that remain are in fact ambiguously live.
The final 8 could be fixed but are not trivial and
not common enough to warrant work at this point
in the release cycle.
These numbers only count ambiguously live temps,
not ambiguously live user-declared variables.
There are 18 such variables in the godoc binary after this CL,
so a total of 41 ambiguously live temps or user-declared
variables.
The net effect is that zeroing anything on entry to a function
should now be a rare event, whereas earlier it was the
common case.
This is good enough for Go 1.3, and probably good
enough for future releases too.
Fixes#7345.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83000048
1. In functions with heap-allocated result variables or with
defer statements, the return sequence requires more than
just a single RET instruction. There is an optimization that
arranges for all returns to jump to a single copy of the return
epilogue in this case. Unfortunately, that optimization is
fundamentally incompatible with PC-based liveness information:
it takes PCs at many different points in the function and makes
them all land at one PC, making the combined liveness information
at that target PC a mess. Disable this optimization, so that each
return site gets its own copy of the 'call deferreturn' and the
copying of result variables back from the heap.
This removes quite a few spurious 'ambiguously live' variables.
2. Let orderexpr allocate temporaries that are passed by address
to a function call and then die on return, so that we can arrange
an appropriate VARKILL.
2a. Do this for ... slices.
2b. Do this for closure structs.
2c. Do this for runtime.concatstring, which is the implementation
of large string additions. Change representation of OADDSTR to
an explicit list in typecheck to avoid reconstructing list in both
walk and order.
3. Let orderexpr allocate the temporary variable copies used for
range loops, so that they can be killed when the loop is over.
Similarly, let it allocate the temporary holding the map iterator.
CL 81940043 reduced the number of ambiguously live temps
in the godoc binary from 860 to 711.
This CL reduces the number to 121. Still more to do, but another
good checkpoint.
Update #7345
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83090046
The new channel and map runtime routines take pointers
to values, typically temporaries. Without help, the compiler
cannot tell when those temporaries stop being needed,
because it isn't sure what happened to the pointer.
Arrange to insert explicit VARKILL instructions for these
temporaries so that the liveness analysis can avoid seeing
them as "ambiguously live".
The change is made in order.c, which was already in charge of
introducing temporaries to preserve the order-of-evaluation
guarantees. Now its job has expanded to include introducing
temporaries as needed by runtime routines, and then also
inserting the VARKILL annotations for all these temporaries,
so that their lifetimes can be shortened.
In order to do its job for the map runtime routines, order.c arranges
that all map lookups or map assignments have the form:
x = m[k]
x, y = m[k]
m[k] = x
where x, y, and k are simple variables (often temporaries).
Likewise, receiving from a channel is now always:
x = <-c
In order to provide the map guarantee, order.c is responsible for
rewriting x op= y into x = x op y, so that m[k] += z becomes
t = m[k]
t2 = t + z
m[k] = t2
While here, fix a few bugs in order.c's traversal: it was failing to
walk into select and switch case bodies, so order of evaluation
guarantees were not preserved in those situations.
Added tests to test/reorder2.go.
Fixes#7671.
In gc/popt's temporary-merging optimization, allow merging
of temporaries with their address taken as long as the liveness
ranges do not intersect. (There is a good chance of that now
that we have VARKILL annotations to limit the liveness range.)
Explicitly killing temporaries cuts the number of ambiguously
live temporaries that must be zeroed in the godoc binary from
860 to 711, or -17%. There is more work to be done, but this
is a good checkpoint.
Update #7345
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/81940043
1. On entry to a function, only zero the ambiguously live stack variables.
Before, we were zeroing all stack variables containing pointers.
The zeroing is pretty inefficient right now (issue 7624), but there are also
too many stack variables detected as ambiguously live (issue 7345),
and that must be addressed before deciding how to improve the zeroing code.
(Changes in 5g/ggen.c, 6g/ggen.c, 8g/ggen.c, gc/pgen.c)
Fixes#7647.
2. Make the regopt word-based liveness analysis preserve the
whole-variable liveness property expected by the garbage collection
bitmap liveness analysis. That is, if the regopt liveness decides that
one word in a struct needs to be preserved, make sure it preserves
the entire struct. This is particularly important for multiword values
such as strings, slices, and interfaces, in which all the words need
to be present in order to understand the meaning.
(Changes in 5g/reg.c, 6g/reg.c, 8g/reg.c.)
Fixes#7591.
3. Make the regopt word-based liveness analysis treat a variable
as having its address taken - which makes it preserved across
all future calls - whenever n->addrtaken is set, for consistency
with the gc bitmap liveness analysis, even if there is no machine
instruction actually taking the address. In this case n->addrtaken
is incorrect (a nicer way to put it is overconservative), and ideally
there would be no such cases, but they can happen and the two
analyses need to agree.
(Changes in 5g/reg.c, 6g/reg.c, 8g/reg.c; test in bug484.go.)
Fixes crashes found by turning off "zero everything" in step 1.
4. Remove spurious VARDEF annotations. As the comment in
gc/pgen.c explains, the VARDEF must immediately precede
the initialization. It cannot be too early, and it cannot be too late.
In particular, if a function call sits between the VARDEF and the
actual machine instructions doing the initialization, the variable
will be treated as live during that function call even though it is
uninitialized, leading to problems.
(Changes in gc/gen.c; test in live.go.)
Fixes crashes found by turning off "zero everything" in step 1.
5. Do not treat loading the address of a wide value as a signal
that the value must be initialized. Instead depend on the existence
of a VARDEF or the first actual read/write of a word in the value.
If the load is in order to pass the address to a function that does
the actual initialization, treating the load as an implicit VARDEF
causes the same problems as described in step 4.
The alternative is to arrange to zero every such value before
passing it to the real initialization function, but this is a much
easier and more efficient change.
(Changes in gc/plive.c.)
Fixes crashes found by turning off "zero everything" in step 1.
6. Treat wide input parameters with their address taken as
initialized on entry to the function. Otherwise they look
"ambiguously live" and we will try to emit code to zero them.
(Changes in gc/plive.c.)
Fixes crashes found by turning off "zero everything" in step 1.
7. An array of length 0 has no pointers, even if the element type does.
Without this change, the zeroing code complains when asked to
clear a 0-length array.
(Changes in gc/reflect.c.)
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/80160044
The VARDEF placement must be before the initialization
but after any final use. If you have something like s = ... using s ...
the rhs must be evaluated, then the VARDEF, then the lhs
assigned.
There is a large comment in pgen.c on gvardef explaining
this in more detail.
This CL also includes Ian's suggestions from earlier CLs,
namely commenting the use of mode in link.h and fixing
the precedence of the ~r check in dcl.c.
This CL enables the check that if liveness analysis decides
a variable is live on entry to the function, that variable must
be a function parameter (not a result, and not a local variable).
If this check fails, it indicates a bug in the liveness analysis or
in the generated code being analyzed.
The race detector generates invalid code for append(x, y...).
The code declares a temporary t and then uses cap(t) before
initializing t. The new liveness check catches this bug and
stops the compiler from writing out the buggy code.
Consequently, this CL disables the race detector tests in
run.bash until the race detector bug can be fixed
(golang.org/issue/7334).
Except for the race detector bug, the liveness analysis check
does not detect any problems (this CL and the previous CLs
fixed all the detected problems).
The net test still fails with GOGC=0 but the rest of the tests
now pass or time out (because GOGC=0 is so slow).
TBR=iant
CC=golang-codereviews
https://golang.org/cl/64170043
When the liveness code doesn't know a function doesn't return
(but the generated code understands that), the liveness analysis
invents a control flow edge that is not really there, which can cause
variables to seem spuriously live. This is particularly bad when the
variables are uninitialized.
TBR=iant
CC=golang-codereviews
https://golang.org/cl/63720043
The registerization code needs the function to end in a RET,
even if that RET is actually unreachable.
The liveness code needs to avoid such unreachable RETs.
It had a special case for final RET after JMP, but no case
for final RET after UNDEF. Instead of expanding the special
cases, let fixjmp - which already knows what is and is not
reachable definitively - mark the unreachable RET so that
the liveness code can identify it.
TBR=iant
CC=golang-codereviews
https://golang.org/cl/63680043
Any initialization of a variable by a block copy or block zeroing
or by multiple assignments (componentwise copying or zeroing
of a multiword variable) needs to emit a VARDEF. These cases were not.
Fixes#7205.
TBR=iant
CC=golang-codereviews
https://golang.org/cl/63650044
Before, an unnamed return value turned into an ONAME node n with n->sym
named ~anon%d, and n->orig == n.
A blank-named return value turned into an ONAME node n with n->sym
named ~anon%d but n->orig == the original blank n. Code generation and
printing uses n->orig, so that this node formatted as _.
But some code does not use n->orig. In particular the liveness code does
not know about the n->orig convention and so mishandles blank identifiers.
It is possible to fix but seemed better to avoid the confusion entirely.
Now the first kind of node is named ~r%d and the second ~b%d; both have
n->orig == n, so that it doesn't matter whether code uses n or n->orig.
After this change the ->orig field is only used for other kinds of expressions,
not for ONAME nodes.
This requires distinguishing ~b from ~r names in a few places that care.
It fixes a liveness analysis bug without actually changing the liveness code.
TBR=ken2
CC=golang-codereviews
https://golang.org/cl/63630043
This CL makes the bitmaps a little more precise about variables
that have their address taken but for which the address does not
escape to the heap, so that the variables are kept in the stack frame
rather than allocated on the heap.
The code before this CL handled these variables by treating every
return statement as using every such variable and depending on
liveness analysis to essentially treat the variable as live during the
entire function. That approach has false positives and (worse) false
negatives. That is, it's both sloppy and buggy:
func f(b1, b2 bool) { // x live here! (sloppy)
if b2 {
print(0) // x live here! (sloppy)
return
}
var z **int
x := new(int)
*x = 42
z = &x
print(**z) // x live here (conservative)
if b2 {
print(1) // x live here (conservative)
return
}
for {
print(**z) // x not live here (buggy)
}
}
The first two liveness annotations (marked sloppy) are clearly
wrong: x cannot be live if it has not yet been declared.
The last liveness annotation (marked buggy) is also wrong:
x is live here as *z, but because there is no return statement
reachable from this point in the code, the analysis treats x as dead.
This CL changes the liveness calculation to mark such variables
live exactly at points in the code reachable from the variable
declaration. This keeps the conservative decisions but fixes
the sloppy and buggy ones.
The CL also detects ambiguously live variables, those that are
being marked live but may not actually have been initialized,
such as in this example:
func f(b1 bool) {
var z **int
if b1 {
x := new(int)
*x = 42
z = &x
} else {
y := new(int)
*y = 54
z = &y
}
print(**z) // x, y live here (conservative)
}
Since the print statement is reachable from the declaration of x,
x must conservatively be marked live. The same goes for y.
Although both x and y are marked live at the print statement,
clearly only one of them has been initialized. They are both
"ambiguously live".
These ambiguously live variables cause problems for garbage
collection: the collector cannot ignore them but also cannot
depend on them to be initialized to valid pointer values.
Ambiguously live variables do not come up too often in real code,
but recent changes to the way map and interface runtime functions
are invoked has created a large number of ambiguously live
compiler-generated temporary variables. The next CL will adjust
the analysis to understand these temporaries better, to make
ambiguously live variables fairly rare.
Once ambiguously live variables are rare enough, another CL will
introduce code at the beginning of a function to zero those
slots on the stack. At that point the garbage collector and the
stack copying routines will be able to depend on the guarantee that
if a slot is marked as live in a liveness bitmap, it is initialized.
R=khr
CC=golang-codereviews, iant
https://golang.org/cl/51810043