Make sure dequeueing from a channel queue does not exhibit quadratic time behavior.
Change-Id: Ifb7c709b026f74c7e783610d4914dd92909a441b
Reviewed-on: https://go-review.googlesource.com/1212
Reviewed-by: Russ Cox <rsc@golang.org>
Usage:
	fibo <n>     compute fibonacci(n), n must be >= 0
	fibo -bench  benchmark fibonacci computation (takes about 1 min)
Additional flags:
	-half   add values using two half-digit additions
	-opt    optimize memory allocation through reuse
	-short  only print the first 10 digits of very large fibonacci numbers
This change was reviewed in detail as https://codereview.appspot.com/168480043 .
Change-Id: I7c86d49c5508532ea6206d00f424cf2117d2fe41
Reviewed-on: https://go-review.googlesource.com/1211
Reviewed-by: Russ Cox <rsc@golang.org>
When we start work on Gerrit, ppc64 and garbage collection
work will continue in the master branch, not the dev branches.
(We may still use dev branches for other things later, but
these are ready to be merged, and doing it now, before moving
to Git, means we don't have to have dev branches working
in the Gerrit workflow on day one.)
TBR=rlh
CC=golang-codereviews
https://golang.org/cl/183140043
640 bytes ought to be enough for anybody.
We'll bring this back down before Go 1.5. That's issue 9214.
TBR=rlh
CC=golang-codereviews
https://golang.org/cl/188730043
The SudoG used to sit on the stack, so it was cheap to allocate
and didn't need to be cleaned up when finished.
For the conversion to Go, we had to move sudog off the stack
for a few reasons, so we added a cache of recently used sudogs
to keep allocation cheap. But we didn't add any of the necessary
cleanup before adding a SudoG to the new cache, and so the cached
SudoGs had stale pointers inside them that have caused all sorts
of awful, hard to debug problems.
CL 155760043 made sure SudoG.elem is cleaned up.
CL 150520043 made sure SudoG.selectdone is cleaned up.
This CL makes sure SudoG.next, SudoG.prev, and SudoG.waitlink
are cleaned up. I should have done this when I did the other two
fields; instead I wasted a week tracking down a leak they caused.
A dangling SudoG.waitlink can point into a sudogcache list that
has been "forgotten" in order to let the GC collect it, but that
dangling .waitlink keeps the list from being collected.
And then the list holding the SudoG with the dangling waitlink
can find itself in the same situation, and so on. We end up
with lists of lists of unusable SudoGs that are still linked into
the object graph and never collected (given the right mix of
non-trivial selects and non-channel synchronization).
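The cleanup rule described above can be sketched as follows. This is a minimal illustration, not the runtime's code: the waiter type, the cache type, and the release and acquire methods are hypothetical stand-ins for SudoG and its cache. The point is only that every pointer field is cleared before the object goes into the cache.

```go
package main

import "fmt"

// waiter is a hypothetical stand-in for the runtime's SudoG.
type waiter struct {
	next, prev *waiter
	waitlink   *waiter
	elem       *int
}

// cache is a simple free list of waiters (an assumption for this sketch).
type cache struct {
	free *waiter
}

// release clears every pointer field before caching w, so that a cached
// object cannot keep other lists or values reachable.
func (c *cache) release(w *waiter) {
	w.next = nil
	w.prev = nil
	w.waitlink = nil
	w.elem = nil
	w.next = c.free // the only link the cache itself needs
	c.free = w
}

// acquire returns a cached waiter, or a fresh one if the cache is empty.
func (c *cache) acquire() *waiter {
	w := c.free
	if w == nil {
		return new(waiter)
	}
	c.free = w.next
	w.next = nil
	return w
}

func main() {
	var c cache
	a, b := new(waiter), new(waiter)
	a.waitlink = b
	a.prev = b
	c.release(a)
	w := c.acquire()
	fmt.Println(w == a, w.next == nil, w.prev == nil, w.waitlink == nil)
}
```

Clearing in release rather than in acquire means a forgotten cache list holds no pointers into other lists while it waits to be collected.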
More details in golang.org/issue/9110.
Fixes #9110.
LGTM=r
R=r
CC=dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/177870043
This is to reduce the delta between dev.cc and dev.garbage to just garbage collector changes.
These are the files that had merge conflicts and have been edited by hand:
malloc.go
mem_linux.go
mgc.go
os1_linux.go
proc1.go
panic1.go
runtime1.go
LGTM=austin
R=austin
CC=golang-codereviews
https://golang.org/cl/174180043
Now the only difference between dev.cc and dev.garbage
is the runtime conversion on the one side and the
garbage collection on the other. They both have the
same set of changes from default and dev.power64.
LGTM=austin
R=austin
CC=golang-codereviews
https://golang.org/cl/172570043
The remaining run-only tests will be migrated to run.go in another CL.
This CL will break the build due to issues 8746 and 8806.
Update #4139
Update #8746
Update #8806
LGTM=rsc
R=rsc, bradfitz, iant
CC=golang-codereviews
https://golang.org/cl/144630044
Now each C printf, Go print, or Go println is guaranteed
not to be interleaved with other calls of those functions.
This should help when debugging concurrent failures.
LGTM=rlh
R=rlh
CC=golang-codereviews
https://golang.org/cl/169120043
One failing case this removes is:
	var bytes = []byte("hello, world")
	var copy_bytes = bytes
We could handle this in the compiler, but it requires a special
case for a variable that is initialized to the value of a
variable that is initialized to a string literal converted to
[]byte. This seems an unlikely case--it never occurs in the
standard library--and it seems unnecessary to write the code to
handle it.
If we do want to support this case, one approach is
https://golang.org/cl/171840043.
The other failing cases are of the form
	var bx bool
	var copy_bx = bx
The compiler used to initialize copy_bx to false. However,
that led to issue 7665, since bx may be initialized in non-Go
code. The compiler no longer assumes that bx must be false,
so copy_bx can not be statically initialized.
We can fix these with https://golang.org/cl/169040043
if we also pass -complete to the compiler as part of this
test. This is OK but it's too late in the release cycle.
Fixes #8746.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/165400043
On power64x, this one line in live.go reports that t is live
because of missing optimization passes. This isn't what this
test is trying to test, so shuffle bad40 so that it still
accomplishes the intent of the test without also depending on
optimization.
LGTM=rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/167110043
The remaining failures in this test are because of incomplete
optimization support on power64x. Tracked in issue 9058.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/168130043
All three cases of clearfat were wrong on power64x.
The cases that handle 1032 bytes and up and 32 bytes and up
both use MOVDU (one directly generated in a loop and the other
via duffzero), which leaves the pointer register pointing at
the *last written* address. The generated code was not
accounting for this, so the byte fill loop was re-zeroing the
last zeroed dword, rather than the bytes following the last
zeroed dword. Fix this by simply adding an additional 8 byte
offset to the byte zeroing loop.
The case that handled under 32 bytes was also wrong. It
didn't update the pointer register at all, so the byte zeroing
loop was simply re-zeroing the beginning of the region. Again,
the fix is to add an offset to the byte zeroing loop to
account for this.
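The offset for the MOVDU cases can be simulated in Go. This is a sketch, not the generated power64 code: zeroRegion is a hypothetical helper, the "pointer register" is modeled as an index into a byte slice, and a store-with-update is modeled as "advance by 8, then write one dword".

```go
package main

import "fmt"

// zeroRegion zeroes mem[base:base+n]. Dwords are written with a simulated
// store-with-update, which leaves ptr at the last written dword.
func zeroRegion(mem []byte, base, n int) {
	ptr := base - 8 // biased so the first pre-increment lands on base
	for i := 0; i < n/8; i++ {
		ptr += 8 // store-with-update: advance first...
		for j := 0; j < 8; j++ {
			mem[ptr+j] = 0 // ...then write one dword
		}
	}
	// ptr holds the address of the last written dword, so the trailing
	// bytes start 8 bytes beyond it. Omitting the +8 would re-zero the
	// last dword and leave the tail untouched.
	for i := 0; i < n%8; i++ {
		mem[ptr+8+i] = 0
	}
}

func main() {
	mem := make([]byte, 40)
	for i := range mem {
		mem[i] = 0xFF
	}
	zeroRegion(mem, 8, 19) // two dwords plus three trailing bytes
	fmt.Println(mem[7], mem[8], mem[26], mem[27])
}
```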
LGTM=dave, bradfitz
R=rsc, dave, bradfitz
CC=golang-codereviews
https://golang.org/cl/168870043
Originally traceback was only used for printing the stack
when an unexpected signal came in. In that case, the
initial PC is taken from the signal and should be used
unaltered. For the callers, the PC is the return address,
which might be on the line after the call; we subtract 1
to get to the CALL instruction.
Traceback is now used for a variety of things, and for
almost all of those the initial PC is a return address,
whether from getcallerpc, or gp->sched.pc, or gp->syscallpc.
In those cases, we need to subtract 1 from this initial PC,
but the traceback code had a hard rule "never subtract 1
from the initial PC", left over from the signal handling days.
Change gentraceback to take a flag that specifies whether
we are tracing a trap.
Change traceback to default to "starting with a return PC",
which is the overwhelmingly common case.
Add tracebacktrap, like traceback but starting with a trap PC.
Use tracebacktrap in signal handlers.
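The rule for the initial PC can be written down in a few lines. This is an illustrative sketch only: lookupPC is a hypothetical helper, not a runtime function, and the addresses in main are made up.

```go
package main

import "fmt"

// lookupPC returns the PC to use when looking up the source position of
// a frame. A trap PC points at the faulting instruction and is used
// unaltered; a return address may already be on the line after the call,
// so we step back by 1 to land inside the CALL instruction.
func lookupPC(pc uintptr, trap bool) uintptr {
	if trap || pc == 0 {
		return pc
	}
	return pc - 1
}

func main() {
	fmt.Printf("%#x\n", lookupPC(0x401005, false)) // return address
	fmt.Printf("%#x\n", lookupPC(0x401005, true))  // trap PC
}
```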
Fixes #7690.
LGTM=iant, r
R=r, iant
CC=golang-codereviews
https://golang.org/cl/167810044
The test just doubled a certain number of times
and then gave up. On a mostly fast but occasionally
slow machine this may never make the test run
long enough to see the linear growth.
Change test to keep doubling until the first round
takes at least a full second, to reduce the effect of
occasional scheduling or other jitter.
The failure we saw had a time for the first round
of around 100ms.
Note that this test still passes once it sees a linear
effect, even with a very small total time.
The timeout here only applies to how long the execution
must be to support a reported failure.
LGTM=khr
R=khr
CC=golang-codereviews, rlh
https://golang.org/cl/164070043
This brings dev.power64 up-to-date with the current tip of
default. go_bootstrap is still panicking with a bad defer
when initializing the runtime (even on amd64).
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/152570049
This also removes pkg/runtime/traceback_lr.c, which was ported
to Go in an earlier commit and then moved to
runtime/traceback.go.
Reviewer: rsc@golang.org
rsc: LGTM
test16 used to fail with gccgo. The withoutRecoverRecursive
test would have failed in some possible implementations.
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/151630043
This brings cmd/gc in line with the spec on this question.
It might break existing code, but that code was not conformant
with the spec.
Credit to Rémy for finding the broken code.
Fixes #6366.
LGTM=r
R=golang-codereviews, r
CC=adonovan, golang-codereviews, gri
https://golang.org/cl/129550043
https://golang.org/cl/152700045/ made it possible for struct literals assigned to globals to use <N> as the RHS. Normally, this is to zero out variables on first use. Because globals are already zero (or their linker initialized value), we just ignored this.
Now that <N> can occur from non-initialization code, we need to emit this code. We don't use <N> for initialization of globals any more, so this shouldn't cause any excessive zeroing.
Fixes #8961.
LGTM=rsc
R=golang-codereviews, rsc
CC=bradfitz, golang-codereviews
https://golang.org/cl/154540044
This fixes the bug in which the linker reports "missing Go
type information" when a -X option refers to a symbol that is
not used.
Fixes #8821.
LGTM=rsc
R=rsc, r
CC=golang-codereviews
https://golang.org/cl/151000043
If there is a leading ·, assume there is a Go prototype and
attach the Go prototype information to the function.
If the function is not called from Go and does not need a
Go prototype, it can be made file-local instead (using name<>(SB)).
This fixes the current BSD build failures, by giving functions like
sync/atomic.StoreUint32 argument stack map information.
Fixes #8753.
LGTM=khr, iant
R=golang-codereviews, iant, khr, bradfitz
CC=golang-codereviews, r, rlh
https://golang.org/cl/142150043
iterdelete's run time varies; occasionally we get unlucky. To reduce spurious failures, average away some of the variation.
On my machine, 8 of 5000 runs (0.16%) failed before this CL. After this CL, there were no failures in 35,000 runs.
I confirmed that this adjusted test still fails before CL 141270043.
LGTM=khr
R=khr
CC=bradfitz, golang-codereviews
https://golang.org/cl/140610043
During anylit run, nodes such as SLICEARR(statictmp, [:])
may be generated and are expected to be found unchanged by
gen_as_init.
In some walks (in particular walkselect), the statement
may be walked again and lowered to its usual form, leading to a
crash.
Fixes #8017.
Fixes #8024.
Fixes #8058.
LGTM=rsc
R=golang-codereviews, dvyukov, gobot, rsc
CC=golang-codereviews
https://golang.org/cl/112080043
Previously it might happen before calling dowidth and
result in a compiler crash.
Fixes #8060.
LGTM=dvyukov, rsc
R=golang-codereviews, dvyukov, gobot, rsc
CC=golang-codereviews
https://golang.org/cl/110980044
A write *p = x that needs a write barrier (not all do)
now turns into runtime.writebarrierptr(p, x)
or one of the other variants.
The write barrier implementations are trivial.
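As a rough picture of the rewrite, the sketch below shows a pointer store routed through a function whose body is just the store. writeBarrierPtr is a hypothetical, generic stand-in for runtime.writebarrierptr, which is not callable from user code and has a different signature.

```go
package main

import "fmt"

// writeBarrierPtr is a hypothetical stand-in for the runtime helper.
// Its implementation is trivial: it only performs the store.
func writeBarrierPtr[T any](dst **T, src *T) {
	*dst = src
}

type node struct {
	val  int
	next *node
}

func main() {
	a, b := &node{val: 1}, &node{val: 2}
	// The compiler would rewrite "a.next = b" into a call like this one.
	writeBarrierPtr(&a.next, b)
	fmt.Println(a.next.val)
}
```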
The goal here is to emit the calls in the correct places
and to incur the cost of those function calls in the Go 1.4 cycle.
Performance on the Go 1 benchmark suite below.
Remember, the goal is to slow things down (and be correct).
We will look into optimizations in separate CLs, as part of
the process of comparing Go 1.3 against tip in order to make
sure Go 1.4 runs at least as fast as Go 1.3.
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 3118336716 3452876110 +10.73%
BenchmarkFannkuch11 3184497677 3211552284 +0.85%
BenchmarkFmtFprintfEmpty 89.9 107 +19.02%
BenchmarkFmtFprintfString 236 287 +21.61%
BenchmarkFmtFprintfInt 246 278 +13.01%
BenchmarkFmtFprintfIntInt 395 458 +15.95%
BenchmarkFmtFprintfPrefixedInt 343 378 +10.20%
BenchmarkFmtFprintfFloat 477 525 +10.06%
BenchmarkFmtManyArgs 1446 1707 +18.05%
BenchmarkGobDecode 14398047 14685958 +2.00%
BenchmarkGobEncode 12557718 12947104 +3.10%
BenchmarkGzip 453462345 472413285 +4.18%
BenchmarkGunzip 114226016 115127398 +0.79%
BenchmarkHTTPClientServer 114689 112122 -2.24%
BenchmarkJSONEncode 24914536 26135942 +4.90%
BenchmarkJSONDecode 86832877 103620289 +19.33%
BenchmarkMandelbrot200 4833452 4898780 +1.35%
BenchmarkGoParse 4317976 4835474 +11.98%
BenchmarkRegexpMatchEasy0_32 150 166 +10.67%
BenchmarkRegexpMatchEasy0_1K 393 402 +2.29%
BenchmarkRegexpMatchEasy1_32 125 142 +13.60%
BenchmarkRegexpMatchEasy1_1K 1010 1236 +22.38%
BenchmarkRegexpMatchMedium_32 232 301 +29.74%
BenchmarkRegexpMatchMedium_1K 76963 102721 +33.47%
BenchmarkRegexpMatchHard_32 3833 5463 +42.53%
BenchmarkRegexpMatchHard_1K 119668 161614 +35.05%
BenchmarkRevcomp 763449047 706768534 -7.42%
BenchmarkTemplate 124954724 134834549 +7.91%
BenchmarkTimeParse 517 511 -1.16%
BenchmarkTimeFormat 501 514 +2.59%
benchmark old MB/s new MB/s speedup
BenchmarkGobDecode 53.31 52.26 0.98x
BenchmarkGobEncode 61.12 59.28 0.97x
BenchmarkGzip 42.79 41.08 0.96x
BenchmarkGunzip 169.88 168.55 0.99x
BenchmarkJSONEncode 77.89 74.25 0.95x
BenchmarkJSONDecode 22.35 18.73 0.84x
BenchmarkGoParse 13.41 11.98 0.89x
BenchmarkRegexpMatchEasy0_32 213.30 191.72 0.90x
BenchmarkRegexpMatchEasy0_1K 2603.92 2542.74 0.98x
BenchmarkRegexpMatchEasy1_32 254.00 224.93 0.89x
BenchmarkRegexpMatchEasy1_1K 1013.53 827.98 0.82x
BenchmarkRegexpMatchMedium_32 4.30 3.31 0.77x
BenchmarkRegexpMatchMedium_1K 13.30 9.97 0.75x
BenchmarkRegexpMatchHard_32 8.35 5.86 0.70x
BenchmarkRegexpMatchHard_1K 8.56 6.34 0.74x
BenchmarkRevcomp 332.92 359.62 1.08x
BenchmarkTemplate 15.53 14.39 0.93x
LGTM=rlh
R=rlh
CC=dvyukov, golang-codereviews, iant, khr, r
https://golang.org/cl/136380043
This CL adjusts code referring to src/pkg to refer to src.
Immediately after submitting this CL, I will submit
a change doing 'hg mv src/pkg/* src'.
That change will be too large to review with Rietveld
but will contain only the 'hg mv'.
This CL will break the build.
The followup 'hg mv' will fix it.
For more about the move, see golang.org/s/go14nopkg.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/134570043
Increase NOSPLIT reservation from 192 to 384 bytes.
The problem is that the non-Unix systems (Solaris and Windows)
just can't make system calls in a small amount of space,
and then worse they do things that are complex enough
to warrant calling runtime.throw on failure.
We don't have time to rewrite the code to use less stack.
I'm not happy about this, but it's still a small amount.
The good news is that we're doing this to get to only
using copying stacks for stack growth. Once that is true,
we can drop the default stack size from 8k to 4k, which
should more than make up for the bytes we're losing here.
LGTM=r
R=iant, r, bradfitz, aram.h
CC=golang-codereviews
https://golang.org/cl/140350043
The gp->panicwrap adjustment is just fatally flawed.
Now that there is a Panic.argp field, update that instead.
That can be done on entry only, so that unwinding doesn't
need to worry about undoing anything. The wrappers
emit a few more instructions in the prologue but everything
else in the system gets much simpler.
It also fixes (without trying) a broken test I never checked in.
Fixes #7491.
LGTM=khr
R=khr
CC=dvyukov, golang-codereviews, iant, r
https://golang.org/cl/135490044
newstackcall creates a new stack segment, and we want to
be able to throw away all that code.
LGTM=khr
R=khr, iant
CC=dvyukov, golang-codereviews, r
https://golang.org/cl/139270043
Created panic1.go just so diffs were available.
After this CL is in, I'd like to move panic.go -> defer.go
and panic1.go -> panic.go.
LGTM=rsc
R=rsc, khr
CC=golang-codereviews
https://golang.org/cl/133530045
In CL 131450043, which raised it to 160, I said
I'd raise it to 192 if necessary.
Apparently it is necessary on windows/amd64.
One note for those concerned about the growth:
in the old segmented stack world, we wasted this much
space at the bottom of every stack segment.
In the new contiguous stack world, each goroutine has
only one stack segment, so we only waste this much space
once per goroutine. So even raising the limit further might
still be a net savings.
Fixes windows/amd64 build.
TBR=r
CC=golang-codereviews
https://golang.org/cl/132480043
The Go calling convention uses more stack space than C.
On 64-bit systems we've been right up against the limit
(128 bytes, so only 16 words) and doing awful things to
our source code to work around it. Instead of continuing
to do awful things, raise the limit to 160 bytes.
I am prepared to raise the limit to 192 bytes if necessary,
but I think this will be enough.
Should fix current link-time stack overflow errors on
- nacl/arm
- netbsd/amd64
- openbsd/amd64
- solaris/amd64
- windows/amd64
TBR=r
CC=golang-codereviews, iant
https://golang.org/cl/131450043
Before, a slice with cap=0 or a string with len=0 might have its
base pointer pointing beyond the actual slice/string data into
the next block. The collector had to ignore slices and strings with
cap=0 in order to avoid misinterpreting the base pointer.
Now, a slice with cap=0 or a string with len=0 still has a base
pointer pointing into the actual slice/string data, no matter what.
The collector can now always scan the pointer, which means
strings and slices are no longer special.
Fixes #8404.
LGTM=khr, josharian
R=josharian, khr, dvyukov
CC=golang-codereviews
https://golang.org/cl/112570044
Normally, an expression of the form x.f or *y can be reordered
with function calls and communications.
Select is stricter than normal: each channel expression is evaluated
in source order. If you have case <-x.f and case <-foo(), then if the
evaluation of x.f causes a panic, foo must not have been called.
(This is in contrast to an expression like x.f + foo().)
Enforce this stricter ordering.
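A small program can observe the ordering. The names T, foo, called, and receiveInOrder are made up for this sketch; x is passed in as nil so that evaluating x.f panics.

```go
package main

import "fmt"

type T struct{ f chan int }

// called records whether foo ran.
var called bool

func foo() chan int {
	called = true
	return make(chan int)
}

// receiveInOrder runs a select whose first channel expression panics
// when x is nil. It reports whether a panic occurred.
func receiveInOrder(x *T) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	select {
	case <-x.f: // evaluated first; nil dereference panics here
	case <-foo(): // must not be evaluated once x.f has panicked
	}
	return false
}

func main() {
	panicked := receiveInOrder(nil)
	fmt.Println("panicked:", panicked, "foo called:", called)
}
```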
Fixes #8336.
LGTM=dvyukov
R=golang-codereviews, dvyukov
CC=golang-codereviews, r
https://golang.org/cl/126570043
We need to change the interface value representation for
concurrent garbage collection, so that there is no ambiguity
about whether the data word holds a pointer or scalar.
This CL does NOT make any representation changes.
Instead, it removes representation assumptions from
various pieces of code throughout the tree.
The isdirectiface function in cmd/gc/subr.c is now
the only place that decides that policy.
The policy propagates out from there in the reflect
metadata, as a new flag in the internal kind value.
A follow-up CL will change the representation by
changing the isdirectiface function. If that CL causes
problems, it will be easy to roll back.
Update #8405.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, r
https://golang.org/cl/129090043
Credit to Rémy for finding and writing the test case.
Fixes #8325.
LGTM=r
R=golang-codereviews, r
CC=dave, golang-codereviews, iant, remyoudompheng
https://golang.org/cl/124950043
This change introduces gomallocgc, a Go clone of mallocgc.
Only a few uses have been moved over, so there are still
lots of uses from C. Many of these C uses will be moved
over to Go (e.g. in slice.goc), but probably not all.
What should remain of C's mallocgc is an open question.
LGTM=rsc, dvyukov
R=rsc, khr, dave, bradfitz, dvyukov
CC=golang-codereviews
https://golang.org/cl/108840046
Following CL 68150047, the goos and goarch
variables are not currently set when the GOOS
and GOARCH environment variables are not set.
This caused the content of the build tag to be
ignored in this case.
This CL sets goos and goarch to runtime.GOOS
and runtime.GOARCH when the GOOS and GOARCH
environment variables are not set.
LGTM=aram, bradfitz
R=golang-codereviews, aram, gobot, rsc, dave, bradfitz
CC=golang-codereviews, rsc
https://golang.org/cl/112490043
I'm improving gccgo's detection of variables that are only set
but not used, and it triggers additional errors on this code.
The new gccgo errors are correct; gc seems to suppress them
due to the other, expected, errors. This change uses the
variables so that no compiler will complain.
gccgo change is https://golang.org/cl/119920043 .
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/116050043
benchmark                      old ns/op     new ns/op     delta
BenchmarkSelectUncontended     220           165           -25.00%
BenchmarkSelectContended       209           161           -22.97%
BenchmarkSelectProdCons        1042          904           -13.24%
But more importantly, this change will allow us
to get rid of the free function in the runtime.
Fixes #6494.
LGTM=rsc, khr
R=golang-codereviews, rsc, dominik.honnef, khr
CC=golang-codereviews, remyoudompheng
https://golang.org/cl/107670043
There is no reason to generate different code for cap and len.
Fixes #8025.
Fixes #8026.
LGTM=rsc
R=rsc, iant, khr
CC=golang-codereviews
https://golang.org/cl/93570044
Fixes #8074.
The issue was not reproducible at revision
go version devel +e0ad7e329637 Thu Jun 19 22:19:56 2014 -0700 linux/arm
But include the original test case in case the issue reopens itself.
LGTM=dvyukov
R=golang-codereviews, dvyukov
CC=golang-codereviews
https://golang.org/cl/107290043
No functional changes.
Generating shorter functions improves compilation time. On my laptop, this test's running time goes from 5.5s to 1.5s; the wall clock time to run all tests goes down 1s. On Raspberry Pi, this CL cuts 50s off the wall clock time to run all tests.
Fixes #7503.
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/72590045
There is a hierarchy of location defined by loop depth:
-1 = the heap
0 = function results
1 = local variables (and parameters)
2 = local variable declared inside a loop
3 = local variable declared inside a loop inside a loop
etc
In general if an address from loopdepth n is assigned to
something in loop depth m < n, that indicates an extended
lifetime of some form that requires a heap allocation.
Function results can be local variables too, though, and so
they don't actually fit into the hierarchy very well.
Treat the address of a function result as level 1 so that
if it is written back into a result, the address is treated
as escaping.
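The situation is the one in the sketch below, where the address of one result is written back into another result. addrOfResult is a made-up example, not the test case from the issue; it only shows the behavior a correct escape analysis has to preserve.

```go
package main

import "fmt"

// addrOfResult stores the address of its first result in its second.
// Because &x is written back into a result, x must be treated as
// escaping and cannot live only in the callee's frame.
//
//go:noinline
func addrOfResult() (x int, p *int) {
	x = 42
	p = &x
	return
}

func main() {
	x, p := addrOfResult()
	fmt.Println(x, *p)
	*p = 7 // p must still point at valid storage after the call returns
	fmt.Println(x, *p)
}
```

The caller's x is a copy of the result value, so writing through p afterwards changes only the storage p points at.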
Fixes #8185.
LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/108870044
The analysis for &x was using the loop depth on x set
during x's declaration. A type switch creates a list of
implicit declarations that were not getting initialized
with loop depths.
Fixes #8176.
LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/108860043
A runtime.Goexit during a panic-invoked deferred call
left the panic stack intact even though all the stack frames
are gone when the goroutine is torn down.
The next goroutine to reuse that struct will have a
bogus panic stack and can cause the traceback routines
to walk into garbage.
Most likely to happen during tests, because t.Fatal might
be called during a deferred func and uses runtime.Goexit.
This "not enough cleared in Goexit" failure mode has
happened to us multiple times now. Clear all the pointers
that don't make sense to keep, not just gp->panic.
Fixes #8158.
LGTM=iant, dvyukov
R=iant, dvyukov
CC=golang-codereviews
https://golang.org/cl/102220043
I am not sure what the rounding here was
trying to do, but it was skipping the first
pointer on native client.
The code above the rounding already checks
that xoffset is widthptr-aligned, so the rnd
was a no-op everywhere but on Native Client.
And on Native Client it was wrong.
Perhaps it was supposed to be rounding down,
not up, but zerorange handles the extra 32 bits
correctly, so the rnd does not seem to be necessary
at all.
This wouldn't be worth doing for Go 1.3 except
that it can affect code on the playground.
Fixes #8155.
LGTM=r, iant
R=golang-codereviews, r, iant
CC=dvyukov, golang-codereviews, khr
https://golang.org/cl/108740047
I introduced this bug when I changed the escape
analysis to run in phases based on call graph
dependency order, in order to be more precise about
inputs escaping back to outputs (functions returning
their arguments).
Given
	func f(z **int) *int { return *z }
we were tagging the function as 'z does not escape
and is not returned', which is all true, but not
enough information.
If used as:
	var x int
	p := &x
	q := &p
	leak(f(q))
then the compiler might try to keep x, p, and q all
on the stack, since (according to the recorded
information) nothing interesting ends up being
passed to leak.
In fact since f returns *q = p, &x is passed to leak
and x needs to be heap allocated.
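The fragments above can be assembled into a complete program. The sink variable and the body of leak are assumptions added so that the example runs; the go:noinline directives keep f and leak from being inlined, since inlining would sidestep the case being described.

```go
package main

import "fmt"

// sink keeps the leaked pointer alive beyond the call (assumed here).
var sink *int

//go:noinline
func leak(p *int) { sink = p }

// f returns an indirect of its argument: the content of z escapes to
// the result even though z itself does not.
//
//go:noinline
func f(z **int) *int { return *z }

func main() {
	var x int
	p := &x
	q := &p
	leak(f(q)) // f returns *q = p, so &x reaches leak
	*sink = 7  // writes x through the leaked pointer
	fmt.Println(x)
}
```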
To trigger the bug, you need a chain that the
compiler wants to keep on the stack (like x, p, q
above), and you need a function that returns an
indirect of its argument, and you need to pass the
head of the chain to that function. This doesn't
come up very often: this bug has been present since
June 2012 (between Go 1 and Go 1.1) and we haven't
seen it until now. It helps that most functions that
return indirects are getters that are simple enough
to be inlined, avoiding the bug.
Earlier versions of Go also had the benefit that if
&x really wasn't used beyond x's lifetime, nothing
broke if you put &x in a heap-allocated structure
accidentally. With the new stack copying, though,
heap-allocated structures containing &x are not
updated when the stack is copied and x moves,
leading to crashes in Go 1.3 that were not crashes
in Go 1.2 or Go 1.1.
The fix is in two parts.
First, in the analysis of a function, recognize when
a value obtained via indirect of a parameter ends up
being returned. Mark those parameters as having
content escape back to the return results (but we
don't bother to write down which result).
Second, when using the analysis to analyze, say,
f(q), mark parameters with content escaping as
having any indirections escape to the heap. (We
don't bother trying to match the content to the
return value.)
The fix could be less precise (simpler).
In the first part we might mark all content-escaping
parameters as plain escaping, and then the second
part could be dropped. Or we might assume that when
calling f(q) all the things pointed at by q escape
always (for any f and q).
The fix could also be more precise (more complex).
We might record the specific mapping from parameter
to result along with the number of indirects from the
parameter to the thing being returned as the result,
and then at the call sites we could set up exactly the
right graph for the called function. That would make
notleaks(f(q)) be able to keep x on the stack, because
the result of f(q) isn't passed to anything that leaks it.
The less precise the fix, the more stack allocations
become heap allocations.
This fix is exactly as precise as it needs to be so that
none of the current stack allocations in the standard
library turn into heap allocations.
Fixes #8120.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, khr, r
https://golang.org/cl/102040046
The 'address taken' bit in a function variable was not
propagating into the inlined copies, causing incorrect
liveness information.
LGTM=dsymonds, bradfitz
R=golang-codereviews, bradfitz
CC=dsymonds, golang-codereviews, iant, khr, r
https://golang.org/cl/96670046
The 1-byte write was silently clearing a byte on the stack.
If there was another function call with more arguments
in the same stack frame, no harm done.
Otherwise, if the variable at that location was already zero,
no harm done.
Otherwise, problems.
Fixes #8139.
LGTM=dsymonds
R=golang-codereviews, dsymonds
CC=golang-codereviews, iant, r
https://golang.org/cl/100940043
We were requiring that the defer stack and the panic stack
be completely processed, thinking that if any were left over
the stack scan and the defer stack/panic stack must be out
of sync. It turns out that the panic stack may well have
leftover entries in some situations, and that's okay.
Fixes #8132.
LGTM=minux, r
R=golang-codereviews, minux, r
CC=golang-codereviews, iant, khr
https://golang.org/cl/100900044
The 'continuation pc' is where the frame will continue
execution, if anywhere. For a frame that stopped execution
due to a CALL instruction, the continuation pc is immediately
after the CALL. But for a frame that stopped execution due to
a fault, the continuation pc is the pc after the most recent CALL
to deferproc in that frame, or else 0. That is where execution
will continue, if anywhere.
The liveness information is only recorded for CALL instructions.
This change makes sure that we never look for liveness information
except for CALL instructions.
Using a valid PC fixes crashes when a garbage collection or
stack copying tries to process a stack frame that has faulted.
Record continuation pc in heapdump (format change).
Fixes #8048.
LGTM=iant, khr
R=khr, iant, dvyukov
CC=golang-codereviews, r
https://golang.org/cl/100870044
This CL forces the optimizer to preserve some memory stores
that would be redundant except that a stack scan due to garbage
collection or stack copying might look at them during a function call.
As such, it forces additional memory writes and therefore slows
down the execution of some programs, especially garbage-heavy
programs that are already limited by memory bandwidth.
The slowdown can be as much as 7% for end-to-end benchmarks.
These numbers are from running go1.test -test.benchtime=5s three times,
taking the best (lowest) ns/op for each benchmark. I am excluding
benchmarks with time/op < 10us to focus on macro effects.
All benchmarks are on amd64.
Comparing tip (a27f34c771cb) against this CL on an Intel Core i5 MacBook Pro:
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 3876500413 3856337341 -0.52%
BenchmarkFannkuch11 2965104777 2991182127 +0.88%
BenchmarkGobDecode 8563026 8788340 +2.63%
BenchmarkGobEncode 5050608 5267394 +4.29%
BenchmarkGzip 431191816 434168065 +0.69%
BenchmarkGunzip 107873523 110563792 +2.49%
BenchmarkHTTPClientServer 85036 86131 +1.29%
BenchmarkJSONEncode 22143764 22501647 +1.62%
BenchmarkJSONDecode 79646916 85658808 +7.55%
BenchmarkMandelbrot200 4720421 4700108 -0.43%
BenchmarkGoParse 4651575 4712247 +1.30%
BenchmarkRegexpMatchMedium_1K 71986 73490 +2.09%
BenchmarkRegexpMatchHard_1K 111018 117495 +5.83%
BenchmarkRevcomp 648798723 659352759 +1.63%
BenchmarkTemplate 112673009 112819078 +0.13%
Comparing tip (a27f34c771cb) against this CL on an Intel Xeon E5520:
BenchmarkBinaryTree17 5461110720 5393104469 -1.25%
BenchmarkFannkuch11 4314677151 4327177615 +0.29%
BenchmarkGobDecode 11065853 11235272 +1.53%
BenchmarkGobEncode 6500065 6959837 +7.07%
BenchmarkGzip 647478596 671769097 +3.75%
BenchmarkGunzip 139348579 141096376 +1.25%
BenchmarkHTTPClientServer 69376 73610 +6.10%
BenchmarkJSONEncode 30172320 31796106 +5.38%
BenchmarkJSONDecode 113704905 114239137 +0.47%
BenchmarkMandelbrot200 6032730 6003077 -0.49%
BenchmarkGoParse 6775251 6405995 -5.45%
BenchmarkRegexpMatchMedium_1K 111832 113895 +1.84%
BenchmarkRegexpMatchHard_1K 161112 168420 +4.54%
BenchmarkRevcomp 876363406 892319935 +1.82%
BenchmarkTemplate 146273096 148998339 +1.86%
Just to get a sense of where we are compared to the previous release,
here are the same benchmarks comparing Go 1.2 to this CL.
Comparing Go 1.2 against this CL on an Intel Core i5 MacBook Pro:
BenchmarkBinaryTree17 4370077662 3856337341 -11.76%
BenchmarkFannkuch11 3347052657 2991182127 -10.63%
BenchmarkGobDecode 8791384 8788340 -0.03%
BenchmarkGobEncode 4968759 5267394 +6.01%
BenchmarkGzip 437815669 434168065 -0.83%
BenchmarkGunzip 94604099 110563792 +16.87%
BenchmarkHTTPClientServer 87798 86131 -1.90%
BenchmarkJSONEncode 22818243 22501647 -1.39%
BenchmarkJSONDecode 97182444 85658808 -11.86%
BenchmarkMandelbrot200 4733516 4700108 -0.71%
BenchmarkGoParse 5054384 4712247 -6.77%
BenchmarkRegexpMatchMedium_1K 67612 73490 +8.69%
BenchmarkRegexpMatchHard_1K 107321 117495 +9.48%
BenchmarkRevcomp 733270055 659352759 -10.08%
BenchmarkTemplate 109304977 112819078 +3.21%
Comparing Go 1.2 against this CL on an Intel Xeon E5520:
BenchmarkBinaryTree17 5986953594 5393104469 -9.92%
BenchmarkFannkuch11 4861139174 4327177615 -10.98%
BenchmarkGobDecode 11830997 11235272 -5.04%
BenchmarkGobEncode 6608722 6959837 +5.31%
BenchmarkGzip 661875826 671769097 +1.49%
BenchmarkGunzip 138630019 141096376 +1.78%
BenchmarkHTTPClientServer 71534 73610 +2.90%
BenchmarkJSONEncode 30393609 31796106 +4.61%
BenchmarkJSONDecode 139645860 114239137 -18.19%
BenchmarkMandelbrot200 5988660 6003077 +0.24%
BenchmarkGoParse 6974092 6405995 -8.15%
BenchmarkRegexpMatchMedium_1K 111331 113895 +2.30%
BenchmarkRegexpMatchHard_1K 165961 168420 +1.48%
BenchmarkRevcomp 995049292 892319935 -10.32%
BenchmarkTemplate 145623363 148998339 +2.32%
Fixes #8036.
LGTM=khr
R=golang-codereviews, josharian, khr
CC=golang-codereviews, iant, r
https://golang.org/cl/99660044
[Same as CL 102820043 except applied changes to 6g/gsubr.c
also to 5g/gsubr.c and 8g/gsubr.c. The problem I had last night
trying to do that was that 8g's copy of nodarg has different
(but equivalent) control flow and I was pasting the new code
into the wrong place.]
Description from CL 102820043:
The 'nodarg' function is used to obtain a Node*
representing a function argument or result.
It returned a brand new Node*, but that violates
the guarantee in most places in the compiler that
two Node*s refer to the same variable if and only if
they are the same Node* pointer. Reestablish that
invariant by making nodarg return a preexisting
named variable if present.
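The invariant can be illustrated with a lookup that hands back the existing node for a name. This is a loose sketch in Go, not the compiler's C code: Node, scope, and nodeFor are hypothetical, and registering newly created nodes in the map is an assumption of this sketch.

```go
package main

import "fmt"

// Node is a hypothetical, much simplified compiler node for a variable.
type Node struct{ Name string }

// scope maps each name to its single canonical Node.
type scope struct{ vars map[string]*Node }

// nodeFor returns the preexisting Node for name if there is one, so two
// lookups of the same variable yield the same pointer. Otherwise it
// creates and records a new Node.
func (s *scope) nodeFor(name string) *Node {
	if n, ok := s.vars[name]; ok {
		return n
	}
	n := &Node{Name: name}
	if s.vars == nil {
		s.vars = make(map[string]*Node)
	}
	s.vars[name] = n
	return n
}

func main() {
	var s scope
	a := s.nodeFor("x")
	b := s.nodeFor("x")
	c := s.nodeFor("y")
	fmt.Println(a == b, a == c)
}
```

With pointer identity restored, code elsewhere can compare nodes with == to decide whether two references name the same variable.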
Having fixed that, avoid any copy during x=x in
componentgen, because the VARDEF we emit
before the copy marks the lhs x as dead incorrectly.
The change in walk.c avoids modifying the result
of nodarg. This was the only place in the compiler
that did so.
Fixes #8097.
LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, iant, khr, r
https://golang.org/cl/103750043