mstats.last_gc is unix time now, it is compared with abstract monotonic time.
On my machine GC is forced every 5 mins regardless of last_gc.
LGTM=rsc
R=golang-codereviews
CC=golang-codereviews, iant, rsc
https://golang.org/cl/91350045
The monotonic clock patch changed all runtime times
to abstract monotonic time. As the result user-visible
MemStats.LastGC become monotonic time as well.
Restore Unix time for LastGC.
This is the simplest way to expose time.now to runtime that I found.
Another option would be to change time.now to C called
int64 runtime.unixnanotime() and then express time.now in terms of it.
But this would require to introduce 2 64-bit divisions into time.now.
Another option would be to change time.now to C called
void runtime.unixnanotime1(struct {int64 sec, int32 nsec} *now)
and then express both time.now and runtime.unixnanotime in terms of it.
Fixes#7852.
LGTM=minux.ma, iant
R=minux.ma, rsc, iant
CC=golang-codereviews
https://golang.org/cl/93720045
Having the pointers means you can grub around in the
binary finding out more about them.
This helped with issue 7748.
LGTM=minux.ma, bradfitz
R=golang-codereviews, minux.ma, bradfitz
CC=golang-codereviews
https://golang.org/cl/88090045
Do not consider idle finalizer/bgsweep/timer goroutines as doing something useful.
We can't simply set isbackground for the whole lifetime of the goroutines,
because when finalizer goroutine calls user function, we do want to consider it
as doing something useful.
This is borken due to timers for quite some time.
With background sweep is become even more broken.
Fixes#7784.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/87960044
Currently Pool can cache up to 15 elements per P, and these elements are not accesible to other Ps.
If a Pool caches large objects, say 2MB, and GOMAXPROCS is set to a large value, say 32,
then the Pool can waste up to 960MB.
The new caching policy caches at most 1 per-P element, the rest is shared between Ps.
Get/Put performance is unchanged. Nested Get/Put performance is 57% worse.
However, overall scalability of nested Get/Put is significantly improved,
so the new policy starts winning under contention.
benchmark old ns/op new ns/op delta
BenchmarkPool 27.4 26.7 -2.55%
BenchmarkPool-4 6.63 6.59 -0.60%
BenchmarkPool-16 1.98 1.87 -5.56%
BenchmarkPool-64 1.93 1.86 -3.63%
BenchmarkPoolOverlflow 3970 6235 +57.05%
BenchmarkPoolOverlflow-4 10935 1668 -84.75%
BenchmarkPoolOverlflow-16 13419 520 -96.12%
BenchmarkPoolOverlflow-64 10295 380 -96.31%
LGTM=rsc
R=rsc
CC=golang-codereviews, khr
https://golang.org/cl/86020043
Cuts the number of calls from 6 to 2 in the non-debug case.
LGTM=iant
R=golang-codereviews, iant
CC=0intro, aram, golang-codereviews, khr
https://golang.org/cl/86040043
Given
type Outer struct {
*Inner
...
}
the compiler generates the implementation of (*Outer).M dispatching to
the embedded Inner. The implementation is logically:
func (p *Outer) M() {
(p.Inner).M()
}
but since the only change here is the replacement of one pointer
receiver with another, the actual generated code overwrites the
original receiver with the p.Inner pointer and then jumps to the M
method expecting the *Inner receiver.
During reflect.Value.Call, we create an argument frame and the
associated data structures to describe it to the garbage collector,
populate the frame, call reflect.call to run a function call using
that frame, and then copy the results back out of the frame. The
reflect.call function does a memmove of the frame structure onto the
stack (to set up the inputs), runs the call, and the memmoves the
stack back to the frame structure (to preserve the outputs).
Originally reflect.call did not distinguish inputs from outputs: both
memmoves were for the full stack frame. However, in the case where the
called function was one of these wrappers, the rewritten receiver is
almost certainly a different type than the original receiver. This is
not a problem on the stack, where we use the program counter to
determine the type information and understand that during (*Outer).M
the receiver is an *Outer while during (*Inner).M the receiver in the
same memory word is now an *Inner. But in the statically typed
argument frame created by reflect, the receiver is always an *Outer.
Copying the modified receiver pointer off the stack into the frame
will store an *Inner there, and then if a garbage collection happens
to scan that argument frame before it is discarded, it will scan the
*Inner memory as if it were an *Outer. If the two have different
memory layouts, the collection will intepret the memory incorrectly.
Fix by only copying back the results.
Fixes#7725.
LGTM=khr
R=khr
CC=dave, golang-codereviews
https://golang.org/cl/85180043
Trying to make GODEBUG=gcdead=1 work with liveness
and in particular ambiguously live variables.
1. In the liveness computation, mark all ambiguously live
variables as live for the entire function, except the entry.
They are zeroed directly after entry, and we need them not
to be poisoned thereafter.
2. In the liveness computation, compute liveness (and deadness)
for all parameters, not just pointer-containing parameters.
Otherwise gcdead poisons untracked scalar parameters and results.
3. Fix liveness debugging print for -live=2 to use correct bitmaps.
(Was not updated for compaction during compaction CL.)
4. Correct varkill during map literal initialization.
Was killing the map itself instead of the inserted value temp.
5. Disable aggressive varkill cleanup for call arguments if
the call appears in a defer or go statement.
6. In the garbage collector, avoid bug scanning empty
strings. An empty string is two zeros. The multiword
code only looked at the first zero and then interpreted
the next two bits in the bitmap as an ordinary word bitmap.
For a string the bits are 11 00, so if a live string was zero
length with a 0 base pointer, the poisoning code treated
the length as an ordinary word with code 00, meaning it
needed poisoning, turning the string into a poison-length
string with base pointer 0. By the same logic I believe that
a live nil slice (bits 11 01 00) will have its cap poisoned.
Always scan full multiword struct.
7. In the runtime, treat both poison words (PoisonGC and
PoisonStack) as invalid pointers that warrant crashes.
Manual testing as follows:
- Create a script called gcdead on your PATH containing:
#!/bin/bash
GODEBUG=gcdead=1 GOGC=10 GOTRACEBACK=2 exec "$@"
- Now you can build a test and then run 'gcdead ./foo.test'.
- More importantly, you can run 'go test -short -exec gcdead std'
to run all the tests.
Fixes#7676.
While here, enable the precise scanning of slices, since that was
disabled due to bugs like these. That now works, both with and
without gcdead.
Fixes#7549.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83410044
Reduce footprint of liveness bitmaps by about 5x.
1. Mark all liveness bitmap symbols as 4-byte aligned
(they were aligned to a larger size by default).
2. The bitmap data is a bitmap count n followed by n bitmaps.
Each bitmap begins with its own count m giving the number
of bits. All the m's are the same for the n bitmaps.
Emit this bitmap length once instead of n times.
3. Many bitmaps within a function have the same bit values,
but each call site was given a distinct bitmap. Merge duplicate
bitmaps so that no bitmap is written more than once.
4. Many functions end up with the same aggregate bitmap data.
We used to name the bitmap data funcname.gcargs and funcname.gclocals.
Instead, name it gclocals.<md5 of data> and mark it dupok so
that the linker coalesces duplicate sets. This cut the bitmap
data remaining after step 3 by 40%; I was not expecting it to
be quite so dramatic.
Applied to "go build -ldflags -w code.google.com/p/go.tools/cmd/godoc":
bitmaps pclntab binary on disk
before this CL 1326600 1985854 12738268
4-byte align 1154288 (0.87x) 1985854 (1.00x) 12566236 (0.99x)
one bitmap len 782528 (0.54x) 1985854 (1.00x) 12193500 (0.96x)
dedup bitmap 414748 (0.31x) 1948478 (0.98x) 11787996 (0.93x)
dedup bitmap set 245580 (0.19x) 1948478 (0.98x) 11620060 (0.91x)
While here, remove various dead blocks of code from plive.c.
Fixes#6929.
Fixes#7568.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83630044
GODEBUG=allocfreetrace=1:
The allocfreetrace=1 mode prints a stack trace for each block
allocated and freed, and also a stack trace for each garbage collection.
It was implemented by reusing the heap profiling support: if allocfreetrace=1
then the heap profile was effectively running at 1 sample per 1 byte allocated
(always sample). The stack being shown at allocation was the stack gathered
for profiling, meaning it was derived only from the program counters and
did not include information about function arguments or frame pointers.
The stack being shown at free was the allocation stack, not the free stack.
If you are generating this log, you can find the allocation stack yourself, but
it can be useful to see exactly the sequence that led to freeing the block:
was it the garbage collector or an explicit free? Now that the garbage collector
runs on an m0 stack, the stack trace for the garbage collector was never interesting.
Fix all these problems:
1. Decouple allocfreetrace=1 from heap profiling.
2. Print the standard goroutine stack traces instead of a custom format.
3. Print the stack trace at time of allocation for an allocation,
and print the stack trace at time of free (not the allocation trace again)
for a free.
4. Print all goroutine stacks at garbage collection. Having all the stacks
means that you can see the exact point at which each goroutine was
preempted, which is often useful for identifying liveness-related errors.
GODEBUG=gcdead=1:
This mode overwrites dead pointers with a poison value.
Detect the poison value as an invalid pointer during collection,
the same way that small integers are invalid pointers.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/81670043
This is the same check we use during stack copying.
The check cannot be applied to C stack frames, even
though we do emit pointer bitmaps for the arguments,
because (1) the pointer bitmaps assume all arguments
are always live, not true of outputs during the prologue,
and (2) the pointer bitmaps encode interface values as
pointer pairs, not true of interfaces holding integers.
For the rest of the frames, however, we should hold ourselves
to the rule that a pointer marked live really is initialized.
The interface scanning already implicitly checks this
because it interprets the type word as a valid type pointer.
This may slow things down a little because of the extra loads.
Or it may speed things up because we don't bother enqueuing
nil pointers anymore. Enough of the rest of the system is slow
right now that we can't measure it meaningfully.
Enable for now, even if it is slow, to shake out bugs in the
liveness bitmaps, and then decide whether to turn it off
for the Go 1.3 release (issue 7650 reminds us to do this).
The new m->traceback field lets us force printing of fp=
values on all goroutine stack traces when we detect a
bad pointer. This makes it easier to understand exactly
where in the frame the bad pointer is, so that we can trace
it back to a specific variable and determine what is wrong.
Update #7650
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/80860044
Currently it's possible that bgsweep finishes before all spans
have been swept (we only know that sweeping of all spans has *started*).
In such case bgsweep may fail wake up runfinq goroutine when it needs to.
finq may still be nil at this point, but some finalizers may be queued later.
Make bgsweep to wait for sweeping to *complete*, then it can decide
whether it needs to wake up runfinq for sure.
Update #7533
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/75960043
If we set obj, then it will be enqueued for marking at the end of the scanning loop.
This is not necessary, since we've already marked it.
This can wait for 1.4 if you wish.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/80030043
Change two-bit stack map entries to encode:
0 = dead
1 = scalar
2 = pointer
3 = multiword
If multiword, the two-bit entry for the following word encodes:
0 = string
1 = slice
2 = iface
3 = eface
That way, during stack scanning we can check if a string
is zero length or a slice has zero capacity. We can avoid
following the contained pointer in those cases. It is safe
to do so because it can never be dereferenced, and it is
desirable to do so because it may cause false retention
of the following block in memory.
Slice feature turned off until issue 7564 is fixed.
Update #7549
LGTM=rsc
R=golang-codereviews, bradfitz, rsc
CC=golang-codereviews
https://golang.org/cl/76380043
The existing code did not have a clear notion of whether
memory has been actually reserved. It checked based on
whether in 32-bit mode or 64-bit mode and (on GNU/Linux) the
requested address, but it confused the requested address and
the returned address.
LGTM=rsc
R=rsc, dvyukov
CC=golang-codereviews, michael.hudson
https://golang.org/cl/79610043
The nproc and ndone fields are uint32. This makes the type
consistent.
LGTM=minux.ma
R=golang-codereviews, minux.ma
CC=golang-codereviews
https://golang.org/cl/79340044
It's possible that bgsweep constantly does not catch up for some reason,
in this case runfinq was not woken at all.
R=rsc
CC=golang-codereviews
https://golang.org/cl/75940043
The problem was that spans end up in wrong lists after split
(e.g. in h->busy instead of h->central->empty).
Also the span can be non-swept before split,
I don't know what it can cause, but it's safer to operate on swept spans.
Fixes#7544.
R=golang-codereviews, rsc
CC=golang-codereviews, khr
https://golang.org/cl/76160043
The garbage collector uses type information to guide the
traversal of the heap. If it sees a field that should be a string,
it marks the object pointed at by the string data pointer as
visited but does not bother to look at the data, because
strings contain bytes, not pointers.
If you save s[len(s):] somewhere, though, the string data pointer
actually points just beyond the string data; if the string data
were exactly the size of an allocated block, the string data
pointer would actually point at the next block. It is incorrect
to mark that next block as visited and not bother to look at
the data, because the next block may be some other type
entirely.
The fix is to ignore strings with zero length during collection:
they are empty and can never become non-empty: the base
pointer will never be used again. The handling of slices already
does this (but using cap instead of len).
This was not a bug in Go 1.2, because until January all string
allocations included a trailing NUL byte not included in the
length, so s[len(s):] still pointed inside the string allocation
(at the NUL).
This bug was causing the crashes in test/run.go. Specifically,
the parsing of a regexp in package regexp/syntax allocated a
[]syntax.Inst with rounded size 1152 bytes. In fact it
allocated many such slices, because during the processing of
test/index2.go it creates thousands of regexps that are all
approximately the same complexity. That takes a long time, and
test/run works on other tests in other goroutines. One such
other test is chan/perm.go, which uses an 1152-byte source
file. test/run reads that file into a []byte and then calls
strings.Split(string(src), "\n"). The string(src) creates an
1152-byte string - and there's a very good chance of it
landing next to one of the many many regexp slices already
allocated - and then because the file ends in a \n,
strings.Split records the tail empty string as the final
element in the slice. A garbage collection happens at this
point, the collection finds that string before encountering
the []syntax.Inst data it now inadvertently points to, and the
[]syntax.Inst data is not scanned for the pointers that it
contains. Each syntax.Inst contains a []rune, those are
missed, and the backing rune arrays are freed for reuse. When
the regexp is later executed, the runes being searched for are
no longer runes at all, and there is no match, even on text
that should match.
On 64-bit machines the pointer in the []rune inside the
syntax.Inst is larger (along with a few other pointers),
pushing the []syntax.Inst backing array into a larger size
class, avoiding the collision with chan/perm.go's
inadvertently sized file.
I expect this was more prevalent on OS X than on Linux or
Windows because those managed to run faster or slower and
didn't overlap index2.go with chan/perm.go as often. On the
ARM systems, we only run one errorcheck test at a time, so
index2 and chan/perm would never overlap.
It is possible that this bug is the root cause of other crashes
as well. For now we only know it is the cause of the test/run crash.
Many thanks to Dmitriy for help debugging.
Fixes#7344.
Fixes#7455.
LGTM=r, dvyukov, dave, iant
R=golang-codereviews, dave, r, dvyukov, delpontej, iant
CC=golang-codereviews, khr
https://golang.org/cl/74250043
Spans are now private to threads, and the loop
is removed from all other functions.
Remove it from marknogc for consistency.
LGTM=khr, rsc
R=golang-codereviews, bradfitz, khr
CC=golang-codereviews, khr, rsc
https://golang.org/cl/72520043
One reason the sync.Pool finalizer test can fail is that
this function's ef1 contains uninitialized data that just
happens to point at some of the old pool. I've seen this cause
retention of a single pool cache line (32 elements) on arm.
Really we need liveness information for C functions, but
for now we can be more careful about data in long-lived
C functions that block.
LGTM=bradfitz, dvyukov
R=golang-codereviews, bradfitz, dvyukov
CC=golang-codereviews, iant, khr
https://golang.org/cl/72490043
Two memory allocator bug fixes.
- efence is not maintaining the proper heap metadata
to make eventual memory reuse safe, so use SysFault.
- now that our heap PageSize is 8k but most hardware
uses 4k pages, SysAlloc and SysReserve results must be
explicitly aligned. Do that in a few more call sites and
document this fact in malloc.h.
Fixes#7448.
LGTM=iant
R=golang-codereviews, josharian, iant
CC=dvyukov, golang-codereviews
https://golang.org/cl/71750048
Before GC, we flush all the per-P allocation caches. Doing
stack shrinking mid-GC causes these caches to fill up. At the
end of gc, the sweepgen is incremented which causes all of the
data in these caches to be in a bad state (cached but not yet
swept).
Move the stack shrinking until after sweepgen is incremented,
so any caching that happens as part of shrinking is done with
already-swept data.
Reenable stack copying.
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/69620043
On stack overflow, if all frames on the stack are
copyable, we copy the frames to a new stack twice
as large as the old one. During GC, if a G is using
less than 1/4 of its stack, copy the stack to a stack
half its size.
TODO
- Do something about C frames. When a C frame is in the
stack segment, it isn't copyable. We allocate a new segment
in this case.
- For idempotent C code, we can abort it, copy the stack,
then retry. I'm working on a separate CL for this.
- For other C code, we can raise the stackguard
to the lowest Go frame so the next call that Go frame
makes triggers a copy, which will then succeed.
- Pick a starting stack size?
The plan is that eventually we reach a point where the
stack contains only copyable frames.
LGTM=rsc
R=dvyukov, rsc
CC=golang-codereviews
https://golang.org/cl/54650044
MCaches now hold a MSpan for each sizeclass which they have
exclusive access to allocate from, so no lock is needed.
Modifying the heap bitmaps also no longer requires a cas.
runtime.free gets more expensive. But we don't use it
much any more.
It's not much faster on 1 processor, but it's a lot
faster on multiple processors.
benchmark old ns/op new ns/op delta
BenchmarkSetTypeNoPtr1 24 23 -0.42%
BenchmarkSetTypeNoPtr2 33 34 +0.89%
BenchmarkSetTypePtr1 51 49 -3.72%
BenchmarkSetTypePtr2 55 54 -1.98%
benchmark old ns/op new ns/op delta
BenchmarkAllocation 52739 50770 -3.73%
BenchmarkAllocation-2 33957 34141 +0.54%
BenchmarkAllocation-3 33326 29015 -12.94%
BenchmarkAllocation-4 38105 25795 -32.31%
BenchmarkAllocation-5 68055 24409 -64.13%
BenchmarkAllocation-6 71544 23488 -67.17%
BenchmarkAllocation-7 68374 23041 -66.30%
BenchmarkAllocation-8 70117 20758 -70.40%
LGTM=rsc, dvyukov
R=dvyukov, bradfitz, khr, rsc
CC=golang-codereviews
https://golang.org/cl/46810043
See golang.org/s/go13nacl for design overview.
This CL is the mostly mechanical changes from rsc's Go 1.2 based NaCl branch, specifically 39cb35750369 to 500771b477cf from https://code.google.com/r/rsc-go13nacl. This CL does not include working NaCl support, there are probably two or three more large merges to come.
CL 15750044 is not included as it involves more invasive changes to the linker which will need to be merged separately.
The exact change lists included are
15050047: syscall: support for Native Client
15360044: syscall: unzip implementation for Native Client
15370044: syscall: Native Client SRPC implementation
15400047: cmd/dist, cmd/go, go/build, test: support for Native Client
15410048: runtime: support for Native Client
15410049: syscall: file descriptor table for Native Client
15410050: syscall: in-memory file system for Native Client
15440048: all: update +build lines for Native Client port
15540045: cmd/6g, cmd/8g, cmd/gc: support for Native Client
15570045: os: support for Native Client
15680044: crypto/..., hash/crc32, reflect, sync/atomic: support for amd64p32
15690044: net: support for Native Client
15690048: runtime: support for fake time like on Go Playground
15690051: build: disable various tests on Native Client
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/68150047
Reinforce the guarantee that MSpan_EnsureSwept actually ensures that the span is swept.
I have not observed crashes related to this, but I do not see why it can't crash as well.
LGTM=rsc
R=golang-codereviews
CC=golang-codereviews, khr, rsc
https://golang.org/cl/67990043
runfinqv is already defined the same way on line 271.
There may also be something to fix in compiler/linker wrt diagnostics.
Fixes#7375.
LGTM=bradfitz
R=golang-codereviews, dave, bradfitz
CC=golang-codereviews
https://golang.org/cl/67850044
Package runtime's C functions written to be called from Go
started out written in C using carefully constructed argument
lists and the FLUSH macro to write a result back to memory.
For some functions, the appropriate parameter list ended up
being architecture-dependent due to differences in alignment,
so we added 'goc2c', which takes a .goc file containing Go func
declarations but C bodies, rewrites the Go func declaration to
equivalent C declarations for the target architecture, adds the
needed FLUSH statements, and writes out an equivalent C file.
That C file is compiled as part of package runtime.
Native Client's x86-64 support introduces the most complex
alignment rules yet, breaking many functions that could until
now be portably written in C. Using goc2c for those avoids the
breakage.
Separately, Keith's work on emitting stack information from
the C compiler would require the hand-written functions
to add #pragmas specifying how many arguments are result
parameters. Using goc2c for those avoids maintaining #pragmas.
For both reasons, use goc2c for as many Go-called C functions
as possible.
This CL is a replay of the bulk of CL 15400047 and CL 15790043,
both of which were reviewed as part of the NaCl port and are
checked in to the NaCl branch. This CL is part of bringing the
NaCl code into the main tree.
No new code here, just reformatting and occasional movement
into .h files.
LGTM=r
R=dave, alex.brainman, r
CC=golang-codereviews
https://golang.org/cl/65220044
This cleans up the code significantly, and it avoids any
possible problems with madvise zeroing out some but
not all of the data.
Fixes#6400.
LGTM=dave
R=dvyukov, dave
CC=golang-codereviews
https://golang.org/cl/57680046
The issue was that one of the MSpan_Sweep callers
was doing sweep with preemption enabled.
Additional checks are added.
LGTM=rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/62990043
State of the world:
CL 46430043 introduced a new concurrent sweep but is broken.
CL 62360043 made the new sweep non-concurrent
to try to fix the world while we understand what's wrong with
the concurrent version.
This CL fixes the non-concurrent form to run finalizers.
This CL is just a band-aid to get the build green again.
Dmitriy is working on understanding and then fixing what's
wrong with the concurrent sweep.
TBR=dvyukov
CC=golang-codereviews
https://golang.org/cl/62370043
Moves sweep phase out of stoptheworld by adding
background sweeper goroutine and lazy on-demand sweeping.
It turned out to be somewhat trickier than I expected,
because there is no point in time when we know size of live heap
nor consistent number of mallocs and frees.
So everything related to next_gc, mprof, memstats, etc becomes trickier.
At the end of GC next_gc is conservatively set to heap_alloc*GOGC,
which is much larger than real value. But after every sweep
next_gc is decremented by freed*GOGC. So when everything is swept
next_gc becomes what it should be.
For mprof I had to introduce 3-generation scheme (allocs, revent_allocs, prev_allocs),
because by the end of GC we know number of frees for the *previous* GC.
Significant caution is required to not cross yet-unknown real value of next_gc.
This is achieved by 2 means:
1. Whenever I allocate a span from MCentral, I sweep a span in that MCentral.
2. Whenever I allocate N pages from MHeap, I sweep until at least N pages are
returned to heap.
This provides quite strong guarantees that heap does not grow when it should now.
http-1
allocated 7036 7033 -0.04%
allocs 60 60 +0.00%
cputime 51050 46700 -8.52%
gc-pause-one 34060569 1777993 -94.78%
gc-pause-total 2554 133 -94.79%
latency-50 178448 170926 -4.22%
latency-95 284350 198294 -30.26%
latency-99 345191 220652 -36.08%
rss 101564416 101007360 -0.55%
sys-gc 6606832 6541296 -0.99%
sys-heap 88801280 87752704 -1.18%
sys-other 7334208 7405928 +0.98%
sys-stack 524288 524288 +0.00%
sys-total 103266608 102224216 -1.01%
time 50339 46533 -7.56%
virtual-mem 292990976 293728256 +0.25%
garbage-1
allocated 2983818 2990889 +0.24%
allocs 62880 62902 +0.03%
cputime 16480000 16190000 -1.76%
gc-pause-one 828462467 487875135 -41.11%
gc-pause-total 4142312 2439375 -41.11%
rss 1151709184 1153712128 +0.17%
sys-gc 66068352 66068352 +0.00%
sys-heap 1039728640 1039728640 +0.00%
sys-other 37776064 40770176 +7.93%
sys-stack 8781824 8781824 +0.00%
sys-total 1152354880 1155348992 +0.26%
time 16496998 16199876 -1.80%
virtual-mem 1409564672 1402281984 -0.52%
LGTM=rsc
R=golang-codereviews, sameer, rsc, iant, jeremyjackins, gobot
CC=golang-codereviews, khr
https://golang.org/cl/46430043
Currently windows crashes because early allocs in schedinit
try to allocate tiny memory blocks, but m->p is not yet setup.
I've considered calling procresize(1) earlier in schedinit,
but this refactoring is better and must fix the issue as well.
Fixes#7218.
R=golang-codereviews, r
CC=golang-codereviews
https://golang.org/cl/54570045
There is more zeroing than I would like right now -
temporaries used for the new map and channel runtime
calls need to be eliminated - but it will do for now.
This CL only has an effect if you are building with
GOEXPERIMENT=precisestack ./all.bash
(or make.bash). It costs about 5% in the overall time
spent in all.bash. That number will come down before
we make it on by default, but this should be enough for
Keith to try using the precise maps for copying stacks.
amd64 only (and it's not really great generated code).
TBR=khr, iant
CC=golang-codereviews
https://golang.org/cl/56430043
Introduces two-phase goroutine parking mechanism -- prepare to park, commit park.
This mechanism does not require backing mutex to protect wait predicate.
Use it in netpoll. See comment in netpoll.goc for details.
This slightly reduces contention between reader, writer and read/write io notifications;
and just eliminates a bunch of mutex operations from hotpaths, thus making then faster.
benchmark old ns/op new ns/op delta
BenchmarkTCP4ConcurrentReadWrite 2109 1945 -7.78%
BenchmarkTCP4ConcurrentReadWrite-2 1162 1113 -4.22%
BenchmarkTCP4ConcurrentReadWrite-4 798 755 -5.39%
BenchmarkTCP4ConcurrentReadWrite-8 803 748 -6.85%
BenchmarkTCP4Persistent 9411 9240 -1.82%
BenchmarkTCP4Persistent-2 5888 5813 -1.27%
BenchmarkTCP4Persistent-4 4016 3968 -1.20%
BenchmarkTCP4Persistent-8 3943 3857 -2.18%
R=golang-codereviews, mikioh.mikioh, gobot, iant, rsc
CC=golang-codereviews, khr
https://golang.org/cl/45700043
Currently we collect (add) all roots into a global array in a single-threaded GC phase.
This hinders parallelism.
With this change we just kick off parallel for for number_of_goroutines+5 iterations.
Then parallel for callback decides whether it needs to scan stack of a goroutine
scan data segment, scan finalizers, etc. This eliminates the single-threaded phase entirely.
This requires to store all goroutines in an array instead of a linked list
(to allow direct indexing).
This CL also removes DebugScan functionality. It is broken because it uses
unbounded stack, so it can not run on g0. When it was working, I've found
it helpless for debugging issues because the two algorithms are too different now.
This change would require updating the DebugScan, so it's simpler to just delete it.
With 8 threads this change reduces GC pause by ~6%, while keeping cputime roughly the same.
garbage-8
allocated 2987886 2989221 +0.04%
allocs 62885 62887 +0.00%
cputime 21286000 21272000 -0.07%
gc-pause-one 26633247 24885421 -6.56%
gc-pause-total 873570 811264 -7.13%
rss 242089984 242515968 +0.18%
sys-gc 13934336 13869056 -0.47%
sys-heap 205062144 205062144 +0.00%
sys-other 12628288 12628288 +0.00%
sys-stack 11534336 11927552 +3.41%
sys-total 243159104 243487040 +0.13%
time 2809477 2740795 -2.44%
R=golang-codereviews, rsc
CC=cshapiro, golang-codereviews, khr
https://golang.org/cl/46860043
Instead of a per-goroutine stack of defers for all sizes,
introduce per-P defer pool for argument sizes 8, 24, 40, 56, 72 bytes.
For a program that starts 1e6 goroutines and then joins then:
old: rss=6.6g virtmem=10.2g time=4.85s
new: rss=4.5g virtmem= 8.2g time=3.48s
R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/42750044
Currently for 2-word blocks we set the flag to clear the flag. Makes no sense.
In particular on 32-bits we call memclr always.
R=golang-codereviews, dave, iant
CC=golang-codereviews, khr, rsc
https://golang.org/cl/41170044
Example of output:
goroutine 4 [sleep for 3 min]:
time.Sleep(0x34630b8a000)
src/pkg/runtime/time.goc:31 +0x31
main.func·002()
block.go:16 +0x2c
created by main.main
block.go:17 +0x33
Full program and output are here:
http://play.golang.org/p/NEZdADI3TdFixes#6809.
R=golang-codereviews, khr, kamil.kisiel, bradfitz, rsc
CC=golang-codereviews
https://golang.org/cl/50420043
The spans array is allocated in runtime·mallocinit. On a
32-bit system the number of entries in the spans array is
MaxArena32 / PageSize, which (2U << 30) / (1 << 12) == (1 << 19).
So we are allocating an array that can hold 19 bits for an
index that can hold 20 bits. According to the comment in the
function, this is intentional: we only allocate enough spans
(and bitmaps) for a 2G arena, because allocating more would
probably be wasteful.
But since the span index is simply the upper 20 bits of the
memory address, this scheme only works if memory addresses are
limited to the low 2G of memory. That would be OK if we were
careful to enforce it, but we're not. What we are careful to
enforce, in functions like runtime·MHeap_SysAlloc, is that we
always return addresses between the heap's arena_start and
arena_start + MaxArena32.
We generally get away with it because we start allocating just
after the program end, so we only run into trouble with
programs that allocate a lot of memory, enough to get past
address 0x80000000.
This changes the code that computes a span index to subtract
arena_start on 32-bit systems just as we currently do on
64-bit systems.
R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/49460043
record finalizers and heap profile info. Enables
removing the special bit from the heap bitmap. Also
provides a generic mechanism for annotating occasional
heap objects.
finalizers
overhead per obj
old 680 B 80 B avg
new 16 B/span 48 B
profile
overhead per obj
old 32KB 24 B + hash tables
new 16 B/span 24 B
R=cshapiro, khr, dvyukov, gobot
CC=golang-codereviews
https://golang.org/cl/13314053
On the plus side, we don't need to change the bits when mallocing
pointerless objects. On the other hand, we need to mark objects in the
free lists during GC. But the free lists are small at GC time, so it
should be a net win.
benchmark old ns/op new ns/op delta
BenchmarkMalloc8 40 33 -17.65%
BenchmarkMalloc16 45 38 -15.72%
BenchmarkMallocTypeInfo8 58 59 +0.85%
BenchmarkMallocTypeInfo16 63 64 +1.10%
R=golang-dev, rsc, dvyukov
CC=cshapiro, golang-dev
https://golang.org/cl/41040043
Adds the Pool type and docs, and use it in fmt.
This is a temporary implementation, until Dmitry
makes it fast.
Uses the API proposal from Russ in http://goo.gl/cCKeb2 but
adds an optional New field, as used in fmt and elsewhere.
Almost all callers want that.
Update #4720
R=golang-dev, rsc, cshapiro, iant, r, dvyukov, khr
CC=golang-dev
https://golang.org/cl/41860043
When enabled this new debugging mode will allocate objects on
their own page and never recycle memory addresses. This is an
essential tool to root cause a broad class of heap corruption.
R=golang-dev, dave, daniel.morsing, dvyukov, rsc, iant, cshapiro
CC=golang-dev
https://golang.org/cl/22060046
This change allows the garbage collector to examine stack
slots that are determined as live and containing a pointer
value by the garbage collector. This results in a mean
reduction of 65% in the number of stack slots scanned during
an invocation of "GOGC=1 all.bash".
Unfortunately, this does not yet allow garbage collection to
be precise for the stack slots computed as live. Pointers
confound the determination of what definitions reach a given
instruction. In general, this problem is not solvable without
runtime cost but some advanced cooperation from the compiler
might mitigate common cases.
R=golang-dev, rsc, cshapiro
CC=golang-dev
https://golang.org/cl/14430048
The code for call site-specific pointer bitmaps was not ready in time,
but the zeroing required without it is too expensive to use by default.
We will have to wait for precise collection of stack frames until Go 1.3.
The precise collection can be re-enabled by
GOEXPERIMENT=precisestack ./all.bash
but that will not be the default for a Go 1.2 build.
Fixes#6087.
R=golang-dev, jeremyjackins, dan.kortschak, r
CC=golang-dev
https://golang.org/cl/13677045
Currently lots of sys allocations are not accounted in any of XxxSys,
including GC bitmap, spans table, GC roots blocks, GC finalizer blocks,
iface table, netpoll descriptors and more. Up to ~20% can unaccounted.
This change introduces 2 new stats: GCSys and OtherSys for GC metadata
and all other misc allocations, respectively.
Also ensures that all XxxSys indeed sum up to Sys. All sys memory allocation
functions require the stat for accounting, so that it's impossible to miss something.
Also fix updating of mcache_sys/inuse, they were not updated after deallocation.
test/bench/garbage/parser before:
Sys 670064344
HeapSys 610271232
StackSys 65536
MSpanSys 14204928
MCacheSys 16384
BuckHashSys 1439992
after:
Sys 670064344
HeapSys 610271232
StackSys 65536
MSpanSys 14188544
MCacheSys 16384
BuckHashSys 3194304
GCSys 39198688
OtherSys 3129656
Fixes#5799.
R=rsc, dave, alex.brainman
CC=golang-dev
https://golang.org/cl/12946043
When searching for an allocated bit, flushptrbuf would search
backward in the bitmap word containing the bit of pointer
being looked-up before searching the span. This extra check
was not replicated in markonly which, instead, after not
finding an allocated bit for a pointer would directly look in
the span.
Using statistics generated from godoc, before this change span
lookups were, on average, more common than word lookups. It
was common for markonly to consult spans for one third of its
pointer lookups. With this change in place, what were
previously span lookups are overwhelmingly become by the word
lookups making the total number of span lookups a relatively
small fraction of the whole.
This change also introduces some statistics gathering about
lookups guarded by the CollectStats enum.
R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/13311043
#pragma textflag and #pragma dataflag directives.
Update dataflag directives to use symbols instead of integer constants.
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/13310043
the use of the flag, especially for objects which actually do have
pointers but we don't want the GC to scan them.
R=golang-dev, cshapiro
CC=golang-dev
https://golang.org/cl/13181045
GC acquires worldsema, which is a goroutine-level semaphore
which parks goroutines. g0 can not be parked.
Fixes#6193.
R=khr, khr
CC=golang-dev
https://golang.org/cl/12880045
Update the original change but do not read interface types in
the arguments area. Once the arguments area is zeroed as the
locals area is we can safely read interface type values there
too.
««« original CL description
undo CL 12785045 / 71ce80dc4195
This has broken the 32-bit builds.
««« original CL description
cmd/gc, runtime: use type information to scan interface values
R=golang-dev, rsc, dvyukov
CC=golang-dev
https://golang.org/cl/12785045
»»»
R=khr, golang-dev, khr
CC=golang-dev
https://golang.org/cl/13010045
»»»
R=khr, khr
CC=golang-dev
https://golang.org/cl/13073045
This has broken the 32-bit builds.
««« original CL description
cmd/gc, runtime: use type information to scan interface values
R=golang-dev, rsc, dvyukov
CC=golang-dev
https://golang.org/cl/12785045
»»»
R=khr, golang-dev, khr
CC=golang-dev
https://golang.org/cl/13010045
Originally the requirement was f(x) where f's argument is
exactly x's type.
CL 11858043 relaxed the requirement in a non-standard
way: f's argument must be exactly x's type or interface{}.
If we're going to relax the requirement, it should be done
in a way consistent with the rest of Go. This CL allows f's
argument to have any type for which x is assignable;
that's the same requirement the compiler would impose
if compiling f(x) directly.
Fixes#5368.
R=dvyukov, bradfitz, pieter
CC=golang-dev
https://golang.org/cl/12895043
Prior to this change, pointer maps encoded the disposition of
a word using a single bit. A zero signaled a non-pointer
value and a one signaled a pointer value. Interface values,
which are a effectively a union type, were conservatively
labeled as a pointer.
This change widens the logical element size of the pointer map
to two bits per word. As before, zero signals a non-pointer
value and one signals a pointer value. Additionally, a two
signals an iface pointer and a three signals an eface pointer.
Following other changes to the runtime, values two and three
will allow a type information to drive interpretation of the
subsequent word so only those interface values containing a
pointer value will be scanned.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12689046
The mutex, fdMutex, handles locking and lifetime of sysfd,
and serializes Read and Write methods.
This allows to strip 2 sync.Mutex.Lock calls,
2 sync.Mutex.Unlock calls, 1 defer and some amount
of misc overhead from every network operation.
On linux/amd64, Intel E5-2690:
benchmark old ns/op new ns/op delta
BenchmarkTCP4Persistent 9595 9454 -1.47%
BenchmarkTCP4Persistent-2 8978 8772 -2.29%
BenchmarkTCP4ConcurrentReadWrite 4900 4625 -5.61%
BenchmarkTCP4ConcurrentReadWrite-2 2603 2500 -3.96%
In general it strips 70-500 ns from every network operation depending
on processor model. On my relatively new E5-2690 it accounts to ~5%
of network op cost.
Fixes#6074.
R=golang-dev, bradfitz, alex.brainman, iant, mikioh.mikioh
CC=golang-dev
https://golang.org/cl/12418043
Previously, all word aligned locations in the local variables
area were scanned as conservative roots. With this change, a
bitmap is generated describing the locations of pointer values
in local variables.
With this change the argument bitmap information has been
changed to only store information about arguments. The locals
member, has been removed. In its place, the bitmap data for
local variables is now used to store the size of locals. If
the size is negative, the magnitude indicates the size of the
local variables area.
R=rsc
CC=golang-dev
https://golang.org/cl/12328044
gcpc/gcsp are used by GC in similar situation.
gcpc/gcsp are also more stable than gp->sched,
because gp->sched is mutated by entersyscall/exitsyscall
in morestack and mcall. So it has higher chances of being inconsistent.
Also, rename gcpc/gcsp to syscallpc/syscallsp.
This is the same as reverted change 12250043
with save marked as textflag 7.
The problem was that if save calls morestack,
then subsequent lessstack spoils g->sched.pc/sp.
And that bad values were remembered in g->syscallpc/sp.
Entersyscallblock had the same problem,
but it was never triggered to date.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12478043
Break all 386 builders.
««« original CL description
runtime: use gcpc/gcsp during traceback of goroutines in syscalls
gcpc/gcsp are used by GC in similar situation.
gcpc/gcsp are also more stable than gp->sched,
because gp->sched is mutated by entersyscall/exitsyscall
in morestack and mcall. So it has higher chances of being inconsistent.
Also, rename gcpc/gcsp to syscallpc/syscallsp.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12250043
»»»
R=rsc
CC=golang-dev
https://golang.org/cl/12424045
It was needed for the old scheduler,
because there temporary could be more threads than gomaxprocs.
In the new scheduler gomaxprocs is always respected.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12438043
gcpc/gcsp are used by GC in similar situation.
gcpc/gcsp are also more stable than gp->sched,
because gp->sched is mutated by entersyscall/exitsyscall
in morestack and mcall. So it has higher chances of being inconsistent.
Also, rename gcpc/gcsp to syscallpc/syscallsp.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12250043
Make it accept type, combine flags.
Several reasons for the change:
1. mallocgc and settype must be atomic wrt GC
2. settype is called from only one place now
3. it will help performance (eventually settype
functionality must be combined with markallocated)
4. flags are easier to read now (no mallocgc(sz, 0, 1, 0) anymore)
R=golang-dev, iant, nightlyone, rsc, dave, khr, bradfitz, r
CC=golang-dev
https://golang.org/cl/10136043
This CL introduces a FUNCDATA number for runtime-specific
garbage collection metadata, changes the C and Go compilers
to emit that metadata, and changes the runtime to expect it.
The old pseudo-instructions that carried this information
are gone, as is the linker code to process them.
R=golang-dev, dvyukov, cshapiro
CC=golang-dev
https://golang.org/cl/11406044
Currently preemption signal g->stackguard0==StackPreempt
can be lost if it is received when preemption is disabled
(e.g. m->lock!=0). This change duplicates the preemption
signal in g->preempt and restores g->stackguard0
when preemption is enabled.
Update #543.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/10792043
Design at http://golang.org/s/go12symtab.
This enables some cleanup of the garbage collector metadata
that will be done in future CLs.
This CL does not move the old symtab and pclntab back into
an unmapped section of the file. That's a bit tricky and will be
done separately.
Fixes#4020.
R=golang-dev, dave, cshapiro, iant, r
CC=golang-dev, nigeltao
https://golang.org/cl/11085043
Currently it replaces GOGCTRACE env var (GODEBUG=gctrace=1).
The plan is to extend it with other type of debug tracing,
e.g. GODEBUG=gctrace=1,schedtrace=100.
R=rsc
CC=bradfitz, daniel.morsing, gobot, golang-dev
https://golang.org/cl/10026045
Until now, the goroutine state has been scattered during the
execution of newstack and oldstack. It's all there, and those routines
know how to get back to a working goroutine, but other pieces of
the system, like stack traces, do not. If something does interrupt
the newstack or oldstack execution, the rest of the system can't
understand the goroutine. For example, if newstack decides there
is an overflow and calls throw, the stack tracer wouldn't dump the
goroutine correctly.
For newstack to save a useful state snapshot, it needs to be able
to rewind the PC in the function that triggered the split back to
the beginning of the function. (The PC is a few instructions in, just
after the call to morestack.) To make that possible, we change the
prologues to insert a jmp back to the beginning of the function
after the call to morestack. That is, the prologue used to be roughly:
TEXT myfunc
check for split
jmpcond nosplit
call morestack
nosplit:
sub $xxx, sp
Now an extra instruction is inserted after the call:
TEXT myfunc
start:
check for split
jmpcond nosplit
call morestack
jmp start
nosplit:
sub $xxx, sp
The jmp is not executed directly. It is decoded and simulated by
runtime.rewindmorestack to discover the beginning of the function,
and then the call to morestack returns directly to the start label
instead of to the jump instruction. So logically the jmp is still
executed, just not by the cpu.
The prologue thus repeats in the case of a function that needs a
stack split, but against the cost of the split itself, the extra few
instructions are noise. The repeated prologue has the nice effect of
making a stack split double-check that the new stack is big enough:
if morestack happens to return on a too-small stack, we'll now notice
before corruption happens.
The ability for newstack to rewind to the beginning of the function
should help preemption too. If newstack decides that it was called
for preemption instead of a stack split, it now has the goroutine state
correctly paused if rescheduling is needed, and when the goroutine
can run again, it can return to the start label on its original stack
and re-execute the split check.
Here is an example of a split stack overflow showing the full
trace, without any special cases in the stack printer.
(This one was triggered by making the split check incorrect.)
runtime: newstack framesize=0x0 argsize=0x18 sp=0x6aebd0 stack=[0x6b0000, 0x6b0fa0]
morebuf={pc:0x69f5b sp:0x6aebd8 lr:0x0}
sched={pc:0x68880 sp:0x6aebd0 lr:0x0 ctxt:0x34e700}
runtime: split stack overflow: 0x6aebd0 < 0x6b0000
fatal error: runtime: split stack overflow
goroutine 1 [stack split]:
runtime.mallocgc(0x290, 0x100000000, 0x1)
/Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:21 fp=0x6aebd8
runtime.new()
/Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:682 +0x5b fp=0x6aec08
go/build.(*Context).Import(0x5ae340, 0xc210030c71, 0xa, 0xc2100b4380, 0x1b, ...)
/Users/rsc/g/go/src/pkg/go/build/build.go:424 +0x3a fp=0x6b00a0
main.loadImport(0xc210030c71, 0xa, 0xc2100b4380, 0x1b, 0xc2100b42c0, ...)
/Users/rsc/g/go/src/cmd/go/pkg.go:249 +0x371 fp=0x6b01a8
main.(*Package).load(0xc21017c800, 0xc2100b42c0, 0xc2101828c0, 0x0, 0x0, ...)
/Users/rsc/g/go/src/cmd/go/pkg.go:431 +0x2801 fp=0x6b0c98
main.loadPackage(0x369040, 0x7, 0xc2100b42c0, 0x0)
/Users/rsc/g/go/src/cmd/go/pkg.go:709 +0x857 fp=0x6b0f80
----- stack segment boundary -----
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc2100e6c00, 0xc2100e5750, ...)
/Users/rsc/g/go/src/cmd/go/build.go:539 +0x437 fp=0x6b14a0
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc21015b400, 0x2, ...)
/Users/rsc/g/go/src/cmd/go/build.go:528 +0x1d2 fp=0x6b1658
main.(*builder).test(0xc2100902a0, 0xc210092000, 0x0, 0x0, 0xc21008ff60, ...)
/Users/rsc/g/go/src/cmd/go/test.go:622 +0x1b53 fp=0x6b1f68
----- stack segment boundary -----
main.runTest(0x5a6b20, 0xc21000a020, 0x2, 0x2)
/Users/rsc/g/go/src/cmd/go/test.go:366 +0xd09 fp=0x6a5cf0
main.main()
/Users/rsc/g/go/src/cmd/go/main.go:161 +0x4f9 fp=0x6a5f78
runtime.main()
/Users/rsc/g/go/src/pkg/runtime/proc.c:183 +0x92 fp=0x6a5fa0
runtime.goexit()
/Users/rsc/g/go/src/pkg/runtime/proc.c:1266 fp=0x6a5fa8
And here is a seg fault during oldstack:
SIGSEGV: segmentation violation
PC=0x1b2a6
runtime.oldstack()
/Users/rsc/g/go/src/pkg/runtime/stack.c:159 +0x76
runtime.lessstack()
/Users/rsc/g/go/src/pkg/runtime/asm_amd64.s:270 +0x22
goroutine 1 [stack unsplit]:
fmt.(*pp).printArg(0x2102e64e0, 0xe5c80, 0x2102c9220, 0x73, 0x0, ...)
/Users/rsc/g/go/src/pkg/fmt/print.go:818 +0x3d3 fp=0x221031e6f8
fmt.(*pp).doPrintf(0x2102e64e0, 0x12fb20, 0x2, 0x221031eb98, 0x1, ...)
/Users/rsc/g/go/src/pkg/fmt/print.go:1183 +0x15cb fp=0x221031eaf0
fmt.Sprintf(0x12fb20, 0x2, 0x221031eb98, 0x1, 0x1, ...)
/Users/rsc/g/go/src/pkg/fmt/print.go:234 +0x67 fp=0x221031eb40
flag.(*stringValue).String(0x2102c9210, 0x1, 0x0)
/Users/rsc/g/go/src/pkg/flag/flag.go:180 +0xb3 fp=0x221031ebb0
flag.(*FlagSet).Var(0x2102f6000, 0x293d38, 0x2102c9210, 0x143490, 0xa, ...)
/Users/rsc/g/go/src/pkg/flag/flag.go:633 +0x40 fp=0x221031eca0
flag.(*FlagSet).StringVar(0x2102f6000, 0x2102c9210, 0x143490, 0xa, 0x12fa60, ...)
/Users/rsc/g/go/src/pkg/flag/flag.go:550 +0x91 fp=0x221031ece8
flag.(*FlagSet).String(0x2102f6000, 0x143490, 0xa, 0x12fa60, 0x0, ...)
/Users/rsc/g/go/src/pkg/flag/flag.go:563 +0x87 fp=0x221031ed38
flag.String(0x143490, 0xa, 0x12fa60, 0x0, 0x161950, ...)
/Users/rsc/g/go/src/pkg/flag/flag.go:570 +0x6b fp=0x221031ed80
testing.init()
/Users/rsc/g/go/src/pkg/testing/testing.go:-531 +0xbb fp=0x221031edc0
strings_test.init()
/Users/rsc/g/go/src/pkg/strings/strings_test.go:1115 +0x62 fp=0x221031ef70
main.init()
strings/_test/_testmain.go:90 +0x3d fp=0x221031ef78
runtime.main()
/Users/rsc/g/go/src/pkg/runtime/proc.c:180 +0x8a fp=0x221031efa0
runtime.goexit()
/Users/rsc/g/go/src/pkg/runtime/proc.c:1269 fp=0x221031efa8
goroutine 2 [runnable]:
runtime.MHeap_Scavenger()
/Users/rsc/g/go/src/pkg/runtime/mheap.c:438
runtime.goexit()
/Users/rsc/g/go/src/pkg/runtime/proc.c:1269
created by runtime.main
/Users/rsc/g/go/src/pkg/runtime/proc.c:166
rax 0x23ccc0
rbx 0x23ccc0
rcx 0x0
rdx 0x38
rdi 0x2102c0170
rsi 0x221032cfe0
rbp 0x221032cfa0
rsp 0x7fff5fbff5b0
r8 0x2102c0120
r9 0x221032cfa0
r10 0x221032c000
r11 0x104ce8
r12 0xe5c80
r13 0x1be82baac718
r14 0x13091135f7d69200
r15 0x0
rip 0x1b2a6
rflags 0x10246
cs 0x2b
fs 0x0
gs 0x0
Fixes#5723.
R=r, dvyukov, go.peter.90, dave, iant
CC=golang-dev
https://golang.org/cl/10360048
If first GC runs concurrently with setGCPercent,
it can overwrite gcpercent value with default.
R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/10242047
No need to change to Grunnable state.
Add some more checks for Grunning state.
R=golang-dev, rsc, khr, dvyukov
CC=golang-dev
https://golang.org/cl/10186045