1
0
mirror of https://github.com/golang/go synced 2024-10-04 06:31:22 -06:00
Commit Graph

262 Commits

Author SHA1 Message Date
Dmitriy Vyukov
101c00a44f runtime: fix dump of data/bss
Fixes #8530.

LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, rsc
https://golang.org/cl/124440043
2014-08-18 16:42:24 +04:00
Dmitriy Vyukov
c6fe53a230 runtime: mark with non-atomic operations when GOMAXPROCS=1
Perf builders show 3-5% GC pause increase with GOMAXPROCS=1 when marking with atomic ops:
http://goperfd.appspot.com/perfdetail?commit=a8a6e765d6a87f7ccb71fd85a60eb5a821151f85&commit0=3b864e02b987171e05e2e9d0840b85b5b6476386&kind=builder&builder=linux-amd64&benchmark=http

LGTM=rlh
R=golang-codereviews, rlh
CC=dave, golang-codereviews, khr, rsc
https://golang.org/cl/128340043
2014-08-16 09:07:55 +04:00
Dmitriy Vyukov
c5e648d684 runtime: fix getgcmask
bv.data is an array of uint32s but the code was using
offsets computed for an array of bytes.
Add a test for stack GC info.
Fixes #8531.

LGTM=rsc
R=golang-codereviews
CC=golang-codereviews, khr, rsc
https://golang.org/cl/124450043
2014-08-15 22:36:12 +04:00
Dmitriy Vyukov
0c32bd6262 runtime: mark objects with non-atomic operations
On the go.benchmarks/garbage benchmark with GOMAXPROCS=16:
                   old ns/op     new ns/op     delta
time               1392254       1353170       -2.81%
cputime            21995751      21373999      -2.83%
gc-pause-one       15044812      13050524      -13.26%
gc-pause-total     213636        185317        -13.26%

LGTM=rlh
R=golang-codereviews, rlh
CC=golang-codereviews, khr, rsc
https://golang.org/cl/123380043
2014-08-14 21:38:24 +04:00
Dmitriy Vyukov
187d0f6720 runtime: keep objects in free lists marked as allocated.
Restore https://golang.org/cl/41040043 after GC rewrite.
Original description:
On the plus side, we don't need to change the bits on malloc and free.
On the downside, we need to mark objects in the free lists during GC.
But the free lists are small at GC time, so it should be a net win.

benchmark             old ns/op     new ns/op     delta
BenchmarkMalloc8      21.9          20.4          -6.85%
BenchmarkMalloc16     31.1          29.6          -4.82%

LGTM=khr
R=khr
CC=golang-codereviews, rlh, rsc
https://golang.org/cl/122280043
2014-08-13 20:42:55 +04:00
Peter Collingbourne
e03bce158f cmd/cc, runtime: eliminate use of the unnamed substructure C extension
Eliminating use of this extension makes it easier to port the Go runtime
to other compilers. This CL also disables the extension in cc to prevent
accidental use.

LGTM=rsc, khr
R=rsc, aram, khr, dvyukov
CC=axwalk, golang-codereviews
https://golang.org/cl/106790044
2014-08-07 09:00:02 -04:00
Dmitriy Vyukov
aac7f1a0d6 runtime: convert markallocated from C to Go
benchmark                      old ns/op     new ns/op     delta
BenchmarkMalloc8               28.7          22.4          -21.95%
BenchmarkMalloc16              44.8          33.8          -24.55%
BenchmarkMallocTypeInfo8       49.0          32.9          -32.86%
BenchmarkMallocTypeInfo16      46.7          35.8          -23.34%
BenchmarkMallocLargeStruct     907           901           -0.66%
BenchmarkGobDecode             13235542      12036851      -9.06%
BenchmarkGobEncode             10639699      9539155       -10.34%
BenchmarkJSONEncode            25193036      21898922      -13.08%
BenchmarkJSONDecode            96104044      89464904      -6.91%

Fixes #8452.

LGTM=khr
R=golang-codereviews, bradfitz, rsc, dave, khr
CC=golang-codereviews
https://golang.org/cl/122090043
2014-08-07 13:34:30 +04:00
Dmitriy Vyukov
cd2f8356ce runtime: remove mal/malloc/FlagNoGC/FlagNoInvokeGC
FlagNoGC is unused now.
FlagNoInvokeGC is unneeded as we don't invoke GC
on g0 and when holding locks anyway.
mal/malloc have very few uses and you never remember
the exact set of flags they use and the difference between them.
Moreover, eventually we need to give exact types to all allocations,
something what mal/malloc do not support.

LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, rsc
https://golang.org/cl/117580043
2014-08-07 13:04:04 +04:00
Dmitriy Vyukov
192bccbf33 runtime: shrink stacks in parallel
Shrinkstack does not touch normal heap anymore,
so we can shink stacks concurrently with marking.

LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, khr, rlh, rsc
https://golang.org/cl/122130043
2014-08-07 12:55:28 +04:00
Keith Randall
e359bea8ad runtime: clean up naming of mcallable functions.
Introduce the mFunction type to represent an mcall/onM-able function.
Name such functions using _m.

LGTM=bradfitz
R=bradfitz
CC=golang-codereviews
https://golang.org/cl/121320043
2014-08-06 14:33:57 -07:00
Dmitriy Vyukov
f6f2f77142 runtime: cache one GC workbuf in thread-local storage
We call scanblock for lots of small root pieces
e.g. for every stack frame args and locals area.
Every scanblock invocation calls getempty/putempty,
which accesses lock-free stack shared among all worker threads.
One-element local cache allows most scanblock calls
to proceed without accessing the shared stack.

LGTM=rsc
R=golang-codereviews, rlh
CC=golang-codereviews, khr, rsc
https://golang.org/cl/121250043
2014-08-06 01:50:37 +04:00
Ian Lance Taylor
ab5d105ba9 runtime: use memmove rather than memcopy in mgc0.c
For consistency with other code, as that was the only use of
memcopy outside of alg.goc.

LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/122030044
2014-08-04 20:40:44 -07:00
Dmitriy Vyukov
cecca43804 runtime: get rid of free
Several reasons:
1. Significantly simplifies runtime.
2. This code proved to be buggy.
3. Free is incompatible with bump-the-pointer allocation.
4. We want to write runtime in Go, Go does not have free.
5. Too much code to free env strings on startup.

LGTM=khr
R=golang-codereviews, josharian, tracey.brendan, khr
CC=bradfitz, golang-codereviews, r, rlh, rsc
https://golang.org/cl/116390043
2014-07-31 12:55:40 +04:00
Keith Randall
4aa50434e1 runtime: rewrite malloc in Go.
This change introduces gomallocgc, a Go clone of mallocgc.
Only a few uses have been moved over, so there are still
lots of uses from C. Many of these C uses will be moved
over to Go (e.g. in slice.goc), but probably not all.
What should remain of C's mallocgc is an open question.

LGTM=rsc, dvyukov
R=rsc, khr, dave, bradfitz, dvyukov
CC=golang-codereviews
https://golang.org/cl/108840046
2014-07-30 09:01:52 -07:00
Dmitriy Vyukov
cd17a717f9 runtime: simpler and faster GC
Implement the design described in:
https://docs.google.com/document/d/1v4Oqa0WwHunqlb8C3ObL_uNQw3DfSY-ztoA-4wWbKcg/pub

Summary of the changes:
GC uses "2-bits per word" pointer type info embed directly into bitmap.
Scanning of stacks/data/heap is unified.
The old spans types go away.
Compiler generates "sparse" 4-bits type info for GC (directly for GC bitmap).
Linker generates "dense" 2-bits type info for data/bss (the same as stacks use).

Summary of results:
-1680 lines of code total (-1000+ in mgc0.c only)
-25% memory consumption
-3-7% binary size
-15% GC pause reduction
-7% run time reduction

LGTM=khr
R=golang-codereviews, rsc, christoph, khr
CC=golang-codereviews, rlh
https://golang.org/cl/106260045
2014-07-29 11:01:02 +04:00
Dmitriy Vyukov
92c54e4a73 runtime: simplify code
LGTM=khr
R=golang-codereviews, dave, khr
CC=golang-codereviews, rsc
https://golang.org/cl/116950043
2014-07-22 01:56:01 +04:00
Keith Randall
2425a2e32f runtime: fix gctrace=1
updatememstats is called on both the m and g stacks.
Call into flushallmcaches correctly.  flushallmcaches
can only run on the M stack.

This is somewhat temporary.  once ReadMemStats is in
Go we can have all of this code M-only.

LGTM=dvyukov
R=golang-codereviews, dvyukov
CC=golang-codereviews
https://golang.org/cl/116880043
2014-07-18 13:05:21 +04:00
Keith Randall
f378f30034 undo CL 101570044 / 2c57aaea79c4
redo stack allocation.  This is mostly the same as
the original CL with a few bug fixes.

1. add racemalloc() for stack allocations
2. fix poolalloc/poolfree to terminate free lists correctly.
3. adjust span ref count correctly.
4. don't use cache for sizes >= StackCacheSize.

Should fix bugs and memory leaks in original changelist.

««« original CL description
undo CL 104200047 / 318b04f28372

Breaks windows and race detector.
TBR=rsc

««« original CL description
runtime: stack allocator, separate from mallocgc

In order to move malloc to Go, we need to have a
separate stack allocator.  If we run out of stack
during malloc, malloc will not be available
to allocate a new stack.

Stacks are the last remaining FlagNoGC objects in the
GC heap.  Once they are out, we can get rid of the
distinction between the allocated/blockboundary bits.
(This will be in a separate change.)

Fixes #7468
Fixes #7424

LGTM=rsc, dvyukov
R=golang-codereviews, dvyukov, khr, dave, rsc
CC=golang-codereviews
https://golang.org/cl/104200047
»»»

TBR=rsc
CC=golang-codereviews
https://golang.org/cl/101570044
»»»

LGTM=dvyukov
R=dvyukov, dave, khr, alex.brainman
CC=golang-codereviews
https://golang.org/cl/112240044
2014-07-17 14:41:46 -07:00
Dmitriy Vyukov
d3be3daafe runtime: delete unnecessary confusing code
The code in GC that handles gp->gobuf.ctxt is wrong,
because it does not mark the ctxt object itself,
if just queues the ctxt object for scanning.
So the ctxt object can be collected as garbage.
However, Gobuf.ctxt is void*, so it's always marked and
scanned through G.

LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, khr, rsc
https://golang.org/cl/105490044
2014-07-03 22:58:42 +04:00
Keith Randall
3cf83c182a undo CL 104200047 / 318b04f28372
Breaks windows and race detector.
TBR=rsc

««« original CL description
runtime: stack allocator, separate from mallocgc

In order to move malloc to Go, we need to have a
separate stack allocator.  If we run out of stack
during malloc, malloc will not be available
to allocate a new stack.

Stacks are the last remaining FlagNoGC objects in the
GC heap.  Once they are out, we can get rid of the
distinction between the allocated/blockboundary bits.
(This will be in a separate change.)

Fixes #7468
Fixes #7424

LGTM=rsc, dvyukov
R=golang-codereviews, dvyukov, khr, dave, rsc
CC=golang-codereviews
https://golang.org/cl/104200047
»»»

TBR=rsc
CC=golang-codereviews
https://golang.org/cl/101570044
2014-06-30 19:48:08 -07:00
Keith Randall
7c13860cd0 runtime: stack allocator, separate from mallocgc
In order to move malloc to Go, we need to have a
separate stack allocator.  If we run out of stack
during malloc, malloc will not be available
to allocate a new stack.

Stacks are the last remaining FlagNoGC objects in the
GC heap.  Once they are out, we can get rid of the
distinction between the allocated/blockboundary bits.
(This will be in a separate change.)

Fixes #7468
Fixes #7424

LGTM=rsc, dvyukov
R=golang-codereviews, dvyukov, khr, dave, rsc
CC=golang-codereviews
https://golang.org/cl/104200047
2014-06-30 18:59:24 -07:00
Dmitriy Vyukov
5bfe8adee5 runtime: fix GC bitmap corruption
Fixes #8299.

R=golang-codereviews
CC=golang-codereviews, khr, rsc
https://golang.org/cl/103640044
2014-06-28 19:20:46 -07:00
Dmitriy Vyukov
03f2189a1b runtime: make garbage collector faster by deleting code again
Remove GC bitmap backward scanning.
This was already done once in https://golang.org/cl/5530074/
Still makes GC a bit faster.
On the garbage benchmark, before:
        gc-pause-one=237345195
        gc-pause-total=4746903
        cputime=32427775
        time=32458208
after:
        gc-pause-one=235484019
        gc-pause-total=4709680
        cputime=31861965
        time=31877772
Also prepares mgc0.c for future changes.

R=golang-codereviews, khr, khr
CC=golang-codereviews, rsc
https://golang.org/cl/105380043
2014-06-27 18:19:02 -07:00
Russ Cox
89f185fe8a all: remove 'extern register M *m' from runtime
The runtime has historically held two dedicated values g (current goroutine)
and m (current thread) in 'extern register' slots (TLS on x86, real registers
backed by TLS on ARM).

This CL removes the extern register m; code now uses g->m.

On ARM, this frees up the register that formerly held m (R9).
This is important for NaCl, because NaCl ARM code cannot use R9 at all.

The Go 1 macrobenchmarks (those with per-op times >= 10 µs) are unaffected:

BenchmarkBinaryTree17              5491374955     5471024381     -0.37%
BenchmarkFannkuch11                4357101311     4275174828     -1.88%
BenchmarkGobDecode                 11029957       11364184       +3.03%
BenchmarkGobEncode                 6852205        6784822        -0.98%
BenchmarkGzip                      650795967      650152275      -0.10%
BenchmarkGunzip                    140962363      141041670      +0.06%
BenchmarkHTTPClientServer          71581          73081          +2.10%
BenchmarkJSONEncode                31928079       31913356       -0.05%
BenchmarkJSONDecode                117470065      113689916      -3.22%
BenchmarkMandelbrot200             6008923        5998712        -0.17%
BenchmarkGoParse                   6310917        6327487        +0.26%
BenchmarkRegexpMatchMedium_1K      114568         114763         +0.17%
BenchmarkRegexpMatchHard_1K        168977         169244         +0.16%
BenchmarkRevcomp                   935294971      914060918      -2.27%
BenchmarkTemplate                  145917123      148186096      +1.55%

Minux previous reported larger variations, but these were caused by
run-to-run noise, not repeatable slowdowns.

Actual code changes by Minux.
I only did the docs and the benchmarking.

LGTM=dvyukov, iant, minux
R=minux, josharian, iant, dave, bradfitz, dvyukov
CC=golang-codereviews
https://golang.org/cl/109050043
2014-06-26 11:54:39 -04:00
Dmitriy Vyukov
cc81712190 runtime: remove obsolete afterprologue check
Afterprologue check was required when did not know
about return arguments of functions and/or they were not zeroed.
Now 100% precision is required for stacks due to stack copying,
so it must work w/o afterprologue one way or another.
I can limit this change for 1.3 to merely adding a TODO,
but this check is super confusing so I don't want this knowledge to get lost.

LGTM=rsc
R=golang-codereviews, gobot, rsc, khr
CC=golang-codereviews, khr, rsc
https://golang.org/cl/96580045
2014-06-19 22:04:10 -07:00
Russ Cox
14d2ee1d00 runtime: make continuation pc available to stack walk
The 'continuation pc' is where the frame will continue
execution, if anywhere. For a frame that stopped execution
due to a CALL instruction, the continuation pc is immediately
after the CALL. But for a frame that stopped execution due to
a fault, the continuation pc is the pc after the most recent CALL
to deferproc in that frame, or else 0. That is where execution
will continue, if anywhere.

The liveness information is only recorded for CALL instructions.
This change makes sure that we never look for liveness information
except for CALL instructions.

Using a valid PC fixes crashes when a garbage collection or
stack copying tries to process a stack frame that has faulted.

Record continuation pc in heapdump (format change).

Fixes #8048.

LGTM=iant, khr
R=khr, iant, dvyukov
CC=golang-codereviews, r
https://golang.org/cl/100870044
2014-05-31 10:10:12 -04:00
Dmitriy Vyukov
e893acf184 runtime: fix freeOSMemory to free memory immediately
Currently freeOSMemory makes only marking phase of GC, but not sweeping phase.
So recently memory is not released after freeOSMemory.
Do both marking and sweeping during freeOSMemory.
Fixes #8019.

LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, rsc
https://golang.org/cl/97550043
2014-05-19 12:06:30 +04:00
Russ Cox
68aaf2ccda runtime: make scan of pointer-in-interface same as scan of pointer
The GC program describing a data structure sometimes trusts the
pointer base type and other times does not (if not, the garbage collector
must fall back on per-allocation type information stored in the heap).
Make the scanning of a pointer in an interface do the same.
This fixes a crash in a particular use of reflect.SliceHeader.

Fixes #8004.

LGTM=khr
R=golang-codereviews, khr
CC=0xe2.0x9a.0x9b, golang-codereviews, iant, r
https://golang.org/cl/100470045
2014-05-15 15:53:36 -04:00
Dmitriy Vyukov
a12661329b runtime: fix triggering of forced GC
mstats.last_gc is unix time now, it is compared with abstract monotonic time.
On my machine GC is forced every 5 mins regardless of last_gc.

LGTM=rsc
R=golang-codereviews
CC=golang-codereviews, iant, rsc
https://golang.org/cl/91350045
2014-05-13 09:53:03 +04:00
Dmitriy Vyukov
acb03b8028 runtime: optimize markspan
Increases throughput by 2x on a memory hungry program on 8-node NUMA machine.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/100230043
2014-05-07 19:32:34 +04:00
Dmitriy Vyukov
350a8fcde1 runtime: make MemStats.LastGC Unix time again
The monotonic clock patch changed all runtime times
to abstract monotonic time. As the result user-visible
MemStats.LastGC become monotonic time as well.
Restore Unix time for LastGC.

This is the simplest way to expose time.now to runtime that I found.
Another option would be to change time.now to C called
int64 runtime.unixnanotime() and then express time.now in terms of it.
But this would require to introduce 2 64-bit divisions into time.now.
Another option would be to change time.now to C called
void runtime.unixnanotime1(struct {int64 sec, int32 nsec} *now)
and then express both time.now and runtime.unixnanotime in terms of it.

Fixes #7852.

LGTM=minux.ma, iant
R=minux.ma, rsc, iant
CC=golang-codereviews
https://golang.org/cl/93720045
2014-05-02 17:32:42 +01:00
Russ Cox
a5b1530557 runtime: adjust GC debug print to include source pointers
Having the pointers means you can grub around in the
binary finding out more about them.

This helped with issue 7748.

LGTM=minux.ma, bradfitz
R=golang-codereviews, minux.ma, bradfitz
CC=golang-codereviews
https://golang.org/cl/88090045
2014-04-16 11:39:43 -04:00
Dmitriy Vyukov
55e0f36fb4 runtime: fix program termination when main goroutine calls Goexit
Do not consider idle finalizer/bgsweep/timer goroutines as doing something useful.
We can't simply set isbackground for the whole lifetime of the goroutines,
because when finalizer goroutine calls user function, we do want to consider it
as doing something useful.
This is borken due to timers for quite some time.
With background sweep is become even more broken.
Fixes #7784.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/87960044
2014-04-15 19:48:17 +04:00
Dmitriy Vyukov
8fc6ed4c89 sync: less agressive local caching in Pool
Currently Pool can cache up to 15 elements per P, and these elements are not accesible to other Ps.
If a Pool caches large objects, say 2MB, and GOMAXPROCS is set to a large value, say 32,
then the Pool can waste up to 960MB.
The new caching policy caches at most 1 per-P element, the rest is shared between Ps.

Get/Put performance is unchanged. Nested Get/Put performance is 57% worse.
However, overall scalability of nested Get/Put is significantly improved,
so the new policy starts winning under contention.

benchmark                     old ns/op     new ns/op     delta
BenchmarkPool                 27.4          26.7          -2.55%
BenchmarkPool-4               6.63          6.59          -0.60%
BenchmarkPool-16              1.98          1.87          -5.56%
BenchmarkPool-64              1.93          1.86          -3.63%
BenchmarkPoolOverlflow        3970          6235          +57.05%
BenchmarkPoolOverlflow-4      10935         1668          -84.75%
BenchmarkPoolOverlflow-16     13419         520           -96.12%
BenchmarkPoolOverlflow-64     10295         380           -96.31%

LGTM=rsc
R=rsc
CC=golang-codereviews, khr
https://golang.org/cl/86020043
2014-04-14 21:13:32 +04:00
Russ Cox
5539ef02b6 runtime: make times in GODEBUG=gctrace=1 output clearer
TBR=0intro
CC=golang-codereviews
https://golang.org/cl/86620043
2014-04-10 14:34:48 -04:00
Russ Cox
95ee7d6414 runtime: use 3x fewer nanotime calls in garbage collection
Cuts the number of calls from 6 to 2 in the non-debug case.

LGTM=iant
R=golang-codereviews, iant
CC=0intro, aram, golang-codereviews, khr
https://golang.org/cl/86040043
2014-04-09 10:38:12 -04:00
Russ Cox
72c5d5e756 reflect, runtime: fix crash in GC due to reflect.call + precise GC
Given
        type Outer struct {
                *Inner
                ...
        }
the compiler generates the implementation of (*Outer).M dispatching to
the embedded Inner. The implementation is logically:
        func (p *Outer) M() {
                (p.Inner).M()
        }
but since the only change here is the replacement of one pointer
receiver with another, the actual generated code overwrites the
original receiver with the p.Inner pointer and then jumps to the M
method expecting the *Inner receiver.

During reflect.Value.Call, we create an argument frame and the
associated data structures to describe it to the garbage collector,
populate the frame, call reflect.call to run a function call using
that frame, and then copy the results back out of the frame. The
reflect.call function does a memmove of the frame structure onto the
stack (to set up the inputs), runs the call, and the memmoves the
stack back to the frame structure (to preserve the outputs).

Originally reflect.call did not distinguish inputs from outputs: both
memmoves were for the full stack frame. However, in the case where the
called function was one of these wrappers, the rewritten receiver is
almost certainly a different type than the original receiver. This is
not a problem on the stack, where we use the program counter to
determine the type information and understand that during (*Outer).M
the receiver is an *Outer while during (*Inner).M the receiver in the
same memory word is now an *Inner. But in the statically typed
argument frame created by reflect, the receiver is always an *Outer.
Copying the modified receiver pointer off the stack into the frame
will store an *Inner there, and then if a garbage collection happens
to scan that argument frame before it is discarded, it will scan the
*Inner memory as if it were an *Outer. If the two have different
memory layouts, the collection will intepret the memory incorrectly.

Fix by only copying back the results.

Fixes #7725.

LGTM=khr
R=khr
CC=dave, golang-codereviews
https://golang.org/cl/85180043
2014-04-08 11:11:35 -04:00
Russ Cox
28f1868fed cmd/gc, runtime: make GODEBUG=gcdead=1 mode work with liveness
Trying to make GODEBUG=gcdead=1 work with liveness
and in particular ambiguously live variables.

1. In the liveness computation, mark all ambiguously live
variables as live for the entire function, except the entry.
They are zeroed directly after entry, and we need them not
to be poisoned thereafter.

2. In the liveness computation, compute liveness (and deadness)
for all parameters, not just pointer-containing parameters.
Otherwise gcdead poisons untracked scalar parameters and results.

3. Fix liveness debugging print for -live=2 to use correct bitmaps.
(Was not updated for compaction during compaction CL.)

4. Correct varkill during map literal initialization.
Was killing the map itself instead of the inserted value temp.

5. Disable aggressive varkill cleanup for call arguments if
the call appears in a defer or go statement.

6. In the garbage collector, avoid bug scanning empty
strings. An empty string is two zeros. The multiword
code only looked at the first zero and then interpreted
the next two bits in the bitmap as an ordinary word bitmap.
For a string the bits are 11 00, so if a live string was zero
length with a 0 base pointer, the poisoning code treated
the length as an ordinary word with code 00, meaning it
needed poisoning, turning the string into a poison-length
string with base pointer 0. By the same logic I believe that
a live nil slice (bits 11 01 00) will have its cap poisoned.
Always scan full multiword struct.

7. In the runtime, treat both poison words (PoisonGC and
PoisonStack) as invalid pointers that warrant crashes.

Manual testing as follows:

- Create a script called gcdead on your PATH containing:

        #!/bin/bash
        GODEBUG=gcdead=1 GOGC=10 GOTRACEBACK=2 exec "$@"
- Now you can build a test and then run 'gcdead ./foo.test'.
- More importantly, you can run 'go test -short -exec gcdead std'
   to run all the tests.

Fixes #7676.

While here, enable the precise scanning of slices, since that was
disabled due to bugs like these. That now works, both with and
without gcdead.

Fixes #7549.

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83410044
2014-04-03 20:33:25 -04:00
Russ Cox
4676fae525 cmd/gc, cmd/ld, runtime: compact liveness bitmaps
Reduce footprint of liveness bitmaps by about 5x.

1. Mark all liveness bitmap symbols as 4-byte aligned
(they were aligned to a larger size by default).

2. The bitmap data is a bitmap count n followed by n bitmaps.
Each bitmap begins with its own count m giving the number
of bits. All the m's are the same for the n bitmaps.
Emit this bitmap length once instead of n times.

3. Many bitmaps within a function have the same bit values,
but each call site was given a distinct bitmap. Merge duplicate
bitmaps so that no bitmap is written more than once.

4. Many functions end up with the same aggregate bitmap data.
We used to name the bitmap data funcname.gcargs and funcname.gclocals.
Instead, name it gclocals.<md5 of data> and mark it dupok so
that the linker coalesces duplicate sets. This cut the bitmap
data remaining after step 3 by 40%; I was not expecting it to
be quite so dramatic.

Applied to "go build -ldflags -w code.google.com/p/go.tools/cmd/godoc":

                bitmaps           pclntab           binary on disk
before this CL  1326600           1985854           12738268
4-byte align    1154288 (0.87x)   1985854 (1.00x)   12566236 (0.99x)
one bitmap len   782528 (0.54x)   1985854 (1.00x)   12193500 (0.96x)
dedup bitmap     414748 (0.31x)   1948478 (0.98x)   11787996 (0.93x)
dedup bitmap set 245580 (0.19x)   1948478 (0.98x)   11620060 (0.91x)

While here, remove various dead blocks of code from plive.c.

Fixes #6929.
Fixes #7568.

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83630044
2014-04-02 16:49:27 -04:00
Russ Cox
1ec4d5e9e7 runtime: adjust GODEBUG=allocfreetrace=1 and GODEBUG=gcdead=1
GODEBUG=allocfreetrace=1:

The allocfreetrace=1 mode prints a stack trace for each block
allocated and freed, and also a stack trace for each garbage collection.

It was implemented by reusing the heap profiling support: if allocfreetrace=1
then the heap profile was effectively running at 1 sample per 1 byte allocated
(always sample). The stack being shown at allocation was the stack gathered
for profiling, meaning it was derived only from the program counters and
did not include information about function arguments or frame pointers.
The stack being shown at free was the allocation stack, not the free stack.
If you are generating this log, you can find the allocation stack yourself, but
it can be useful to see exactly the sequence that led to freeing the block:
was it the garbage collector or an explicit free? Now that the garbage collector
runs on an m0 stack, the stack trace for the garbage collector was never interesting.

Fix all these problems:

1. Decouple allocfreetrace=1 from heap profiling.
2. Print the standard goroutine stack traces instead of a custom format.
3. Print the stack trace at time of allocation for an allocation,
   and print the stack trace at time of free (not the allocation trace again)
   for a free.
4. Print all goroutine stacks at garbage collection. Having all the stacks
   means that you can see the exact point at which each goroutine was
   preempted, which is often useful for identifying liveness-related errors.

GODEBUG=gcdead=1:

This mode overwrites dead pointers with a poison value.
Detect the poison value as an invalid pointer during collection,
the same way that small integers are invalid pointers.

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/81670043
2014-04-01 13:30:10 -04:00
Russ Cox
5a23a7e52c runtime: enable 'bad pointer' check during garbage collection of Go stack frames
This is the same check we use during stack copying.
The check cannot be applied to C stack frames, even
though we do emit pointer bitmaps for the arguments,
because (1) the pointer bitmaps assume all arguments
are always live, not true of outputs during the prologue,
and (2) the pointer bitmaps encode interface values as
pointer pairs, not true of interfaces holding integers.

For the rest of the frames, however, we should hold ourselves
to the rule that a pointer marked live really is initialized.
The interface scanning already implicitly checks this
because it interprets the type word  as a valid type pointer.

This may slow things down a little because of the extra loads.
Or it may speed things up because we don't bother enqueuing
nil pointers anymore. Enough of the rest of the system is slow
right now that we can't measure it meaningfully.
Enable for now, even if it is slow, to shake out bugs in the
liveness bitmaps, and then decide whether to turn it off
for the Go 1.3 release (issue 7650 reminds us to do this).

The new m->traceback field lets us force printing of fp=
values on all goroutine stack traces when we detect a
bad pointer. This makes it easier to understand exactly
where in the frame the bad pointer is, so that we can trace
it back to a specific variable and determine what is wrong.

Update #7650

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/80860044
2014-03-27 14:06:15 -04:00
Dmitriy Vyukov
f8c350873c runtime: fix yet another race in bgsweep
Currently it's possible that bgsweep finishes before all spans
have been swept (we only know that sweeping of all spans has *started*).
In such case bgsweep may fail wake up runfinq goroutine when it needs to.
finq may still be nil at this point, but some finalizers may be queued later.
Make bgsweep to wait for sweeping to *complete*, then it can decide
whether it needs to wake up runfinq for sure.
Update #7533

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/75960043
2014-03-26 15:11:36 +04:00
Dmitriy Vyukov
40f5e67571 runtime: minor improvement of string scanning
If we set obj, then it will be enqueued for marking at the end of the scanning loop.
This is not necessary, since we've already marked it.
This can wait for 1.4 if you wish.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/80030043
2014-03-26 15:03:58 +04:00
Keith Randall
fff63c2448 runtime: WriteHeapDump dumps the heap to a file.
See http://golang.org/s/go13heapdump for the file format.

LGTM=rsc
R=rsc, bradfitz, dvyukov, khr
CC=golang-codereviews
https://golang.org/cl/37540043
2014-03-25 15:09:49 -07:00
Keith Randall
1b45cc45e3 runtime: redo stack map entries to avoid false retention
Change two-bit stack map entries to encode:
0 = dead
1 = scalar
2 = pointer
3 = multiword

If multiword, the two-bit entry for the following word encodes:
0 = string
1 = slice
2 = iface
3 = eface

That way, during stack scanning we can check if a string
is zero length or a slice has zero capacity.  We can avoid
following the contained pointer in those cases.  It is safe
to do so because it can never be dereferenced, and it is
desirable to do so because it may cause false retention
of the following block in memory.

Slice feature turned off until issue 7564 is fixed.

Update #7549

LGTM=rsc
R=golang-codereviews, bradfitz, rsc
CC=golang-codereviews
https://golang.org/cl/76380043
2014-03-25 14:11:34 -07:00
Ian Lance Taylor
4ebfa83199 runtime: accurately record whether heap memory is reserved
The existing code did not have a clear notion of whether
memory has been actually reserved.  It checked based on
whether in 32-bit mode or 64-bit mode and (on GNU/Linux) the
requested address, but it confused the requested address and
the returned address.

LGTM=rsc
R=rsc, dvyukov
CC=golang-codereviews, michael.hudson
https://golang.org/cl/79610043
2014-03-25 13:22:19 -07:00
Ian Lance Taylor
8de04c78b7 runtime: change nproc local variable to uint32
The nproc and ndone fields are uint32.  This makes the type
consistent.

LGTM=minux.ma
R=golang-codereviews, minux.ma
CC=golang-codereviews
https://golang.org/cl/79340044
2014-03-25 05:18:08 -07:00
Dmitriy Vyukov
fed5428c4a runtime: fix another race in bgsweep
It's possible that bgsweep constantly does not catch up for some reason,
in this case runfinq was not woken at all.

R=rsc
CC=golang-codereviews
https://golang.org/cl/75940043
2014-03-14 23:32:12 +04:00
Dmitriy Vyukov
8d321625fd runtime: fix spans corruption
The problem was that spans end up in wrong lists after split
(e.g. in h->busy instead of h->central->empty).
Also the span can be non-swept before split,
I don't know what it can cause, but it's safer to operate on swept spans.
Fixes #7544.

R=golang-codereviews, rsc
CC=golang-codereviews, khr
https://golang.org/cl/76160043
2014-03-14 23:25:48 +04:00
Dmitriy Vyukov
0da73b9f07 runtime: fix a race in bgsweep
See the comment for description.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/75670044
2014-03-14 21:21:44 +04:00