1
0
mirror of https://github.com/golang/go synced 2024-11-14 18:40:29 -07:00
Commit Graph

82 Commits

Author SHA1 Message Date
Austin Clements
24a7252e25 runtime: finish sweeping before concurrent GC starts
Currently, the concurrent sweep follows a 1:1 rule: when allocation
needs a span, it sweeps a span (likewise, when a large allocation
needs N pages, it sweeps until it frees N pages). This rule worked
well for the STW collector (especially when GOGC==100) because it did
no more sweeping than necessary to keep the heap from growing, would
generally finish sweeping just before GC, and ensured good temporal
locality between sweeping a page and allocating from it.

It doesn't work well with concurrent GC. Since concurrent GC requires
starting GC earlier (sometimes much earlier), the sweep often won't be
done when GC starts. Unfortunately, the first thing GC has to do is
finish the sweep. In the mean time, the mutator can continue
allocating, pushing the heap size even closer to the goal size. This
worked okay with the 7/8ths trigger, but it gets into a vicious cycle
with the GC trigger controller: if the mutator is allocating quickly
and driving the trigger lower, more and more sweep work will be left
to GC; this both causes GC to take longer (allowing the mutator to
allocate more during GC) and delays the start of the concurrent mark
phase, which throws off the GC controller's statistics and generally
causes it to push the trigger even lower.

As an example of a particularly bad case, the garbage benchmark with
GOMAXPROCS=4 and -benchmem 512 (MB) spends the first 0.4-0.8 seconds
of each GC cycle sweeping, during which the heap grows by between
109MB and 252MB.

To fix this, this change replaces the 1:1 sweep rule with a
proportional sweep rule. At the end of GC, GC knows exactly how much
heap allocation will occur before the next concurrent GC as well as
how many span pages must be swept. This change computes this "sweep
ratio" and when the mallocgc asks for a span, the mcentral sweeps
enough spans to bring the swept span count into ratio with the
allocated byte count.

On the benchmark from above, this entirely eliminates sweeping at the
beginning of GC, which reduces the time between startGC readying the
GC goroutine and GC stopping the world for sweep termination to ~100µs
during which the heap grows at most 134KB.

Change-Id: I35422d6bba0c2310d48bb1f8f30a72d29e98c1af
Reviewed-on: https://go-review.googlesource.com/8921
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-21 15:35:46 +00:00
Austin Clements
a0452a6821 runtime: proportional response GC trigger controller
Currently, concurrent GC triggers at a fixed 7/8*GOGC heap growth. For
mutators that allocate slowly, this means GC will trigger too early
and run too often, wasting CPU time on GC. For mutators that allocate
quickly, this means GC will trigger too late, causing the program to
exceed the GOGC heap growth goal and/or to exceed CPU goals because of
a high mutator assist ratio.

This change adds a feedback control loop to dynamically adjust the GC
trigger from cycle to cycle. By monitoring the heap growth and GC CPU
utilization from cycle to cycle, this adjusts the Go garbage collector
to target the GOGC heap growth goal and the 25% CPU utilization goal.

Change-Id: Ic82eef288c1fa122f73b69fe604d32cbb219e293
Reviewed-on: https://go-review.googlesource.com/8851
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-21 15:35:37 +00:00
Austin Clements
8d03acce54 runtime: multi-threaded, utilization-scheduled background mark
Currently, the concurrent mark phase is performed by the main GC
goroutine. Prior to the previous commit enabling preemption, this
caused marking to always consume 1/GOMAXPROCS of the available CPU
time. If GOMAXPROCS=1, this meant background GC would consume 100% of
the CPU (effectively a STW). If GOMAXPROCS>4, background GC would use
less than the goal of 25%. If GOMAXPROCS=4, background GC would use
the goal 25%, but if the mutator wasn't using the remaining 75%,
background marking wouldn't take advantage of the idle time. Enabling
preemption in the previous commit made GC miss CPU targets in
completely different ways, but set us up to bring everything back in
line.

This change replaces the fixed GC goroutine with per-P background mark
goroutines. Once started, these goroutines don't go in the standard
run queues; instead, they are scheduled specially such that the time
spent in mutator assists and the background mark goroutines totals 25%
of the CPU time available to the program. Furthermore, this lets
background marking take advantage of idle Ps, which significantly
boosts GC performance for applications that under-utilize the CPU.

This requires also changing how time is reported for gctrace, so this
change splits the concurrent mark CPU time into assist/background/idle
scanning.

This also requires increasing the size of the StackRecord slice used
in a GoroutineProfile test.

Change-Id: I0936ff907d2cee6cb687a208f2df47e8988e3157
Reviewed-on: https://go-review.googlesource.com/8850
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-21 15:35:32 +00:00
Austin Clements
af060c3086 runtime: generally allow preemption during concurrent GC phases
Currently, the entire GC process runs with g.m.preemptoff set. In the
concurrent phases, the parts that actually need preemption disabled
are run on a system stack and there's no overall need to stay on the
same M or P during the concurrent phases. Hence, move the setting of
g.m.preemptoff to when we start mark termination, at which point we
really do need preemption disabled.

This dramatically changes the scheduling behavior of the concurrent
mark phase. Currently, since this is non-preemptible, concurrent mark
gets one dedicated P (so 1/GOMAXPROCS utilization). With this change,
the GC goroutine is scheduled like any other goroutine during
concurrent mark, so it gets 1/<runnable goroutines> utilization.

You might think it's not even necessary to set g.m.preemptoff at that
point since the world is stopped, but stackalloc/stackfree use this as
a signal that the per-P pools are not safe to access without
synchronization.

Change-Id: I08aebe8179a7d304650fb8449ff36262b3771099
Reviewed-on: https://go-review.googlesource.com/8839
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-21 15:35:27 +00:00
Austin Clements
100da60979 runtime: track time spent in mutator assists
This time is tracked per P and periodically flushed to the global
controller state. This will be used to compute mutator assist
utilization in order to schedule background GC work.

Change-Id: Ib94f90903d426a02cf488bf0e2ef67a068eb3eec
Reviewed-on: https://go-review.googlesource.com/8837
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-21 15:35:22 +00:00
Austin Clements
4b2fde945a runtime: proportional mutator assist
Currently, mutator allocation periodically assists the garbage
collector by performing a small, fixed amount of scanning work.
However, to control heap growth, mutators need to perform scanning
work *proportional* to their allocation rate.

This change implements proportional mutator assists. This uses the
scan work estimate computed by the garbage collector at the beginning
of each cycle to compute how much scan work must be performed per
allocation byte to complete the estimated scan work by the time the
heap reaches the goal size. When allocation triggers an assist, it
uses this ratio and the amount allocated since the last assist to
compute the assist work, then attempts to steal as much of this work
as possible from the background collector's credit, and then performs
any remaining scan work itself.

Change-Id: I98b2078147a60d01d6228b99afd414ef857e4fba
Reviewed-on: https://go-review.googlesource.com/8836
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-21 15:35:18 +00:00
Austin Clements
8e24283a28 runtime: track background scan work credit
This tracks scan work done by background GC in a global pool. Mutator
assists will draw on this credit to avoid doing work when background
GC is staying ahead.

Unlike the other GC controller tracking variables, this will be both
written and read throughout the cycle. Hence, we can't arbitrarily
delay updates like we can for scan work and bytes marked. However, we
still want to minimize contention, so this global credit pool is
allowed some error from the "true" amount of credit. Background GC
accumulates credit locally up to a limit and only then flushes to the
global pool. Similarly, mutator assists will draw from the credit pool
in batches.

Change-Id: I1aa4fc604b63bf53d1ee2a967694dffdfc3e255e
Reviewed-on: https://go-review.googlesource.com/8834
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-21 15:35:09 +00:00
Austin Clements
4e9fc0df48 runtime: implement GC scan work estimator
This implements tracking the scan work ratio of a GC cycle and using
this to estimate the scan work that will be required by the next GC
cycle. Currently this estimate is unused; it will be used to drive
mutator assists.

Change-Id: I8685b59d89cf1d83eddfc9b30d84da4e3a7f4b72
Reviewed-on: https://go-review.googlesource.com/8833
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-21 15:35:04 +00:00
Austin Clements
571ebae6ef runtime: track scan work performed during concurrent mark
This tracks the amount of scan work in terms of scanned pointers
during the concurrent mark phase. We'll use this information to
estimate scan work for the next cycle.

Currently this aggregates the work counter in gcWork and dispose
atomically aggregates this into a global work counter. dispose happens
relatively infrequently, so the contention on the global counter
should be low. If this turns out to be an issue, we can reduce the
number of disposes, and if it's still a problem, we can switch to
per-P counters.

Change-Id: Iac0364c466ee35fab781dbbbe7970a5f3c4e1fc1
Reviewed-on: https://go-review.googlesource.com/8832
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-21 15:35:00 +00:00
Russ Cox
6a2b0c0b6d runtime: delete cgo_allocate
This memory is untyped and can't be used anymore.
The next version of SWIG won't need it.

Change-Id: I592b287c5f5186975ee09a9b28d8efe3b57134e7
Reviewed-on: https://go-review.googlesource.com/8956
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-04-17 01:30:47 +00:00
Austin Clements
4b956ae317 runtime: start concurrent GC promptly when we reach its trigger
Currently, when allocation reaches the concurrent GC trigger size, we
start the concurrent collector by ready'ing its G. This simply puts it
on the end of the P's run queue, which means we may not actually start
GC for some time as the current G continues to run and then the P
drains other Gs already on its run queue. Since the mutator can
continue to allocate, the heap can potentially be much larger than we
intended by the time GC actually starts. Furthermore, how much larger
is difficult to predict since it depends on the scheduler.

Fix this by preempting the current G and switching directly to the
concurrent GC G as soon as we reach the trigger heap size.

On the garbage benchmark from the benchmarks subrepo with
GOMAXPROCS=4, this reduces the time from triggering the GC to the
beginning of sweep termination by 10 to 30 milliseconds, which reduces
allocation after the trigger by up to 10MB (a large fraction of the
64MB live heap the benchmark tries to maintain).

One other known source of delay before we "really" start GC is the
sweep finalization performed before sweep termination. This has
similar negative effects on heap size and predictability, but is an
orthogonal problem. This change adds a TODO for this.

Change-Id: I8bae98cb43685c1bf353ff55868e4647e3743c47
Reviewed-on: https://go-review.googlesource.com/8513
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-10 18:22:52 +00:00
Austin Clements
6afb5fa48f runtime: remove GoSched/GoStart trace events around GC
These were appropriate for STW GC, since it interrupted the allocating
Goroutine, but don't apply to concurrent GC, which runs on its own
Goroutine. Forced GC is still STW, but it makes sense to attribute the
GC to the goroutine that called runtime.GC().

Change-Id: If12418ca66dc7e53b8b16025af4e03adb5d9577e
Reviewed-on: https://go-review.googlesource.com/8715
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-10 18:21:52 +00:00
Michael Hudson-Doyle
a1f57598cc runtime, cmd/internal/ld: rename themoduledata to firstmoduledata
'themoduledata' doesn't really make sense now we support multiple moduledata
objects.

Change-Id: I8263045d8f62a42cb523502b37289b0fba054f62
Reviewed-on: https://go-review.googlesource.com/8521
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-04-10 05:11:49 +00:00
Michael Hudson-Doyle
fae4a128cb runtime, reflect: support multiple moduledata objects
This changes all the places that consult themoduledata to consult a
linked list of moduledata objects, as will be necessary for
-linkshared to work.

Obviously, as there is as yet no way of adding moduledata objects to
this list, all this change achieves right now is wasting a few
instructions here and there.

Change-Id: I397af7f60d0849b76aaccedf72238fe664867051
Reviewed-on: https://go-review.googlesource.com/8231
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-04-10 04:51:42 +00:00
Austin Clements
cb10ff1ef9 runtime: report next_gc for initial heap size in gctrace
Currently, the initial heap size reported in the gctrace line is the
heap_live right before sweep termination. However, we triggered GC
when heap_live reached next_gc, and there may have been significant
allocation between that point and the beginning of sweep
termination. Ideally these would be essentially the same, but
currently there's scheduler delay when readying the GC goroutine as
well as delay from background sweep finalization.

We should fix this delay, but in the mean time, to give the user a
better idea of how much the heap grew during the whole of garbage
collection, report the trigger rather than what the heap size happened
to be after the garbage collector finished rolling out of bed. This
will also be more useful for heap growth plots.

Change-Id: I08476b9fbcfb2de90592405e9c9f434dfb9eb1f8
Reviewed-on: https://go-review.googlesource.com/8512
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-09 22:18:06 +00:00
Austin Clements
8c3fc088fb runtime: report marked heap size in gctrace
When the gctrace GODEBUG option is enabled, it will now report three
heap sizes: the heap size at the beginning of the GC cycle, the heap
size at the end of the GC cycle before sweeping, and marked heap size,
which is the amount of heap that will be retained until the next GC
cycle.

Change-Id: Ie13f8a6d5c609bc9cc47c7555960ab55b37b5f1c
Reviewed-on: https://go-review.googlesource.com/8430
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-06 21:28:23 +00:00
Austin Clements
6d12b1780e runtime: make next_gc be heap size to trigger GC at
In the STW collector, next_gc was both the heap size to trigger GC at
as well as the goal heap size.

Early in the concurrent collector's development, next_gc was the goal
heap size, but was also used as the heap size to trigger GC at. This
meant we always overshot the goal because of allocation during
concurrent GC.

Currently, next_gc is still the goal heap size, but we trigger
concurrent GC at 7/8*GOGC heap growth. This complicates
shouldtriggergc, but was necessary because of the incremental
maintenance of next_gc.

Now we simply compute next_gc for the next cycle during mark
termination. Hence, it's now easy to take the simpler route and
redefine next_gc as the heap size at which the next GC triggers. We
can directly compute this with the 7/8 backoff during mark termination
and shouldtriggergc can simply test if the live heap size has grown
over the next_gc trigger.

This will also simplify later changes once we start setting next_gc in
more sophisticated ways.

Change-Id: I872be4ae06b4f7a0d7f7967360a054bd36b90eea
Reviewed-on: https://go-review.googlesource.com/8420
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-06 21:28:18 +00:00
Austin Clements
d7e0ad4b82 runtime: introduce heap_live; replace use of heap_alloc in GC
Currently there are two main consumers of memstats.heap_alloc:
updatememstats (aka ReadMemStats) and shouldtriggergc.

updatememstats recomputes heap_alloc from the ground up, so we don't
need to keep heap_alloc up to date for it. shouldtriggergc wants to
know how many bytes were marked by the previous GC plus how many bytes
have been allocated since then, but this *isn't* what heap_alloc
tracks. heap_alloc also includes objects that are not marked and
haven't yet been swept.

Introduce a new memstat called heap_live that actually tracks what
shouldtriggergc wants to know and stop keeping heap_alloc up to date.

Unlike heap_alloc, heap_live follows a simple sawtooth that drops
during each mark termination and increases monotonically between GCs.
heap_alloc, on the other hand, has much more complicated behavior: it
may drop during sweep termination, slowly decreases from background
sweeping between GCs, is roughly unaffected by allocation as long as
there are unswept spans (because we sweep and allocate at the same
rate), and may go up after background sweeping is done depending on
the GC trigger.

heap_live simplifies computing next_gc and using it to figure out when
to trigger garbage collection. Currently, we guess next_gc at the end
of a cycle and update it as we sweep and get a better idea of how much
heap was marked. Now, since we're directly tracking how much heap is
marked, we can directly compute next_gc.

This also corrects bugs that could cause us to trigger GC early.
Currently, in any case where sweep termination actually finds spans to
sweep, heap_alloc is an overestimation of live heap, so we'll trigger
GC too early. heap_live, on the other hand, is unaffected by sweeping.

Change-Id: I1f96807b6ed60d4156e8173a8e68745ffc742388
Reviewed-on: https://go-review.googlesource.com/8389
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-06 21:28:13 +00:00
Austin Clements
50a66562a0 runtime: track heap bytes marked by GC
This tracks the number of heap bytes marked by a GC cycle. We'll use
this information to precisely trigger the next GC cycle.

Currently this aggregates the work counter in gcWork and dispose
atomically aggregates this into a global work counter. dispose happens
relatively infrequently, so the contention on the global counter
should be low. If this turns out to be an issue, we can reduce the
number of disposes, and if it's still a problem, we can switch to
per-P counters.

Change-Id: I1bc377cb2e802ef61c2968602b63146d52e7f5db
Reviewed-on: https://go-review.googlesource.com/8388
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-06 21:28:07 +00:00
Austin Clements
f244a1471d runtime: add cumulative GC CPU % to gctrace line
This tracks both total CPU time used by GC and the total time
available to all Ps since the beginning of the program and uses this
to derive a cumulative CPU usage percent for the gctrace line.

Change-Id: Ica85372b8dd45f7621909b325d5ac713a9b0d015
Reviewed-on: https://go-review.googlesource.com/8350
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-02 23:37:13 +00:00
Austin Clements
24ee948269 runtime: update gctrace line for new garbage collector
GODEBUG=gctrace=1 turns on a per-GC cycle trace line. The current line
is left over from the STW garbage collector and includes a lot of
information that is no longer meaningful for the concurrent GC and
doesn't include a lot of information that is important.

Replace this line with a new line designed for the new garbage
collector.

This new line is focused more on helping the user understand the
impact of the garbage collector on their program and less on telling
us, the runtime developers, everything that's happening inside
GC. It's designed to fit in 80 columns and intentionally omit some
potentially useful things that were in the old line. We might want a
"verbose" mode that adds information for us.

We'll be able to further simplify the line once we eliminate the STW
around enabling the write barrier. Then we'll have just one STW phase,
one concurrent phase, and one more STW phase, so we'll be able to
reduce the number of times from five to three.

Change-Id: Icc30939fe4576fb4491b4eac811649395727aa2a
Reviewed-on: https://go-review.googlesource.com/8208
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-02 23:37:06 +00:00
Michael Hudson-Doyle
67426a8a9e runtime, cmd/internal/ld: change runtime to use a single linker symbol
In preparation for being able to run a go program that has code
in several objects, this changes from having several linker
symbols used by the runtime into having one linker symbol that
points at a structure containing the needed data.  Multiple
object support will construct a linked list of such structures.

A follow up will initialize the slices in the themoduledata
structure directly from the linker but I was aiming for a minimal
diff for now.

Change-Id: I613cce35309801cf265a1d5ae5aaca8d689c5cbf
Reviewed-on: https://go-review.googlesource.com/7441
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-03-31 22:45:07 +00:00
Austin Clements
06de3f52a7 runtime: document subtlety around entering mark termination
The barrier in gcDrain does not account for concurrent gcDrainNs
happening in gchelpwork, so it can actually return while there is
still work being done. It turns out this is okay, but for subtle
reasons involving gcDrainN always being run on the system
stack. Document these reasons.

Change-Id: Ib07b3753cc4e2b54533ab3081a359cbd1c3c08fb
Reviewed-on: https://go-review.googlesource.com/7736
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-03-20 14:05:05 +00:00
Russ Cox
631d6a33bf runtime: implement atomicand8 atomically
We're skating on thin ice, and things are finally starting to melt around here.
(I want to avoid the debugging session that will happen when someone
uses atomicand8 expecting it to be atomic with respect to other operations.)

Change-Id: I254f1582be4eb1f2d7fbba05335a91c6bf0c7f02
Reviewed-on: https://go-review.googlesource.com/7861
Reviewed-by: Minux Ma <minux@golang.org>
2015-03-20 04:45:29 +00:00
Austin Clements
fa2f9c2c09 runtime: run concurrent mark phase on regular stack
Currently, the GC's concurrent mark phase runs on the system
stack. There's no need to do this, and running it this way ties up the
entire M and P running the GC by preventing the scheduler from
preempting the GC even during concurrent mark.

Fix this by running concurrent mark on the regular G stack. It's still
non-preemptible because we also set preemptoff around the whole GC
process, but this moves us closer to making it preemptible.

Change-Id: Ia9f1245e299b8c5c513a4b1e3ef13eaa35ac5e73
Reviewed-on: https://go-review.googlesource.com/7730
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-03-19 15:55:12 +00:00
Austin Clements
bef356b282 runtime: improve comment in concurrent GC
"Sync" is not very informative. What's being synchronized and with
whom? Update this comment to explain what we're really doing: enabling
write barriers.

Change-Id: I4f0cbb8771988c7ba4606d566b77c26c64165f0f
Reviewed-on: https://go-review.googlesource.com/7700
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-03-19 15:55:06 +00:00
Austin Clements
d21cef1f8f runtime: remove pointless harvestwbufs
Currently we harvestwbufs the moment we enter the mark phase, even
before starting the world again. Since cached wbufs are only filled
when we're in mark or mark termination, they should all be empty at
this point, making the harvest pointless. Remove the harvest.

We should, but do not currently harvest at the end of the mark phase
when we're running out of work to do.

Change-Id: I5f4ba874f14dd915b8dfbc4ee5bb526eecc2c0b4
Reviewed-on: https://go-review.googlesource.com/7669
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-03-19 15:55:00 +00:00
Rick Hudson
d0eab03091 runtime: Adjust when write barriers are active
Even though the world is stopped the GC may do pointer
writes that need to be protected by write barriers.
This means that the write barrier must be on
continuously from the time the mark phase starts and
the mark termination phase ends. Checks were added to
ensure that no allocation happens during a GC.

Hoist the logic that clears pools the start of the GC
so that the memory can be reclaimed during this GC cycle.

Change-Id: I9d1551ac5db9bac7bac0cb5370d5b2b19a9e6a52
Reviewed-on: https://go-review.googlesource.com/6990
Reviewed-by: Austin Clements <austin@google.com>
2015-03-10 15:04:12 +00:00
Dmitry Vyukov
919fd24884 runtime: remove runtime frames from stacks in traces
Stip uninteresting bottom and top frames from trace stacks.
This makes both binary and json trace files smaller,
and also makes stacks shorter and more readable in the viewer.

Change-Id: Ib9c80ccc280504f0e235f867f53f1d2652c41583
Reviewed-on: https://go-review.googlesource.com/5523
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
2015-03-10 14:46:15 +00:00
Russ Cox
5789b28525 runtime: start GC background sweep eagerly
Starting it lazily causes a memory allocation (for the goroutine) during GC.

First use of channels for runtime implementation.

Change-Id: I9cd24dcadbbf0ee5070ee6d0ed7ea415504f316c
Reviewed-on: https://go-review.googlesource.com/6960
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2015-03-05 21:41:55 +00:00
Dmitry Vyukov
b759e225f5 runtime: bound defer pools (try 2)
The unbounded list-based defer pool can grow infinitely.
This can happen if a goroutine routinely allocates a defer;
then blocks on one P; and then unblocked, scheduled and
frees the defer on another P.
The scenario was reported on golang-nuts list.

We've been here several times. Any unbounded local caches
are bad and grow to infinite size. This change introduces
central defer pool; local pools become fixed-size
with the only purpose of amortizing accesses to the
central pool.

Freedefer now executes on system stack to not consume
nosplit stack space.

Change-Id: I1a27695838409259d1586a0adfa9f92bccf7ceba
Reviewed-on: https://go-review.googlesource.com/3967
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
2015-03-04 14:29:58 +00:00
Dmitry Vyukov
5ef145c809 runtime: bound sudog cache
The unbounded list-based sudog cache can grow infinitely.
This can happen if a goroutine is routinely blocked on one P
and then unblocked and scheduled on another P.
The scenario was reported on golang-nuts list.

We've been here several times. Any unbounded local caches
are bad and grow to infinite size. This change introduces
central sudog cache; local caches become fixed-size
with the only purpose of amortizing accesses to the
central cache.

The change required to move sudog cache from mcache to P,
because mcache is not scanned by GC.

Change-Id: I3bb7b14710354c026dcba28b3d3c8936a8db4e90
Reviewed-on: https://go-review.googlesource.com/3742
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
2015-03-04 14:14:29 +00:00
Austin Clements
07b73ce146 runtime: simplify gcResetGState
Since allglock is held in this function, there's no point to
tip-toeing around allgs.  Just use a for-range loop.

Change-Id: I1ee61c7e8cac8b8ebc8107c0c22f739db5db9840
Reviewed-on: https://go-review.googlesource.com/5882
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-02-25 15:48:57 +00:00
Austin Clements
b3d791c7bb runtime: consolidate gcworkdone/gcscanvalid clearing loops
Previously, we had three loops in the garbage collector that all
cleared the per-G GC flags.  Consolidate these into one function.
This one function is designed to work in a concurrent setting.  As a
result, it's slightly more expensive than the loops it replaces during
STW phases, but these happen at most twice per GC.

Change-Id: Id1ec0074fd58865eb0112b8a0547b267802d0df1
Reviewed-on: https://go-review.googlesource.com/5881
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-02-25 15:46:41 +00:00
Austin Clements
37b8597178 runtime: remove unnecessary gcworkdone resetting loop
The loop in gcMark is redundant with the gcworkdone resetting
performed by markroot, which called a few lines later in gcMark.

Change-Id: Ie0a826a614ecfa79e6e6b866e8d1de40ba515856
Reviewed-on: https://go-review.googlesource.com/5880
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-02-25 15:46:21 +00:00
Rick Hudson
e31e35a0de runtime: reset gcscanvalid and gcworkdone when GODEBUG=gctrace=2
When GODEBUG=gctrace=2 two gcs are preformed. During the first gc
the stack scan sets the g's gcscanvalid and gcworkdone flags to true
indicating that the stacks have to be scanned and do not need to
be rescanned. These need to be reset to false for the second GC so the
stacks are rescanned, otherwise if the only pointer to an object is
on the stack it will not be discovered and the object will be freed.
Typically this will include the object that was just allocated in
the mallocgc call that initiated the GC.

Change-Id: Ic25163f4689905fd810c90abfca777324005c02f
Reviewed-on: https://go-review.googlesource.com/5861
Reviewed-by: Russ Cox <rsc@golang.org>
2015-02-25 00:14:42 +00:00
Russ Cox
89a091de24 runtime: split gc_m into gcMark and gcSweep
This is a nice split but more importantly it provides a better
way to fit the checkmark phase into the sequencing.

Also factor out common span copying into gcSpanCopy.

Change-Id: Ia058644974e4ed4ac3cf4b017a3446eb2284d053
Reviewed-on: https://go-review.googlesource.com/5333
Reviewed-by: Austin Clements <austin@google.com>
2015-02-20 17:00:39 +00:00
Russ Cox
929597b9e9 runtime: unroll gc_m loop
The loop made more sense when gc_m was not its own function.

Change-Id: I71a7f21d777e69c1924e3b534c507476daa4dfdd
Reviewed-on: https://go-review.googlesource.com/5332
Reviewed-by: Austin Clements <austin@google.com>
2015-02-20 17:00:30 +00:00
Russ Cox
2b655c0b92 runtime: tidy GC driver
Change-Id: I0da26e89ae73272e49e82c6549c774e5bc97f64c
Reviewed-on: https://go-review.googlesource.com/5331
Reviewed-by: Austin Clements <austin@google.com>
2015-02-20 17:00:22 +00:00
Russ Cox
5254b7e9ce runtime: do not unmap work.spans until after checkmark phase
This is causing crashes.

Change-Id: I1832f33d114bc29894e491dd2baac45d7ab3a50d
Reviewed-on: https://go-review.googlesource.com/5330
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-02-19 21:33:06 +00:00
Russ Cox
6c4b54f409 runtime: missed change from reorganization CL
That is, I accidentally dropped this change of Austin's
when preparing my CL. I blame Git.

Change-Id: I9dd772c84edefad96c4b16785fdd2dea04a4a0d6
Reviewed-on: https://go-review.googlesource.com/5320
Reviewed-by: Austin Clements <austin@google.com>
2015-02-19 20:46:59 +00:00
Russ Cox
484f801ff4 runtime: reorganize memory code
Move code from malloc1.go, malloc2.go, mem.go, mgc0.go into
appropriate locations.

Factor mgc.go into mgc.go, mgcmark.go, mgcsweep.go, mstats.go.

A lot of this code was in certain files because the right place was in
a C file but it was written in Go, or vice versa. This is one step toward
making things actually well-organized again.

Change-Id: I6741deb88a7cfb1c17ffe0bcca3989e10207968f
Reviewed-on: https://go-review.googlesource.com/5300
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-02-19 20:17:01 +00:00
Austin Clements
02dcdba7c8 runtime: switch to gcWork abstraction
This converts the garbage collector from directly manipulating work
buffers to using the new gcWork abstraction.

The previous management of work buffers was rather ad hoc.  As a
result, switching to the gcWork abstraction changes many details of
work buffer management.

If greyobject fills a work buffer, it can now pull from work.partial
in addition to work.empty.

Previously, gcDrain started with a partial or empty work buffer and
fetched an empty work buffer if it filled its current buffer (in
greyobject).  Now, gcDrain starts with a full work buffer and fetches
an partial or empty work buffer if it fills its current buffer (in
greyobject).  The original behavior was bad because gcDrain would
immediately drop the empty work buffer returned by greyobject and
fetch a full work buffer, which greyobject was likely to immediately
overflow, fetching another empty work buffer, etc.  The new behavior
isn't great at the start because greyobject is likely to immediately
overflow the full buffer, but the steady-state behavior should be more
stable.  Both before and after this change, gcDrain fetches a full
work buffer if it drains its current buffer.  Basically all of these
choices are bad; the right answer is to use a dual work buffer scheme.

Previously, shade always fetched a work buffer (though usually from
m.currentwbuf), even if the object was already marked.  Now it only
fetches a work buffer if it actually greys an object.

Change-Id: I8b880ed660eb63135236fa5d5678f0c1c041881f
Reviewed-on: https://go-review.googlesource.com/5232
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-02-19 16:59:34 +00:00
Austin Clements
b30d19de59 runtime: introduce higher-level GC work abstraction
This introduces a producer/consumer abstraction for GC work pointers
that internally handles the details of filling, draining, and
shuffling work buffers.

In addition to simplifying the GC code, this should make it easy for
us to change how we use work buffers, including cleaning up how we use
the work.partial queue, reintroducing a FIFO lookahead cache, adding
prefetching, and using dual buffers to avoid flapping.

This commit doesn't change any existing code.  The following commit
will switch the garbage collector from explicit workbuf manipulation
to gcWork.

Change-Id: Ifbfe5fff45bf0362d6d7c3cecb061f0c9874077d
Reviewed-on: https://go-review.googlesource.com/5231
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-02-19 16:59:26 +00:00
Austin Clements
1ae124b5ff runtime: make gcDrainN take an int instead of uintptr
Nit.  There's no reason to take a uintptr and doing so just requires
casts in annoying places.

Change-Id: Ifeb9638c6d94eae619c490930cf724cc315680ba
Reviewed-on: https://go-review.googlesource.com/5230
Reviewed-by: Russ Cox <rsc@golang.org>
2015-02-19 02:47:45 +00:00
Austin Clements
6e5cc1f1ac runtime: rename drainworkbuf and drainobjects
drainworkbuf is now gcDrain, since it drains until there's
nothing left to drain.  drainobjects is now gcDrainN because it's
the bounded equivalent to gcDrain.

The new names use the Go camel case convention because we have to
start somewhere.  The "gc" prefix is because we don't have runtime
packages yet and just "drain" is too ambiguous.

Change-Id: I88dbdf32e8ce4ce6c3b7e1f234664be9b76cb8fd
Reviewed-on: https://go-review.googlesource.com/4785
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-02-13 15:34:55 +00:00
Austin Clements
60a16ea367 runtime: remove drainallwbufs argument to drainworkbuf
All calls to drainworkbuf now pass true for this argument, so remove
the argument and update the documentation to reflect the simplified
interface.

At a higher level, there are no longer any situations where we drain
"one wbuf" (though drainworkbuf didn't guarantee this anyway).  We
either drain everything, or we drain a specific number of objects.

Change-Id: Ib7ee0fde56577eff64232ee1e711ec57c4361335
Reviewed-on: https://go-review.googlesource.com/4784
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-02-13 15:34:49 +00:00
Austin Clements
15c9a2ef4e runtime: eliminate drainworkbufs from scanblock
scanblock is only called during _GCscan and _GCmarktermination.
During _GCscan, scanblock didn't call drainworkbufs anyway.  During
_GCmarktermination, there's really no point in draining some (largely
arbitrary) amount of work during the scanblock, since the GC is about
to drain everything anyway, so simply eliminate this case.

Change-Id: I7f3c59ce9186a83037c6f9e9b143181acd04c597
Reviewed-on: https://go-review.googlesource.com/4783
Reviewed-by: Russ Cox <rsc@golang.org>
2015-02-13 15:34:39 +00:00
Austin Clements
1ac65f82ad runtime: eliminate b == 0 special case from scanblock
We no longer ever call scanblock with b == 0.

Change-Id: I9b01da39595e0cc251668c24d58748d88f5f0792
Reviewed-on: https://go-review.googlesource.com/4782
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-02-13 15:34:34 +00:00
Austin Clements
cf964e1653 runtime: replace scanblock(0, 0, nil, nil) with drainworkbuf
scanblock(0, 0, nil, nil) was just a confusing way of saying

  wbuf = getpartialorempty()
  drainworkbuf(wbuf, true)

Make drainworkbuf accept a nil workbuf and perform the
getpartialorempty itself and replace all uses of scanblock(0, 0, nil,
nil) with direct calls to drainworkbuf(nil, true).

Change-Id: I7002a2f8f3eaf6aa85bbf17ccc81d7288acfef1c
Reviewed-on: https://go-review.googlesource.com/4781
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-02-13 15:34:24 +00:00