Currently bulkBarrierPreWrite calls writebarrierptr_prewrite, but this
means that we check writeBarrier.needed twice and perform cgo checks
twice.
Change bulkBarrierPreWrite to call writebarrierptr_prewrite1 to skip
over these duplicate checks.
This may speed up bulkBarrierPreWrite slightly, but mostly this will
save us from running out of nosplit stack space on ppc64x in the near
future.
Updates #17503.
Change-Id: I1cea1a2207e884ab1a279c6a5e378dcdc048b63e
Reviewed-on: https://go-review.googlesource.com/31890
Reviewed-by: Rick Hudson <rlh@golang.org>
gobuf.ctxt is set to nil from many places in assembly code and these
assignments require write barriers with the hybrid barrier.
Conveniently, in most of these places ctxt should already be nil, in
which case we don't need the barrier. This commit changes these places
to assert that ctxt is already nil.
gogo is more complicated, since ctxt may not already be nil. For gogo,
we manually perform the write barrier if ctxt is not nil.
Updates #17503.
Change-Id: I9d75e27c75a1b7f8b715ad112fc5d45ffa856d30
Reviewed-on: https://go-review.googlesource.com/31764
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Currently, we perform write barriers after performing pointer writes.
At the moment, it simply doesn't matter what order this happens in, as
long as they appear atomic to GC. But both the hybrid barrier and ROC
are going to require a pre-write write barrier.
For the hybrid barrier, this is important because the barrier needs to
observe both the current value of the slot and the value that will be
written to it. (Alternatively, the caller could do the write and pass
in the old value, but it seems easier and more useful to just swap the
order of the barrier and the write.)
For ROC, this is necessary because, if the pointer write is going to
make the pointer reachable to some goroutine that it currently is not
visible to, the garbage collector must take some special action before
that pointer becomes more broadly visible.
This commits swaps pointer writes around so the write barrier occurs
before the pointer write.
The main subtlety here is bulk memory writes. Currently, these copy to
the destination first and then use the pointer bitmap of the
destination to find the copied pointers and invoke the write barrier.
This is necessary because the source may not have a pointer bitmap. To
handle these, we pass both the source and the destination to the bulk
memory barrier, which uses the pointer bitmap of the destination, but
reads the pointer values from the source.
Updates #17503.
Change-Id: I78ecc0c5c94ee81c29019c305b3d232069294a55
Reviewed-on: https://go-review.googlesource.com/31763
Reviewed-by: Rick Hudson <rlh@golang.org>
Apparently on macOS Sierra LLDB thinks /usr/lib/dyld is mapped
at address 0, even if Go code starts at 0x1000, and it looks up
addresses from dyld which shadows Go symbols. Move Go binary at
a higher address to avoid clash.
Fixes#17463. Re-enable TestLldbPython.
Change-Id: I89ca6f3ee48aa6da9862bfa0c2da91477cc93255
Reviewed-on: https://go-review.googlesource.com/32185
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Quentin Smith <quentin@golang.org>
As for dropg, save is writing a nil pointer that will generate a write
barrier with the hybrid barrier. However, in this case, ctxt always
should already be nil, so replace the write with an assertion that
this is the case.
At this point, we're ready to disable the write barrier elision
optimizations that interfere with the hybrid barrier.
Updates #17503.
Change-Id: I83208e65aa33403d442401f355b2e013ab9a50e9
Reviewed-on: https://go-review.googlesource.com/31571
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently this contains no write barriers because it's writing nil
pointers, but with the hybrid barrier, even these will produce write
barriers. However, since these are *gs and *ms, they don't need write
barriers, so we can simply eliminate them.
Updates #17503.
Change-Id: Ib188a60492c5cfb352814bf9b2bcb2941fb7d6c0
Reviewed-on: https://go-review.googlesource.com/31570
Reviewed-by: Rick Hudson <rlh@golang.org>
The hybrid barrier requires allocate-black, but there's one case where
we don't currently allocate black: the tiny allocator. If we allocate
a *new* tiny alloc block during GC, it will be allocated black, but if
we allocated the current block before GC, it won't be black, and the
further allocations from it won't mark it, which means we may free a
reachable tiny block during sweeping.
Fix this by passing over all mcaches at the beginning of mark, while
the world is still stopped, and greying their tiny blocks.
Updates #17503.
Change-Id: I04d4df7cc2f553f8f7b1e4cb0b52e2946588111a
Reviewed-on: https://go-review.googlesource.com/31456
Reviewed-by: Rick Hudson <rlh@golang.org>
The hybrid barrier requires barriers on stack-to-stack copies if
either stack is grey. There are only two instances of this in the
runtime: channel sends and starting a goroutine. Channel sends already
use typedmemmove and hence have the necessary barriers. This commits
adds barriers for the stack-to-stack copy when starting a goroutine.
Updates #17503.
Change-Id: Ibb55e08127ca4d021ac54be61cb96732efa5df5b
Reviewed-on: https://go-review.googlesource.com/31455
Reviewed-by: Rick Hudson <rlh@golang.org>
Original Change by Daria Kolistratova <daria.kolistratova@intel.com>
Added functions with suffix proto and stuff from pprof tool to translate
to protobuf. Done as the profile proto is more extensible than the legacy
pprof format and is pprof's preferred profile format. Large part was taken
from https://github.com/google/pprof tool. Tested by hand and compared the
result with translated by pprof tool, profiles are identical.
Fixes#16093
Change-Id: I2751345b09a66ee2b6aa64be76cba4cd1c326aa6
Reviewed-on: https://go-review.googlesource.com/32257
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
Waiting 2ms for all the kicked-off goroutines to run and block
seems a little optimistic. No harm done by waiting for 200ms instead.
Fixes#17238.
Change-Id: I827532ea2f5f1f3ed04179f8957dd2c563946ed0
Reviewed-on: https://go-review.googlesource.com/32103
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently we initialize LR on a new stack by writing nil to it. But
this is an initializing write since the newly allocated stack is not
zeroed, so this is unsafe with the hybrid barrier. Change this is a
uintptr write to avoid a bad write barrier.
Updates #17503.
Change-Id: I062ac352e35df7da4644c1f2a5aaab87049d1f60
Reviewed-on: https://go-review.googlesource.com/32093
Reviewed-by: Rick Hudson <rlh@golang.org>
We reuse finalizers in finblocks, which are allocated off-heap. This
means they have to be zero-initialized before becoming visible to the
garbage collector. We actually already do this by clearing the
finalizer before returning it to the pool, but we're not careful to
enforce correct memory ordering. Fix this by manipulating the
finalizer count atomically so these writes synchronize properly with
the garbage collector.
Updates #17503.
Change-Id: I7797d31df3c656c9fe654bc6da287f66a9e2037d
Reviewed-on: https://go-review.googlesource.com/31454
Reviewed-by: Rick Hudson <rlh@golang.org>
runfinq allocates a stack frame on the heap for constructing the
finalizer function calls and reuses it for each call. However, because
the type of this frame is constantly shifting, it tells mallocgc there
are no pointers in it and it acts essentially like uninitialized
memory between uses. But runfinq uses pointer writes with write
barriers to "initialize" this memory, which is not going to be safe
with the hybrid barrier, since the hybrid barrier may see a stale
pointer left in the "uninitialized" frame.
Fix this by zero-initializing the argument values in the frame before
writing the argument pointers.
Updates #17503.
Change-Id: I951c0a2be427eb9082a32d65c4410e6fdef041be
Reviewed-on: https://go-review.googlesource.com/31453
Reviewed-by: Rick Hudson <rlh@golang.org>
Since barrier-less memclr is only safe in very narrow circumstances,
this commit renames memclr to avoid accidentally calling memclr on
typed memory. This can cause subtle, non-deterministic bugs, so it's
worth some effort to prevent. In the near term, this will also prevent
bugs creeping in from any concurrent CLs that add calls to memclr; if
this happens, whichever patch hits master second will fail to compile.
This also adds the other new memclr variants to the compiler's
builtin.go to minimize the churn on that binary blob. We'll use these
in future commits.
Updates #17503.
Change-Id: I00eead049f5bd35ca107ea525966831f3d1ed9ca
Reviewed-on: https://go-review.googlesource.com/31369
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently fixalloc does not zero memory it reuses. This is dangerous
with the hybrid barrier if the type may contain heap pointers, since
it may cause us to observe a dead heap pointer on reuse. It's also
error-prone since it's the only allocator that doesn't zero on
allocation (mallocgc of course zeroes, but so do persistentalloc and
sysAlloc). It's also largely pointless: for mcache, the caller
immediately memclrs the allocation; and the two specials types are
tiny so there's no real cost to zeroing them.
Change fixalloc to zero allocations by default.
The only type we don't zero by default is mspan. This actually
requires that the spsn's sweepgen survive across freeing and
reallocating a span. If we were to zero it, the following race would
be possible:
1. The current sweepgen is 2. Span s is on the unswept list.
2. Direct sweeping sweeps span s, finds it's all free, and releases s
to the fixalloc.
3. Thread 1 allocates s from fixalloc. Suppose this zeros s, including
s.sweepgen.
4. Thread 1 calls s.init, which sets s.state to _MSpanDead.
5. On thread 2, background sweeping comes across span s in allspans
and cas's s.sweepgen from 0 (sg-2) to 1 (sg-1). Now it thinks it
owns it for sweeping. 6. Thread 1 continues initializing s.
Everything breaks.
I would like to fix this because it's obviously confusing, but it's a
subtle enough problem that I'm leaving it alone for now. The solution
may be to skip sweepgen 0, but then we have to think about wrap-around
much more carefully.
Updates #17503.
Change-Id: Ie08691feed3abbb06a31381b94beb0a2e36a0613
Reviewed-on: https://go-review.googlesource.com/31368
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently the zero value of an mspan is in the "in use" state. This
seems like a bad idea in general. But it's going to wreak havoc when
we make fixalloc zero allocations: even "freed" mspan objects are
still on the allspans list and still get looked at by the garbage
collector. Hence, if we leave the mspan states the way they are,
allocating a span that reuses old memory will temporarily pass that
span (which is visible to GC!) through the "in use" state, which can
cause "unswept span" panics.
Fix all of this by making the zero state "dead".
Updates #17503.
Change-Id: I77c7ac06e297af4b9e6258bc091c37abe102acc3
Reviewed-on: https://go-review.googlesource.com/31367
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The hybrid barrier requires distinguishing typed and untyped memory
even when zeroing because the *current* contents of the memory matters
even when overwriting.
This commit introduces runtime.typedmemclr and runtime.memclrHasPointers
as a typed memory clearing functions parallel to runtime.typedmemmove.
Currently these simply call memclr, but with the hybrid barrier we'll
need to shade any pointers we're overwriting. These will provide us
with the necessary hooks to do so.
Updates #17503.
Change-Id: I74478619f8907825898092aaa204d6e4690f27e6
Reviewed-on: https://go-review.googlesource.com/31366
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently all mcaches are flushed in a single STW root job. This takes
about 5 µs per P, but since it's done sequentially it adds about
5*GOMAXPROCS µs to the STW.
Fix this by parallelizing the job. Since there are exactly GOMAXPROCS
mcaches to flush, this parallelizes quite nicely and brings the STW
latency cost down to a constant 5 µs (assuming GOMAXPROCS actually
reflects the number of CPUs).
Updates #17503.
Change-Id: Ibefeb1c2229975d5137c6e67fac3b6c92103742d
Reviewed-on: https://go-review.googlesource.com/32033
Reviewed-by: Rick Hudson <rlh@golang.org>
Added functions with suffix proto and stuff from pprof tool to translate
to protobuf. Done as the profile proto is more extensible than the legacy
pprof format and is pprof's preferred profile format. Large part was taken
from https://github.com/google/pprof tool. Tested by hand and compared the
result with translated by pprof tool, profiles are identical.
Fixes#16093
Change-Id: I5acdb2809cab0d16ed4694fdaa7b8ddfd68df11e
Reviewed-on: https://go-review.googlesource.com/30556
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
The current logic in gcDrain conflates non-blocking with preemptible
draining for root jobs. As a result, if you do a non-blocking (but
*not* preemptible) drain, like dedicated workers do, the root job
drain will stop if preempted and fall through to heap marking jobs,
which won't stop until it fails to get a heap marking job.
This commit fixes the condition on root marking jobs so they only stop
when preempted if the drain is preemptible.
Coincidentally, this also fixes a nil pointer dereference if we call
gcDrain with gcDrainNoBlock and without a user G, since it tries to
get the preempt flag from the nil user G. This combination never
happens right now, but will in the future.
Change-Id: Ia910ec20a9b46237f7926969144a33b1b4a7b2f9
Reviewed-on: https://go-review.googlesource.com/32291
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
This adds support to the runtime/trace test for saving traces
collected by its tests to disk and a script in internal/trace that
uses this to collect canned traces for the trace test suite. This can
be used to add to the test suite when we introduce a new trace format
version.
Change-Id: Id9ac1ff312235bf02f982fdfff8a827f54035758
Reviewed-on: https://go-review.googlesource.com/32290
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Currently when a goroutine blocks on a GC assist, it emits a generic
EvGoBlock event. Since assist blocking events and, in particular, the
length of the blocked assist queue, are important for diagnosing GC
behavior, this commit adds a new EvGoBlockGC event for blocking on a
GC assist. The trace viewer uses this event to report a "waiting on
GC" count in the "Goroutines" row. This makes sense because, unlike
other blocked goroutines, these goroutines do have work to do, so
being blocked on a GC assist is quite similar to being in the
"runnable" state, which we also report in the trace viewer.
Change-Id: Ic21a326992606b121ea3d3d00110d8d1fdc7a5ef
Reviewed-on: https://go-review.googlesource.com/30704
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Currently mark workers are shown in the trace as regular goroutines
labeled "runtime.gcBgMarkWorker". That's somewhat unhelpful to an end
user because of the opaque label and particularly unhelpful to runtime
developers because it doesn't distinguish the different types of mark
workers.
Fix this by introducing a variant of the GoStart event called
GoStartLabel that lets the runtime indicate a label for a goroutine
execution span and using this to label mark worker executions as "GC
(<mode>)" in the trace viewer.
Since this bumps the trace version to 1.8, we also add test data for
1.7 traces.
Change-Id: Id7b9c0536508430c661ffb9e40e436f3901ca121
Reviewed-on: https://go-review.googlesource.com/30702
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Currently, gcDrain looks for the preemption flag at getg().preempt.
However, commit d6625ca moved mark worker draining to the system
stack, which means getg() returns the g0, which never has the preempt
flag set, so idle and fractional workers don't get preempted after
10ms and just run until they run out of work. As a result, if there's
enough idle time, GC becomes effectively STW.
Fix this by looking for the preemption flag on getg().m.curg, which
will always be the user G (where the preempt flag is set), regardless
of whether gcDrain is running on the user or the g0 stack.
Change-Id: Ib554cf49a705b86ccc3d08940bc869f868c50dd2
Reviewed-on: https://go-review.googlesource.com/32251
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
runtime.SetMutexProfileFraction(n int) will capture 1/n-th of stack
traces of goroutines holding contended mutexes if n > 0. From runtime/pprof,
pprot.Lookup("mutex").WriteTo writes the accumulated
stack traces to w (in essentially the same format that blocking
profiling uses).
Change-Id: Ie0b54fa4226853d99aa42c14cb529ae586a8335a
Reviewed-on: https://go-review.googlesource.com/29650
Reviewed-by: Austin Clements <austin@google.com>
- removes the runtime function stringtoslicebytetmp
- removes the generation of calls to stringtoslicebytetmp from the frontend
- adds handling of OSTRARRAYBYTETMP in the backend
This reduces binary sizes and avoids function call overhead.
Change-Id: Ib9988d48549cee663b685b4897a483f94727b940
Reviewed-on: https://go-review.googlesource.com/32158
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
I do not know why it is included. All tests pass without it.
Change-Id: I839076ee131816dfd177570a902c69fe8fba5022
Reviewed-on: https://go-review.googlesource.com/32144
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Otherwise, the way the ELF dynamic linker works means that you can end up with
the same itab being passed to additab twice, leading to the itab linked list
having a cycle in it. Add a test to additab in runtime to catch this when it
happens, not some arbitrary and surprsing time later.
Fixes#17594
Change-Id: I6c82edcc9ac88ac188d1185370242dc92f46b1ad
Reviewed-on: https://go-review.googlesource.com/32131
Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Assembly copied from the clock_gettime(CLOCK_MONOTONIC)
call in runtime.nanotime in these files and then modified to use
CLOCK_REALTIME.
Also comment system call numbers in a few other files.
Fixes#11222.
Change-Id: Ie132086de7386f865908183aac2713f90fc73e0d
Reviewed-on: https://go-review.googlesource.com/32177
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Don't include package path when creating LSyms for auto and param
variables during Prog generation, and update the DWARF emit routine
accordingly (remove the code that chops off package path from names in
DWARF var location expressions). Implementation suggested by mdempsky@.
The intent of this change is to have saner location expressions in cases
where the variable corresponds to a structure field. For example, the
SSA compiler's "decompose" phase can take a slice value and break it
apart into three scalar variables corresponding to the fields (slice "X"
gets split into "X.len", "X.cap", "X.ptr"). In such cases we want the
name in the location expression to omit the package path but preserve
the original variable name (e.g. "X").
Fixes#16338
Change-Id: Ibc444e7f3454b70fc500a33f0397e669d127daa1
Reviewed-on: https://go-review.googlesource.com/31819
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Currently, markroot delays scanning mark worker stacks until mark
termination by putting the mark worker G directly on the rescan list
when it encounters one during the mark phase. Without this, since mark
workers are non-preemptible, two mark workers that attempt to scan
each other's stacks can deadlock.
However, this is annoyingly asymmetric and causes some real problems.
First, markroot does not own the G at that point, so it's not
technically safe to add it to the rescan list. I haven't been able to
find a specific problem this could cause, but I suspect it's the root
cause of issue #17099. Second, this will interfere with the hybrid
barrier, since there is no stack rescanning during mark termination
with the hybrid barrier.
This commit switches to a different approach. We move the mark
worker's call to gcDrain to the system stack and set the mark worker's
status to _Gwaiting for the duration of the drain to indicate that
it's preemptible. This lets another mark worker scan its G stack while
the drain is running on the system stack. We don't return to the G
stack until we can switch back to _Grunning, which ensures we don't
race with a stack scan. This lets us eliminate the special case for
mark worker stack scans and scan them just like any other goroutine.
The only subtlety to this approach is that we have to disable stack
shrinking for mark workers; they could be referring to captured
variables from the G stack, so it's not safe to move their stacks.
Updates #17099 and #17503.
Change-Id: Ia5213949ec470af63e24dfce01df357c12adbbea
Reviewed-on: https://go-review.googlesource.com/31820
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, if the number of stack barriers for a stack is 0, we'll
create a zero-length slice that points just past the end of the stack
allocation. This bad pointer causes GC panics.
Fix this by creating a nil slice if the stack barrier count is 0.
In practice, the only way this can happen is if
GODEBUG=gcstackbarrieroff=1 is set because even the minimum size stack
reserves space for two stack barriers.
Change-Id: I3527c9a504c445b64b81170ee285a28594e7983d
Reviewed-on: https://go-review.googlesource.com/31762
Reviewed-by: Rick Hudson <rlh@golang.org>
This adds debug code enabled in gccheckmark mode that panics if we
attempt to mark an unallocated object. This is a common issue with the
hybrid barrier when we're manipulating uninitialized memory that
contains stale pointers. This also tends to catch bugs that will lead
to "sweep increased allocation count" crashes closer to the source of
the bug.
Change-Id: I443ead3eac6f316a46f50b106078b524cac317f4
Reviewed-on: https://go-review.googlesource.com/31761
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently reflectcall has a subtle dance with write barriers where the
assembly code copies the result values from the stack to the in-heap
argument frame without write barriers and then calls into the runtime
after the fact to invoke the necessary write barriers.
For the hybrid barrier (and for ROC), we need to switch to a
*pre*-write write barrier, which is very difficult to do with the
current setup. We could tie ourselves in knots of subtle reasoning
about why it's okay in this particular case to have a post-write write
barrier, but this commit instead takes a different approach. Rather
than making things more complex, this simplifies reflection calls so
that the argument copy is done in Go using normal bulk write barriers.
The one difficulty with this approach is that calling into Go requires
putting arguments on the stack, but the call* functions "donate" their
entire stack frame to the called function. We can get away with this
now because the copy avoids using the stack and has copied the results
out before we clobber the stack frame to call into the write barrier.
The solution in this CL is to call another function, passing arguments
in registers instead of on the stack, and let that other function
reserve more stack space and setup the arguments for the runtime.
This approach seemed to work out the best. I also tried making the
call* functions reserve 32 extra bytes of frame for the write barrier
arguments and adjust SP up by 32 bytes around the call. However, even
with the necessary changes to the assembler to correct the spdelta
table, the runtime was still having trouble with the frame layout (and
the changes to the assembler caused many other things that do strange
things with the SP to fail to assemble). The approach I took doesn't
require any funny business with the SP.
Updates #17503.
Change-Id: Ie2bb0084b24d6cff38b5afb218b9e0534ad2119e
Reviewed-on: https://go-review.googlesource.com/31655
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Now that sweeping and span marking use the sweep list, there's no need
for the work.spans snapshot of the allspans list. This change
eliminates the few remaining uses of it, which are either dead code or
can use allspans directly, and removes work.spans and its support
functions.
Change-Id: Id5388b42b1e68e8baee853d8eafb8bb4ff95bb43
Reviewed-on: https://go-review.googlesource.com/30537
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently markrootSpans iterates over all spans ever allocated to find
the in-use spans. Since we now have a list of in-use spans, change it
to iterate over that instead.
This, combined with the previous change, fixes#9265. Before these two
changes, blowing up the heap to 8GB and then shrinking it to a 0MB
live set caused the small-heap portion of the test to run 60x slower
than without the initial blowup. With these two changes, the time is
indistinguishable.
No significant effect on other benchmarks.
Change-Id: I4a27e533efecfb5d18cba3a87c0181a81d0ddc1e
Reviewed-on: https://go-review.googlesource.com/30536
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently sweeping walks the list of all spans, which means the work
in sweeping is proportional to the maximum number of spans ever used.
If the heap was once large but is now small, this causes an
amortization failure: on a small heap, GCs happen frequently, but a
full sweep still has to happen in each GC cycle, which means we spent
a lot of time in sweeping.
Fix this by creating a separate list consisting of just the in-use
spans to be swept, so sweeping is proportional to the number of in-use
spans (which is proportional to the live heap). Specifically, we
create two lists: a list of unswept in-use spans and a list of swept
in-use spans. At the start of the sweep cycle, the swept list becomes
the unswept list and the new swept list is empty. Allocating a new
in-use span adds it to the swept list. Sweeping moves spans from the
unswept list to the swept list.
This fixes the amortization problem because a shrinking heap moves
spans off the unswept list without adding them to the swept list,
reducing the time required by the next sweep cycle.
Updates #9265. This fix eliminates almost all of the time spent in
sweepone; however, markrootSpans has essentially the same bug, so now
the test program from this issue spends all of its time in
markrootSpans.
No significant effect on other benchmarks.
Change-Id: Ib382e82790aad907da1c127e62b3ab45d7a4ac1e
Reviewed-on: https://go-review.googlesource.com/30535
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently we set the len and cap of h.spans to the full reserved
region of the address space and track the actual mapped region
separately in h.spans_mapped. Since we have both the len and cap at
our disposal, change things so len(h.spans) tracks how much of the
spans array is mapped and eliminate h.spans_mapped. This simplifies
mheap and means we'll get nice "index out of bounds" exceptions if we
do try to go off the end of the spans rather than a SIGSEGV.
Change-Id: I8ed9a1a9a844d90e9fd2e269add4704623dbdfe6
Reviewed-on: https://go-review.googlesource.com/30533
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Like h_allspans and mheap_.allspans, these were two ways of referring
to the spans array from when the runtime was split between C and Go.
Clean this up by making mheap_.spans a slice and eliminating h_spans.
Change-Id: I3aa7038d53c3a4252050aa33e468c48dfed0b70e
Reviewed-on: https://go-review.googlesource.com/30532
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This was necessary in the C days when allspans was an mspan**, but now
that allspans is a Go slice, this is redundant with len(allspans) and
we can use range loops over allspans.
Change-Id: Ie1dc39611e574e29a896e01690582933f4c5be7e
Reviewed-on: https://go-review.googlesource.com/30531
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
These are two ways to refer to the allspans array that hark back to
when the runtime was split between C and Go. Clean this up by making
mheap_.allspans a slice and eliminating h_allspans.
Change-Id: Ic9360d040cf3eb590b5dfbab0b82e8ace8525610
Reviewed-on: https://go-review.googlesource.com/30530
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>