The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.
This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:
$ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])')
$ go test go/doc -update
Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
It's possible for arena_start+MaxArena32 to wrap.
We do the right thing in the bounds check but not in the print.
For #13992 (to fix the print there, not the bug).
Change-Id: I4df845d0c03f0f35461b128e4f6765d3ccb71c6d
Reviewed-on: https://go-review.googlesource.com/18975
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Currently, we update memstats.heap_live from mcache.local_cachealloc
whenever we lock the heap (e.g., to obtain a fresh span or to release
an unused span). However, under the right circumstances,
local_cachealloc can accumulate allocations up to the size of
the *entire heap* without flushing them to heap_live. Specifically,
since span allocations from an mcentral don't lock the heap, if a
large number of pages are held in an mcentral and the application
continues to use and free objects of that size class (e.g., the
BinaryTree17 benchmark), local_cachealloc won't be flushed until the
mcentral runs out of spans.
This is a problem because, unlike many of the memory statistics that
are purely informative, heap_live is used to determine when the
garbage collector should start and how hard it should work.
This commit eliminates local_cachealloc, instead atomically updating
heap_live directly. To control contention, we do this only when
obtaining a span from an mcentral. Furthermore, we make heap_live
conservative: allocating a span assumes that all free slots in that
span will be used and accounts for these when the span is
allocated, *before* the objects themselves are. This is important
because 1) this triggers the GC earlier than necessary rather than
potentially too late and 2) this leads to a conservative GC rate
rather than a GC rate that is potentially too low.
Alternatively, we could have flushed local_cachealloc when it passed
some threshold, but this would require determining a threshold and
would cause heap_live to underestimate the true value rather than
overestimate.
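A minimal sketch of the accounting idea described above (heapLive and
spanCapacity are illustrative names, not the runtime's real identifiers):

    package pacing

    import "sync/atomic"

    // heapLive plays the role of memstats.heap_live.
    var heapLive uint64

    // creditSpan is called when a span is obtained from a central list.
    // It conservatively assumes every free slot in the span will be used,
    // crediting heapLive before the objects themselves are allocated.
    func creditSpan(spanCapacity uint64) {
        atomic.AddUint64(&heapLive, spanCapacity)
    }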
Fixes #12199.
name old time/op new time/op delta
BinaryTree17-12 2.88s ± 4% 2.88s ± 1% ~ (p=0.470 n=19+19)
Fannkuch11-12 2.48s ± 1% 2.48s ± 1% ~ (p=0.243 n=16+19)
FmtFprintfEmpty-12 50.9ns ± 2% 50.7ns ± 1% ~ (p=0.238 n=15+14)
FmtFprintfString-12 175ns ± 1% 171ns ± 1% -2.48% (p=0.000 n=18+18)
FmtFprintfInt-12 159ns ± 1% 158ns ± 1% -0.78% (p=0.000 n=19+18)
FmtFprintfIntInt-12 270ns ± 1% 265ns ± 2% -1.67% (p=0.000 n=18+18)
FmtFprintfPrefixedInt-12 235ns ± 1% 234ns ± 0% ~ (p=0.362 n=18+19)
FmtFprintfFloat-12 309ns ± 1% 308ns ± 1% -0.41% (p=0.001 n=18+19)
FmtManyArgs-12 1.10µs ± 1% 1.08µs ± 0% -1.96% (p=0.000 n=19+18)
GobDecode-12 7.81ms ± 1% 7.80ms ± 1% ~ (p=0.425 n=18+19)
GobEncode-12 6.53ms ± 1% 6.53ms ± 1% ~ (p=0.817 n=19+19)
Gzip-12 312ms ± 1% 312ms ± 2% ~ (p=0.967 n=19+20)
Gunzip-12 42.0ms ± 1% 41.9ms ± 1% ~ (p=0.172 n=19+19)
HTTPClientServer-12 63.7µs ± 1% 63.8µs ± 1% ~ (p=0.639 n=19+19)
JSONEncode-12 16.4ms ± 1% 16.4ms ± 1% ~ (p=0.954 n=19+19)
JSONDecode-12 58.5ms ± 1% 57.8ms ± 1% -1.27% (p=0.000 n=18+19)
Mandelbrot200-12 3.86ms ± 1% 3.88ms ± 0% +0.44% (p=0.000 n=18+18)
GoParse-12 3.67ms ± 2% 3.66ms ± 1% -0.52% (p=0.001 n=18+19)
RegexpMatchEasy0_32-12 100ns ± 1% 100ns ± 0% ~ (p=0.257 n=19+18)
RegexpMatchEasy0_1K-12 347ns ± 1% 347ns ± 1% ~ (p=0.527 n=18+18)
RegexpMatchEasy1_32-12 83.7ns ± 2% 83.1ns ± 2% ~ (p=0.096 n=18+19)
RegexpMatchEasy1_1K-12 509ns ± 1% 505ns ± 1% -0.75% (p=0.000 n=18+19)
RegexpMatchMedium_32-12 130ns ± 2% 129ns ± 1% ~ (p=0.962 n=20+20)
RegexpMatchMedium_1K-12 39.5µs ± 2% 39.4µs ± 1% ~ (p=0.376 n=20+19)
RegexpMatchHard_32-12 2.04µs ± 0% 2.04µs ± 1% ~ (p=0.195 n=18+17)
RegexpMatchHard_1K-12 61.4µs ± 1% 61.4µs ± 1% ~ (p=0.885 n=19+19)
Revcomp-12 540ms ± 2% 542ms ± 4% ~ (p=0.552 n=19+17)
Template-12 69.6ms ± 1% 71.2ms ± 1% +2.39% (p=0.000 n=20+20)
TimeParse-12 357ns ± 1% 357ns ± 1% ~ (p=0.883 n=18+20)
TimeFormat-12 379ns ± 1% 362ns ± 1% -4.53% (p=0.000 n=18+19)
[Geo mean] 62.0µs 61.8µs -0.44%
name old time/op new time/op delta
XBenchGarbage-12 5.89ms ± 2% 5.81ms ± 2% -1.41% (p=0.000 n=19+18)
Change-Id: I96b31cca6ae77c30693a891cff3fe663fa2447a0
Reviewed-on: https://go-review.googlesource.com/17748
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently, if an allocation is large enough that arena_end + size
overflows (which is not hard to do on 32-bit), we go ahead and call
sysReserve with the impossible base and length and depend on this to
either directly fail because the kernel can't possibly fulfill the
requested mapping (causing mheap.sysAlloc to return nil) or to succeed
with a mapping at some other address which will then be rejected as
outside the arena.
In order to make this less subtle, less dependent on the kernel
getting all of this right, and to eliminate the hopeless system call,
add an explicit overflow check.
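A hedged sketch of the check's shape (reserve stands in for sysReserve;
this is not the actual mheap.sysAlloc code):

    package arena

    import "unsafe"

    // reserve is a stand-in for sysReserve.
    func reserve(addr, n uintptr) unsafe.Pointer { return nil }

    // extend fails fast when end+size would wrap the address space,
    // instead of issuing a reservation request the kernel cannot satisfy.
    func extend(end, size uintptr) unsafe.Pointer {
        if end+size < end {
            return nil // overflow: impossible mapping, skip the syscall
        }
        return reserve(end, size)
    }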
Updates #13143. The real issue has been fixed by 0de59c2, but this is
a belt-and-suspenders improvement on top of that. It was uncovered by
my symbolic modeling of that bug.
Change-Id: I85fa868a33286fdcc23cdd7cdf86b19abf1cb2d1
Reviewed-on: https://go-review.googlesource.com/16961
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
mcache.tiny is in non-GC'd memory, but points to heap memory. As a
result, there may or may not be write barriers when writing to
mcache.tiny. Make it clearer that funny things are going on by making
mcache.tiny a uintptr instead of an unsafe.Pointer.
Change-Id: I732a5b7ea17162f196a9155154bbaff8d4d00eac
Reviewed-on: https://go-review.googlesource.com/16963
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In mheap.sysAlloc, if an allocation at arena_used would exceed
arena_end (but wouldn't yet push us past arena_start+_MaxArena32), it
tries to extend the arena reservation by another 256 MB. It extends the
arena by calling sysReserve, which, on 32-bit, calls mmap without
MAP_FIXED, which means the address is just a hint and the kernel can
put the mapping wherever it wants. In particular, mmap may choose an
address below arena_start (the kernel also chose arena_start, so there
could be lots of space below it). Currently, we don't detect this case
and, if it happens, mheap.sysAlloc will corrupt arena_end and
arena_used and then return the low pointer to mheap.grow, which will crash
when it attempts to index into h_spans with an underflowed index.
Fix this by checking not only that p+p_size isn't too high, but also
that p isn't too low.
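A sketch of the strengthened check (arenaStart and maxArena are
illustrative names):

    package arena

    // usable reports whether a kernel-chosen reservation [p, p+size) lies
    // entirely within the arena: it must not start below arena_start and
    // must not end past arena_start+_MaxArena32.
    func usable(p, size, arenaStart, maxArena uintptr) bool {
        if p < arenaStart {
            return false // mapping placed below the arena
        }
        if p+size > arenaStart+maxArena {
            return false // mapping ends beyond the usable arena
        }
        return true
    }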
Fixes #13143.
Change-Id: I8d0f42bd1484460282a83c6f1a6f8f0df7fb2048
Reviewed-on: https://go-review.googlesource.com/16927
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
If the area returned by sysReserve in mheap.sysAlloc is outside the
usable arena, we sysFree it. We pass a fake stat pointer to sysFree
because we haven't added the allocation to any stat at that point.
However, we pass a 0 stat, so sysFree panics when it decrements the
stat because the fake stat underflows.
Fix this by setting the fake stat to the allocation size.
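Roughly the shape of the fix, with sysFreeStub standing in for the real
sysFree bookkeeping:

    package arena

    import "unsafe"

    // sysFreeStub mimics the relevant part of sysFree: it subtracts the
    // freed size from the supplied memory statistic.
    func sysFreeStub(v unsafe.Pointer, n uintptr, stat *uint64) {
        *stat -= uint64(n) // wraps here; the runtime detects the underflow and panics
    }

    // discard releases a reservation that was never added to any stat.
    // Priming the fake stat with the size makes the decrement net to zero.
    func discard(v unsafe.Pointer, n uintptr) {
        fakeStat := uint64(n) // was 0 before the fix
        sysFreeStub(v, n, &fakeStat)
    }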
Updates #13143 (this is a prerequisite to fixing that bug).
Change-Id: I61a6c9be19ac1c95863cf6a8435e19790c8bfc9a
Reviewed-on: https://go-review.googlesource.com/16926
Reviewed-by: Ian Lance Taylor <iant@golang.org>
runtime/internal/sys will hold system-, architecture- and config-
specific constants.
Updates #11647
Change-Id: I6db29c312556087a42e8d2bdd9af40d157c56b54
Reviewed-on: https://go-review.googlesource.com/16817
Reviewed-by: Russ Cox <rsc@golang.org>
Applies to types fixAlloc, mCache, mCentral, mHeap, mSpan, and
mSpanList.
Two special cases:
1. mHeap_Scavenge() previously didn't take an *mheap parameter, so it
was specially handled in this CL.
2. mHeap_Free() would have collided with mheap's "free" field, so it's
been renamed to (*mheap).freeSpan to parallel its underlying
(*mheap).freeSpanLocked method.
Change-Id: I325938554cca432c166fe9d9d689af2bbd68de4b
Reviewed-on: https://go-review.googlesource.com/16221
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently mallocgc detects if the GC is in a state where it can't
assist, but also can't allocate uncontrolled and yields to help out
the GC. This was a workaround for periods when we were trying to
schedule the GC coordinator. It is no longer necessary because there
is no GC coordinator and malloc can always assist with any GC
transitions that are necessary.
Updates #11970.
Change-Id: I4f7beb7013e85e50ae99a3a8b0bb708ba49cbcd4
Reviewed-on: https://go-review.googlesource.com/16392
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This moves all of the mark 1 to mark 2 transition and mark termination
to the mark done transition function. This means these transitions are
now handled on the goroutine that detected mark completion. This also
means that the GC coordinator and the background completion barriers
are no longer used and various workarounds to yield to the coordinator
are no longer necessary. These will be removed in follow-up commits.
One consequence of this is that mark workers now need to be
preemptible when performing the mark done transition. This allows them
to stop the world and to perform the final clean-up steps of GC after
restarting the world. They are only made preemptible while performing
this transition, so if the worker that findRunnableGCWorker would have
scheduled isn't available, it's one we didn't want to schedule anyway.
Fixes #11970.
Change-Id: I9203a2d6287eeff62d589ec02ad9cb1e29ddb837
Reviewed-on: https://go-review.googlesource.com/16391
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This moves all of GC initialization, sweep termination, and the
transition to concurrent marking into the off->mark transition
function. This means it's now handled on the goroutine that detected
the state exit condition.
As a result, malloc no longer needs to Gosched() at the beginning of
the GC cycle to prevent over-allocation while the GC is starting up
because it will now *help* the GC to start up. The Gosched hack is
still necessary during GC shutdown (this is easy to test by enabling
gctrace and hitting Ctrl-S to block the gctrace output).
At this point, the GC coordinator still handles later phases. This
requires a small tweak to how we start the GC coordinator. Currently,
starting the GC coordinator is best-effort and may fail if the
coordinator is about to park from the previous cycle but hasn't yet.
We fix this by replacing the park/ready to wake up the coordinator
with a semaphore. This is temporary since the coordinator will be
going away in a few commits.
Updates #11970.
Change-Id: I2c6a11c91e72dfbc59c2d8e7c66146dee9a444fe
Reviewed-on: https://go-review.googlesource.com/16357
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This begins the conversion of the centralized GC coordinator to a
decentralized state machine by introducing the internal API that
triggers the first state transition from _GCoff to _GCmark (or
_GCmarktermination).
This change introduces the transition lock, the off->mark transition
condition (which is very similar to shouldtriggergc()), and the
general structure of a state transition. Since we're doing this
conversion in stages, it then falls back to the GC coordinator to
actually execute the cycle. We'll start moving logic out of the GC
coordinator and into transition functions next.
This fixes a minor bug in gcstoptheworld debug mode where passing the
heap trigger once could trigger multiple STW GCs.
Updates #11970.
Change-Id: I964087dd190a639eb5766398f8e1bbf8b352902f
Reviewed-on: https://go-review.googlesource.com/16355
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
In the Go signal handler on Plan 9, when a signal with
the _SigThrow flag is received, we call startpanic before
printing the stack trace.
The startpanic function calls systemstack which calls
startpanic_m. In the startpanic_m function, we call
allocmcache to allocate _g_.m.mcache. The problem is
that allocmcache calls nextSample, which does a floating
point operation to return a sampling point for heap profiling.
However, Plan 9 doesn't support floating point in the
signal handler.
This change adds a new function nextSampleNoFP, only
called when in the Plan 9 signal handler, which is
similar to nextSample, but avoids floating point.
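A sketch of the integer-only idea: instead of drawing from an exponential
distribution (which needs floating point), pick a uniform value in
[0, 2*rate), which has the same mean. Names and bounds are illustrative.

    package profsample

    // nextSampleNoFP converts a uniform random draw r into the next heap
    // sampling point using only integer arithmetic; rate plays the role
    // of MemProfileRate.
    func nextSampleNoFP(r uint32, rate int32) int32 {
        if rate > 0x3fffffff {
            rate = 0x3fffffff // keep 2*rate within int32 range
        }
        if rate <= 0 {
            return 0
        }
        return int32(r % uint32(2*rate))
    }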
Change-Id: Iaa30437aa0f7c8c84d40afbab7567ad3bd5ea2de
Reviewed-on: https://go-review.googlesource.com/16307
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
arena_{start,used,end} are already uintptr, so no need to convert them
to uintptr, much less to convert them to unsafe.Pointer and then to
uintptr. No binary change to pkg/linux_amd64/runtime.a.
Change-Id: Ia4232ed2a724c44fde7eba403c5fe8e6dccaa879
Reviewed-on: https://go-review.googlesource.com/16339
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
When a new tiny block is allocated because we're allocating an object
that won't fit into the current block, mallocgc saves the new block if
it has more space leftover than the old block. However, the logic for
this was subtly broken in golang.org/cl/2814, resulting in never
saving (or consequently reusing) a tiny block.
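The intended comparison, sketched with illustrative names (this is not
the exact mallocgc code):

    package tinyalloc

    const maxTinySize = 16

    type cache struct {
        tiny       uintptr // address of the current tiny block
        tinyoffset uintptr // bytes already used in it
    }

    // maybeKeep replaces the cached tiny block with a freshly allocated
    // one if the new block has more space left over after this allocation.
    func (c *cache) maybeKeep(newBlock, used uintptr) {
        if maxTinySize-used > maxTinySize-c.tinyoffset {
            c.tiny = newBlock
            c.tinyoffset = used
        }
    }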
Change-Id: Ib5f6769451fb82877ddeefe75dfe79ed4a04fd40
Reviewed-on: https://go-review.googlesource.com/16330
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Add explicit memory sanitizer instrumentation to the runtime and syscall
packages. The compiler does not instrument the runtime package. It
does instrument the syscall package, but we need to add a couple of
cases that it can't see.
Change-Id: I2d66073f713fe67e33a6720460d2bb8f72f31394
Reviewed-on: https://go-review.googlesource.com/16164
Reviewed-by: David Crawshaw <crawshaw@golang.org>
This isn't C anymore. No binary change to pkg/linux_amd64/runtime.a.
Change-Id: I24d66b0f5ac888f432b874aac684b1395e7c8345
Reviewed-on: https://go-review.googlesource.com/15903
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Currently, when the mutator allocates, the runtime first allocates the
memory and then, if that G has done "enough" allocation, the runtime
checks whether the G has assist debt to pay off and, if so, pays it
off. This approach leads to under-assisting, where a G can allocate a
large region (or many small regions) before paying for it, or can even
exit with outstanding debt.
This commit flips this around so that a G always acquires enough
credit for an allocation before it can perform that allocation. We
continue to amortize the cost of assists by requiring that they
over-assist when triggered to build up credit for many allocations.
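A sketch of the pay-before-allocating flow; gcAssistBytes, assist, and
the looping structure here are illustrative, not the runtime's exact code:

    package gcassist

    // g carries a signed byte credit; allocations spend it and assists
    // replenish it.
    type g struct {
        gcAssistBytes int64
    }

    // allocate requires the G to hold enough credit before the allocation
    // proceeds. assist is assumed to perform GC work and return the
    // credit earned (positive), typically over-assisting to amortize cost.
    func (gp *g) allocate(size int64, assist func() int64) {
        gp.gcAssistBytes -= size
        for gp.gcAssistBytes < 0 {
            gp.gcAssistBytes += assist()
        }
        // ... perform the actual allocation ...
    }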
Fixes #11967.
Change-Id: Idac9f11133b328535667674d837be72c23ebd899
Reviewed-on: https://go-review.googlesource.com/15409
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
The current heap sampling introduces some bias that interferes
with unsampling, producing unexpected heap profiles.
The solution is to use a Poisson process to generate the
sampling points, using the formulas described at
https://en.wikipedia.org/wiki/Poisson_process
This fixes #12620.
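A sketch of the scheme, assuming meanBytes plays the role of
MemProfileRate: distances between sampling points are drawn from an
exponential distribution, so the sample points form a Poisson process
over allocated bytes.

    package profsample

    import (
        "math"
        "math/rand"
    )

    // nextSample returns the number of bytes to allocate before taking
    // the next heap profile sample.
    func nextSample(meanBytes float64) int64 {
        u := rand.Float64()
        for u == 0 {
            u = rand.Float64() // avoid log(0)
        }
        return int64(-math.Log(u) * meanBytes)
    }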
Change-Id: If2400809ed3c41de504dd6cff06be14e476ff96c
Reviewed-on: https://go-review.googlesource.com/14590
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
sysReserve will return nil on failure - correctly handle this case and return
nil to the caller. Currently, a failure will result in h.arena_end being set
to psize, h.arena_used being set to zero and fun times ensue.
On the openbsd/arm builder this has resulted in:
runtime: address space conflict: map(0x0) = 0x40946000
fatal error: runtime: address space conflict
When it should be reporting out of memory instead.
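The shape of the fix, sketched (reserve stands in for sysReserve):

    package arena

    import "unsafe"

    func reserve(addr, n uintptr) unsafe.Pointer { return nil } // stand-in

    // grow extends the heap reservation. On failure it returns nil so the
    // caller reports out of memory, rather than updating arena_end and
    // arena_used from a nil mapping.
    func grow(addr, n uintptr) unsafe.Pointer {
        p := reserve(addr, n)
        if p == nil {
            return nil
        }
        // ... only now adjust arena_end / arena_used ...
        return p
    }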
Change-Id: Iba828d5ee48ee1946de75eba409e0cfb04f089d4
Reviewed-on: https://go-review.googlesource.com/15056
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
88e945f introduced a non-speculative double check of the heap trigger
before actually starting a concurrent GC. This was necessary to fix a
race for heap-triggered GC, but broke sysmon-triggered periodic GC,
since the heap check will of course fail for periodically triggered
GC.
Fix this by telling startGC whether or not this GC was triggered by
heap size or a timer and only doing the heap size double check for GCs
triggered by heap size.
Fixes #12026.
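A sketch of the distinction (forceTrigger and the surrounding names are
illustrative):

    package gctrigger

    // shouldStart re-checks the heap trigger only for heap-size-triggered
    // GCs; periodic (sysmon-forced) GCs skip the double check, since the
    // heap condition will of course not hold for them.
    func shouldStart(forceTrigger bool, heapLive, trigger uint64) bool {
        return forceTrigger || heapLive >= trigger
    }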
Change-Id: I7c3f6ec364545c36d619f2b4b3bf3b758e3bcbd6
Reviewed-on: https://go-review.googlesource.com/13168
Reviewed-by: Russ Cox <rsc@golang.org>
Proportional concurrent sweep is currently based on a ratio of spans
to be swept per bytes of object allocation. However, proportional
sweeping is performed during span allocation, not object allocation,
in order to minimize contention and overhead. Since objects are
allocated from spans after those spans are allocated, the system tends
to operate in debt, which means when the next GC cycle starts, there
is often sweep debt remaining, so GC has to finish the sweep, which
delays the start of the cycle and delays enabling mutator assists.
For example, it's quite likely that many Ps will simultaneously refill
their span caches immediately after a GC cycle (because GC flushes the
span caches), but at this point, there has been very little object
allocation since the end of GC, so very little sweeping is done. The
Ps then allocate objects from these cached spans, which drives up the
bytes of object allocation, but since these allocations are coming
from cached spans, nothing considers whether more sweeping has to
happen. If the sweep ratio is high enough (which can happen if the
next GC trigger is very close to the retained heap size), this can
easily represent a sweep debt of thousands of pages.
Fix this by making proportional sweep proportional to the number of
bytes of spans allocated, rather than the number of bytes of objects
allocated. Prior to allocating a span, both the small object path and
the large object path ensure credit for allocating that span, so the
system operates in the black, rather than in the red.
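A sketch of sweeping ahead of span allocation (pagesPerByte and
sweepOnePage are illustrative stand-ins for the sweep ratio and the
sweeper):

    package sweep

    // deductSweepCredit sweeps enough pages to cover an upcoming span
    // allocation of spanBytes, keeping the system "in the black".
    // sweepOnePage is assumed to sweep roughly one page and report
    // whether any unswept memory remained.
    func deductSweepCredit(spanBytes uintptr, pagesPerByte float64, sweepOnePage func() bool) {
        pagesOwed := pagesPerByte * float64(spanBytes)
        for swept := 0.0; swept < pagesOwed; swept++ {
            if !sweepOnePage() {
                return // sweeping already finished for this cycle
            }
        }
    }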
Combined with the previous commit, this should eliminate all sweeping
from GC start up. On the stress test in issue #11911, this reduces the
time spent sweeping during GC (and delaying start up) by several
orders of magnitude:
            mean    99%ile  max
pre fix     1 ms    11 ms   144 ms
post fix    270 ns  735 ns  916 ns
Updates #11911.
Change-Id: I89223712883954c9d6ec2a7a51ecb97172097df3
Reviewed-on: https://go-review.googlesource.com/13044
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently there are two sensitive periods during which a mutator can
allocate past the heap goal but mutator assists can't be enabled: 1)
at the beginning of GC between when the heap first passes the heap
trigger and sweep termination and 2) at the end of GC between mark
termination and when the background GC goroutine parks. During these
periods there's no back-pressure or safety net, so a rapidly
allocating mutator can allocate past the heap goal. This is
exacerbated if there are many goroutines because the GC coordinator is
scheduled as any other goroutine, so if it gets preempted during one
of these periods, it may stay preempted for a long period (10s or 100s
of milliseconds).
Normally the mutator does scan work to create back-pressure against
allocation, but there is no scan work during these periods. Hence, as
a fall back, if a mutator would assist but can't yet, simply yield the
CPU. This delays the mutator somewhat, but more importantly gives more
CPU time to the GC coordinator for it to complete the transition.
This is obviously a workaround. Issue #11970 suggests a far better but
far more invasive way to fix this.
Updates #11911. (This very nearly fixes the issue, but about once
every 15 minutes I get a GC cycle where the assists are enabled but
don't do enough work.)
Change-Id: I9768b79e3778abd3e06d306596c3bd77f65bf3f1
Reviewed-on: https://go-review.googlesource.com/13026
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The one in misc/makerelease/makerelease.go is particularly bad and
probably warrants rotating our keys.
I didn't update old weekly notes, and reverted some changes involving
test code for now, since we're late in the Go 1.5 freeze. Otherwise,
the rest are all auto-generated changes, and all manually reviewed.
Change-Id: Ia2753576ab5d64826a167d259f48a2f50508792d
Reviewed-on: https://go-review.googlesource.com/12048
Reviewed-by: Rob Pike <r@golang.org>
In order to avoid a race with a concurrent write barrier or garbage
collector thread, any update to arena_used must be preceded by mapping
the corresponding heap bitmap and spans array memory. Otherwise, the
concurrent access may observe that a pointer falls within the heap
arena, but then attempt to access unmapped memory to look up its span
or heap bits.
Commit d57c889 fixed all of the places where we updated arena_used
immediately before mapping the heap bitmap and spans, but it missed
the one place where we update arena_used and depend on later code to
update it again and map the bitmap and spans. This creates a window
where the original race can still happen. This commit fixes this by
mapping the heap bitmap and spans before this arena_used update as
well. This code path is only taken when expanding the heap reservation
on 32-bit over a hole in the address space, so these extra mmap calls
should have negligible impact.
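The required ordering, sketched (mapBits and mapSpans stand in for the
runtime's heap-bitmap and spans-array mapping; names are illustrative):

    package arena

    type heap struct {
        arenaUsed uintptr
    }

    // setArenaUsed maps the metadata covering the new range before
    // publishing the new arena_used, so a concurrent write barrier or GC
    // thread never looks up heap bits for unmapped memory.
    func (h *heap) setArenaUsed(newUsed uintptr, mapBits, mapSpans func(limit uintptr)) {
        mapBits(newUsed)
        mapSpans(newUsed)
        h.arenaUsed = newUsed
    }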
Fixes #10212, #11324.
Change-Id: Id67795e6c7563eb551873bc401e5cc997aaa2bd8
Reviewed-on: https://go-review.googlesource.com/11340
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Currently it's possible for the garbage collector to observe
uninitialized memory or stale heap bitmap bits on weakly ordered
architectures such as ARM and PPC. On such architectures, the stores
that zero newly allocated memory and initialize its heap bitmap may
move after a store in user code that makes the allocated object
observable by the garbage collector.
To fix this, add a "publication barrier" (also known as an "export
barrier") before returning from mallocgc. This is a store/store
barrier that ensures any write done by user code that makes the
returned object observable to the garbage collector will be ordered
after the initialization performed by mallocgc. No barrier is
necessary on the reading side because of the data dependency between
loading the pointer and loading the contents of the object.
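A sketch of where the barrier sits in the allocation path;
publicationBarrier is a real runtime-internal function, stubbed out
here, and the rest of the names are illustrative:

    package alloc

    import "unsafe"

    // publicationBarrier stands in for the runtime's store/store barrier
    // (a no-op on strongly ordered machines like x86).
    func publicationBarrier() {}

    // newObject allocates, zeroes, and initializes heap metadata for an
    // object, then issues the barrier so those stores cannot be reordered
    // after a user store that publishes the object to the GC.
    func newObject(initialize func(unsafe.Pointer)) unsafe.Pointer {
        x := unsafe.Pointer(new([64]byte)) // stand-in for the real allocation
        initialize(x)
        publicationBarrier()
        return x
    }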
Fixes one of the issues raised in #9984.
Change-Id: Ia3d96ad9c5fc7f4d342f5e05ec0ceae700cd17c8
Reviewed-on: https://go-review.googlesource.com/11083
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Martin Capitanio <capnm9@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Some latency regressions have crept into our system over the past few
weeks. This CL fixes those by having the mark phase more aggressively
blacken objects so that the mark termination phase, a STW phase, has less
work to do. Three approaches were taken when the mark phase believes
it has no more work to do, i.e., all the work buffers are empty.
If things have gone well the mark phase is correct and there is
in fact little or no work. In that case the following items will
take very little time. If the mark phase is wrong this CL will
ferret that work out and give the mark phase a chance to deal with
it concurrently before mark termination begins.
When the mark phase first appears to be out of work, it does three things:
1) It switches from allocating white to allocating black to reduce the
number of unmarked objects reachable only from stacks.
2) It flushes and disables per-P GC work caches so all work must be in
globally visible work buffers.
3) It rescans the global roots---the BSS and data segments---so there
are fewer objects to blacken during mark termination. We do not rescan
stacks at this point, though that could be done in a later CL.
After these steps, it again drains the global work buffers.
On a lightly loaded machine the garbage benchmark has reduced the
number of GC cycles with latency > 10 ms from 83 out of 4083 cycles
down to 2 out of 3995 cycles. Maximum latency was reduced from
60+ msecs down to 20 ms.
Change-Id: I152285b48a7e56c5083a02e8e4485dd39c990492
Reviewed-on: https://go-review.googlesource.com/10590
Reviewed-by: Austin Clements <austin@google.com>
//go:systemstack means that the function must run on the system stack.
Add one use in runtime as a demonstration.
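A hedged usage sketch (the directive is only permitted inside the
runtime package, and the function body here is illustrative):

    // Within package runtime:

    //go:systemstack
    func mustRunOnSystemStack() {
        // The compiler emits a special prologue that checks this function
        // was entered on the system stack; ordinary goroutine code reaches
        // it only via systemstack(mustRunOnSystemStack).
    }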
Fixes #9174.
Change-Id: I8d4a509cb313541426157da703f1c022e964ace4
Reviewed-on: https://go-review.googlesource.com/10840
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
A workaround for #10460.
Change-Id: I607a556561d509db6de047892f886fb565513895
Reviewed-on: https://go-review.googlesource.com/10819
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This avoids a race with gcmarkwb_m that was leading to faults.
Fixes #10212.
Change-Id: I6fcf8d09f2692227063ce29152cb57366ea22487
Reviewed-on: https://go-review.googlesource.com/10816
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This tracks the number of scannable bytes in the allocated heap. That
is, bytes that the garbage collector must scan before reaching the
last pointer field in each object.
This will be used to compute a more robust estimate of the GC scan
work.
Change-Id: I1eecd45ef9cdd65b69d2afb5db5da885c80086bb
Reviewed-on: https://go-review.googlesource.com/9695
Reviewed-by: Russ Cox <rsc@golang.org>
shouldtriggergc is slightly expensive due to the call overhead
and the use of an atomic. This CL reduces the number of times
one checks whether a GC should be done from once at each allocation
to once when a span is allocated. Since shouldtriggergc is an
important abstraction, simply hand-inlining it along with its
atomic instruction would lose that abstraction.
Change-Id: Ia3210655b4b3d433f77064a21ecb54e4d9d435f7
Reviewed-on: https://go-review.googlesource.com/9403
Reviewed-by: Austin Clements <austin@google.com>
Currently, we use a full stop-the-world around enabling write
barriers. This is to ensure that all Gs have enabled write barriers
before any blackening occurs (either in gcBgMarkWorker() or in
gcAssistAlloc()).
However, there's no need to bring the whole world to a synchronous
stop to ensure this. This change replaces the STW with a ragged
barrier that ensures each P has individually observed that write
barriers should be enabled before GC performs any blackening.
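A sketch of the ragged-barrier idea (procState and the callback shape
are illustrative; the runtime's forEachP-style visit is the real
analogue):

    package gcphase

    type procState struct {
        wbEnabled bool // this P's view of whether write barriers are on
    }

    // enableWriteBarriers flips the global phase, then visits every P in
    // turn; each P acknowledges the change at a point where it is not
    // mid-allocation. Blackening may begin only after every P has been
    // visited, without ever stopping the whole world.
    func enableWriteBarriers(setPhase func(), forEachP func(fn func(*procState))) {
        setPhase()
        forEachP(func(p *procState) {
            p.wbEnabled = true
        })
        // Every P has individually observed the new phase.
    }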
Change-Id: If2f129a6a55bd8bdd4308067af2b739f3fb41955
Reviewed-on: https://go-review.googlesource.com/8207
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The motivation is that sysAlloc/Free() currently aren't safe to be
called without a valid G, because arm's xadd64() uses locks that require
a valid G.
The solution here was proposed by Dmitry Vyukov: use xadduintptr()
instead of xadd64(), until arm can support xadd64 on all of its
architectures (not a trivial task for arm).
Change-Id: I250252079357ea2e4360e1235958b1c22051498f
Reviewed-on: https://go-review.googlesource.com/9002
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
This field used to decrease with sweeps (and potentially go
negative). Now it is always zero or positive, so change it to a
uintptr so it meshes better with other memory stats.
Change-Id: I6a50a956ddc6077eeaf92011c51743cb69540a3c
Reviewed-on: https://go-review.googlesource.com/8899
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, mutator allocation periodically assists the garbage
collector by performing a small, fixed amount of scanning work.
However, to control heap growth, mutators need to perform scanning
work *proportional* to their allocation rate.
This change implements proportional mutator assists. This uses the
scan work estimate computed by the garbage collector at the beginning
of each cycle to compute how much scan work must be performed per
allocation byte to complete the estimated scan work by the time the
heap reaches the goal size. When allocation triggers an assist, it
uses this ratio and the amount allocated since the last assist to
compute the assist work, then attempts to steal as much of this work
as possible from the background collector's credit, and then performs
any remaining scan work itself.
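A sketch of the arithmetic (names illustrative): the per-byte assist
ratio spreads the estimated scan work over the allocation headroom
remaining before the heap goal.

    package pacer

    // assistRatio returns scan work owed per byte allocated so that
    // estScanWork finishes by the time the heap grows from heapLive to
    // heapGoal.
    func assistRatio(estScanWork, heapLive, heapGoal int64) float64 {
        headroom := heapGoal - heapLive
        if headroom <= 0 {
            headroom = 1 // already at or past the goal: assist maximally
        }
        return float64(estScanWork) / float64(headroom)
    }

    // assistWorkOwed converts bytes allocated since the last assist into
    // scan work; some of it may be stolen from background GC credit.
    func assistWorkOwed(bytesAllocated int64, ratio float64) int64 {
        return int64(ratio * float64(bytesAllocated))
    }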
Change-Id: I98b2078147a60d01d6228b99afd414ef857e4fba
Reviewed-on: https://go-review.googlesource.com/8836
Reviewed-by: Rick Hudson <rlh@golang.org>
This CL revises CL 7504 to use explicit uintptr types for the
struct fields that are going to be updated sometimes without
write barriers. The result is that the fields are now updated *always*
without write barriers.
This approach has two important properties:
1) Now the GC never looks at the field, so if the missing reference
could cause a problem, it will do so all the time, not just when the
write barrier is missed at just the right moment.
2) Now a write barrier never happens for the field, avoiding the
(correct) detection of inconsistent write barriers when GODEBUG=wbshadow=1.
Change-Id: Iebd3962c727c0046495cc08914a8dc0808460e0e
Reviewed-on: https://go-review.googlesource.com/9019
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
'themoduledata' doesn't really make sense now that we support multiple moduledata
objects.
Change-Id: I8263045d8f62a42cb523502b37289b0fba054f62
Reviewed-on: https://go-review.googlesource.com/8521
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This tracks the number of heap bytes marked by a GC cycle. We'll use
this information to precisely trigger the next GC cycle.
Currently the work counter is aggregated in gcWork, and dispose
atomically adds it to a global work counter. dispose happens
relatively infrequently, so the contention on the global counter
should be low. If this turns out to be an issue, we can reduce the
number of disposes, and if it's still a problem, we can switch to
per-P counters.
Change-Id: I1bc377cb2e802ef61c2968602b63146d52e7f5db
Reviewed-on: https://go-review.googlesource.com/8388
Reviewed-by: Russ Cox <rsc@golang.org>
In preparation for being able to run a go program that has code
in several objects, this changes from having several linker
symbols used by the runtime into having one linker symbol that
points at a structure containing the needed data. Multiple
object support will construct a linked list of such structures.
A follow up will initialize the slices in the themoduledata
structure directly from the linker but I was aiming for a minimal
diff for now.
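A sketch of the shape (fields abbreviated and illustrative, not the
real moduledata layout):

    package moduledata

    // moduledata gathers what used to be separate linker symbols into one
    // linker-initialized structure; multiple objects chain them together.
    type moduledata struct {
        pclntable []byte
        bss, ebss uintptr

        next *moduledata // nil in a single-module program
    }

    // forEachModule walks the chain starting at the first (main) module.
    func forEachModule(first *moduledata, fn func(*moduledata)) {
        for md := first; md != nil; md = md.next {
            fn(md)
        }
    }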
Change-Id: I613cce35309801cf265a1d5ae5aaca8d689c5cbf
Reviewed-on: https://go-review.googlesource.com/7441
Reviewed-by: Ian Lance Taylor <iant@golang.org>
To reduce lock contention in this mode, this change makes persistent allocation state per-P,
which means at most 64 kB overhead x $GOMAXPROCS, which should be
completely tolerable.
Change-Id: I34ca95e77d7e67130e30822e5a4aff6772b1a1c5
Reviewed-on: https://go-review.googlesource.com/7740
Reviewed-by: Rick Hudson <rlh@golang.org>
Everything has moved to Go, but comments still refer to .c/.h files.
Fix all of those up, at least for these three directories.
Fixes #10138
Change-Id: Ie5efe89b247841e0b3f82aac5256b2c606ef67dc
Reviewed-on: https://go-review.googlesource.com/7431
Reviewed-by: Russ Cox <rsc@golang.org>
Even though the world is stopped the GC may do pointer
writes that need to be protected by write barriers.
This means that the write barrier must be on
continuously from the time the mark phase starts until
the mark termination phase ends. Checks were added to
ensure that no allocation happens during a GC.
Hoist the logic that clears pools to the start of the GC
so that the memory can be reclaimed during this GC cycle.
Change-Id: I9d1551ac5db9bac7bac0cb5370d5b2b19a9e6a52
Reviewed-on: https://go-review.googlesource.com/6990
Reviewed-by: Austin Clements <austin@google.com>