2014-11-11 15:05:02 -07:00
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

package runtime

import "unsafe"
2015-02-19 11:38:46 -07:00
// Per-thread (in Go, per-P) cache for small objects.
// No locking needed because it is per-thread (per-P).
runtime: clear tiny alloc cache in mark term, not sweep term
The tiny alloc cache is maintained in a pointer from non-GC'd memory
(mcache) to heap memory and hence must be handled carefully.
Currently we clear the tiny alloc cache during sweep termination and,
if it is assigned to a non-nil value during concurrent marking, we
depend on a write barrier to keep the new value alive. However, while
the compiler currently always generates this write barrier, we're
treading on thin ice because write barriers may not happen for writes
to non-heap memory (e.g., typedmemmove). Without this lucky write
barrier, the GC may free a current tiny block while it's still
reachable by the tiny allocator, leading to later memory corruption.
Change this code so that, rather than depending on the write barrier,
we simply clear the tiny cache during mark termination when we're
clearing all of the other mcaches. If the current tiny block is
reachable from regular pointers, it will be retained; if it isn't
reachable from regular pointers, it may be freed, but that's okay
because there won't be any pointers in non-GC'd memory to it.
Change-Id: I8230980d8612c35c2997b9705641a1f9f865f879
Reviewed-on: https://go-review.googlesource.com/16962
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-11-16 13:20:59 -07:00
//
// mcaches are allocated from non-GC'd memory, so any heap pointers
// must be specially handled.
2015-02-19 11:38:46 -07:00
type mcache struct {
	// The following members are accessed on every malloc,
	// so they are grouped here for better caching.
runtime: fix (sometimes major) underestimation of heap_live
Currently, we update memstats.heap_live from mcache.local_cachealloc
whenever we lock the heap (e.g., to obtain a fresh span or to release
an unused span). However, under the right circumstances,
local_cachealloc can accumulate allocations up to the size of
the *entire heap* without flushing them to heap_live. Specifically,
since span allocations from an mcentral don't lock the heap, if a
large number of pages are held in an mcentral and the application
continues to use and free objects of that size class (e.g., the
BinaryTree17 benchmark), local_cachealloc won't be flushed until the
mcentral runs out of spans.
This is a problem because, unlike many of the memory statistics that
are purely informative, heap_live is used to determine when the
garbage collector should start and how hard it should work.
This commit eliminates local_cachealloc, instead atomically updating
heap_live directly. To control contention, we do this only when
obtaining a span from an mcentral. Furthermore, we make heap_live
conservative: allocating a span assumes that all free slots in that
span will be used and accounts for these when the span is
allocated, *before* the objects themselves are. This is important
because 1) this triggers the GC earlier than necessary rather than
potentially too late and 2) this leads to a conservative GC rate
rather than a GC rate that is potentially too low.
Alternatively, we could have flushed local_cachealloc when it passed
some threshold, but this would require determining a threshold and
would cause heap_live to underestimate the true value rather than
overestimate.
Fixes #12199.
name                      old time/op  new time/op  delta
BinaryTree17-12           2.88s ± 4%   2.88s ± 1%   ~       (p=0.470 n=19+19)
Fannkuch11-12             2.48s ± 1%   2.48s ± 1%   ~       (p=0.243 n=16+19)
FmtFprintfEmpty-12        50.9ns ± 2%  50.7ns ± 1%  ~       (p=0.238 n=15+14)
FmtFprintfString-12       175ns ± 1%   171ns ± 1%   -2.48%  (p=0.000 n=18+18)
FmtFprintfInt-12          159ns ± 1%   158ns ± 1%   -0.78%  (p=0.000 n=19+18)
FmtFprintfIntInt-12       270ns ± 1%   265ns ± 2%   -1.67%  (p=0.000 n=18+18)
FmtFprintfPrefixedInt-12  235ns ± 1%   234ns ± 0%   ~       (p=0.362 n=18+19)
FmtFprintfFloat-12        309ns ± 1%   308ns ± 1%   -0.41%  (p=0.001 n=18+19)
FmtManyArgs-12            1.10µs ± 1%  1.08µs ± 0%  -1.96%  (p=0.000 n=19+18)
GobDecode-12              7.81ms ± 1%  7.80ms ± 1%  ~       (p=0.425 n=18+19)
GobEncode-12              6.53ms ± 1%  6.53ms ± 1%  ~       (p=0.817 n=19+19)
Gzip-12                   312ms ± 1%   312ms ± 2%   ~       (p=0.967 n=19+20)
Gunzip-12                 42.0ms ± 1%  41.9ms ± 1%  ~       (p=0.172 n=19+19)
HTTPClientServer-12       63.7µs ± 1%  63.8µs ± 1%  ~       (p=0.639 n=19+19)
JSONEncode-12             16.4ms ± 1%  16.4ms ± 1%  ~       (p=0.954 n=19+19)
JSONDecode-12             58.5ms ± 1%  57.8ms ± 1%  -1.27%  (p=0.000 n=18+19)
Mandelbrot200-12          3.86ms ± 1%  3.88ms ± 0%  +0.44%  (p=0.000 n=18+18)
GoParse-12                3.67ms ± 2%  3.66ms ± 1%  -0.52%  (p=0.001 n=18+19)
RegexpMatchEasy0_32-12    100ns ± 1%   100ns ± 0%   ~       (p=0.257 n=19+18)
RegexpMatchEasy0_1K-12    347ns ± 1%   347ns ± 1%   ~       (p=0.527 n=18+18)
RegexpMatchEasy1_32-12    83.7ns ± 2%  83.1ns ± 2%  ~       (p=0.096 n=18+19)
RegexpMatchEasy1_1K-12    509ns ± 1%   505ns ± 1%   -0.75%  (p=0.000 n=18+19)
RegexpMatchMedium_32-12   130ns ± 2%   129ns ± 1%   ~       (p=0.962 n=20+20)
RegexpMatchMedium_1K-12   39.5µs ± 2%  39.4µs ± 1%  ~       (p=0.376 n=20+19)
RegexpMatchHard_32-12     2.04µs ± 0%  2.04µs ± 1%  ~       (p=0.195 n=18+17)
RegexpMatchHard_1K-12     61.4µs ± 1%  61.4µs ± 1%  ~       (p=0.885 n=19+19)
Revcomp-12                540ms ± 2%   542ms ± 4%   ~       (p=0.552 n=19+17)
Template-12               69.6ms ± 1%  71.2ms ± 1%  +2.39%  (p=0.000 n=20+20)
TimeParse-12              357ns ± 1%   357ns ± 1%   ~       (p=0.883 n=18+20)
TimeFormat-12             379ns ± 1%   362ns ± 1%   -4.53%  (p=0.000 n=18+19)
[Geo mean]                62.0µs       61.8µs       -0.44%

name                      old time/op  new time/op  delta
XBenchGarbage-12          5.89ms ± 2%  5.81ms ± 2%  -1.41%  (p=0.000 n=19+18)
Change-Id: I96b31cca6ae77c30693a891cff3fe663fa2447a0
Reviewed-on: https://go-review.googlesource.com/17748
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-12-11 15:49:14 -07:00
	next_sample int32   // trigger heap sample after allocating this many bytes
	local_scan  uintptr // bytes of scannable heap allocated
2015-11-16 13:20:59 -07:00
2015-02-19 11:38:46 -07:00
	// Allocator cache for tiny objects w/o pointers.
	// See "Tiny allocator" comment in malloc.go.
2015-11-16 13:20:59 -07:00
	// tiny points to the beginning of the current tiny block, or
	// nil if there is no current tiny block.
	//
	// tiny is a heap pointer. Since mcache is in non-GC'd memory,
	// we handle it by clearing it in releaseAll during mark
	// termination.
2015-11-16 13:31:50 -07:00
	tiny             uintptr
2015-02-19 11:38:46 -07:00
	tinyoffset       uintptr
	local_tinyallocs uintptr // number of tiny allocs not counted in other stats

	// The rest is not accessed on every malloc.
	alloc [_NumSizeClasses]*mspan // spans to allocate from

	stackcache [_NumStackOrders]stackfreelist

	// Local allocator stats, flushed during GC.
	local_nlookup    uintptr                  // number of pointer lookups
	local_largefree  uintptr                  // bytes freed for large objects (>maxsmallsize)
	local_nlargefree uintptr                  // number of frees for large objects (>maxsmallsize)
	local_nsmallfree [_NumSizeClasses]uintptr // number of frees for small objects (<=maxsmallsize)
}

// A gclink is a node in a linked list of blocks, like mlink,
// but it is opaque to the garbage collector.
// The GC does not trace the pointers during collection,
// and the compiler does not emit write barriers for assignments
// of gclinkptr values. Code should store references to gclinks
// as gclinkptr, not as *gclink.
type gclink struct {
	next gclinkptr
}

// A gclinkptr is a pointer to a gclink, but it is opaque
// to the garbage collector.
type gclinkptr uintptr

// ptr returns the *gclink form of p.
// The result should be used for accessing fields, not stored
// in other data structures.
func (p gclinkptr) ptr() *gclink {
	return (*gclink)(unsafe.Pointer(p))
}

type stackfreelist struct {
	list gclinkptr // linked list of free stacks
	size uintptr   // total size of stacks in list
}
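The gclink/gclinkptr pattern can be tried outside the runtime with a minimal sketch like the one below. The names `node`, `nodeptr`, `arena`, `buildList`, and `length` are hypothetical and not part of the runtime. The sketch assumes the nodes live in a package-level array, so they stay alive even though the list links are plain integers that the collector does not trace.

```go
package main

import (
	"fmt"
	"unsafe"
)

// node is a hypothetical free-list block, standing in for gclink.
type node struct {
	next nodeptr
}

// nodeptr stores a node's address as an integer, so the garbage
// collector does not treat it as a pointer (it mirrors gclinkptr).
type nodeptr uintptr

// ptr converts the integer form back to *node for field access only.
func (p nodeptr) ptr() *node {
	return (*node)(unsafe.Pointer(p))
}

// arena keeps the nodes alive in this sketch (an assumption); in the
// runtime the blocks live in memory the allocator manages itself.
var arena [3]node

// buildList links every arena node into a singly linked list and
// returns the head of that list.
func buildList() nodeptr {
	var head nodeptr
	for i := range arena {
		arena[i].next = head
		head = nodeptr(uintptr(unsafe.Pointer(&arena[i])))
	}
	return head
}

// length walks the list through ptr() and counts the nodes.
func length(head nodeptr) int {
	n := 0
	for p := head; p != 0; p = p.ptr().next {
		n++
	}
	return n
}

func main() {
	fmt.Println(length(buildList()))
}
```

Outside the runtime this only works because `arena` keeps the nodes reachable. An integer-typed link to an ordinary heap object would not keep that object alive.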
2014-11-11 15:05:02 -07:00
// dummy MSpan that contains no free objects.
var emptymspan mspan

func allocmcache() *mcache {
	lock(&mheap_.lock)
2015-11-11 17:13:51 -07:00
	c := (*mcache)(mheap_.cachealloc.alloc())
2014-11-11 15:05:02 -07:00
	unlock(&mheap_.lock)
	memclr(unsafe.Pointer(c), unsafe.Sizeof(*c))
	for i := 0; i < _NumSizeClasses; i++ {
		c.alloc[i] = &emptymspan
	}
2015-09-14 15:03:45 -06:00
	c.next_sample = nextSample()
2014-11-11 15:05:02 -07:00
	return c
}

func freemcache(c *mcache) {
[dev.cc] runtime: delete scalararg, ptrarg; rename onM to systemstack
Scalararg and ptrarg are not "signal safe".
Go code filling them out can be interrupted by a signal,
and then the signal handler runs, and if it also ends up
in Go code that uses scalararg or ptrarg, now the old
values have been smashed.
For the pieces of code that do need to run in a signal handler,
we introduced onM_signalok, which is really just onM
except that the _signalok is meant to convey that the caller
asserts that scalararg and ptrarg will be restored to their old
values after the call (instead of the usual behavior, zeroing them).
Scalararg and ptrarg are also untyped and therefore error-prone.
Go code can always pass a closure instead of using scalararg
and ptrarg; they were only really necessary for C code.
And there's no more C code.
For all these reasons, delete scalararg and ptrarg, converting
the few remaining references to use closures.
Once those are gone, there is no need for a distinction between
onM and onM_signalok, so replace both with a single function
equivalent to the current onM_signalok (that is, it can be called
on any of the curg, g0, and gsignal stacks).
The name onM and the phrase 'm stack' are misnomers,
because on most systems an M has two system stacks:
the main thread stack and the signal handling stack.
Correct the misnomer by naming the replacement function systemstack.
Fix a few references to "M stack" in code.
The main motivation for this change is to eliminate scalararg/ptrarg.
Rick and I have already seen them cause problems because
the calling sequence m.ptrarg[0] = p is a heap pointer assignment,
so it gets a write barrier. The write barrier also uses onM, so it has
all the same problems as if it were being invoked by a signal handler.
We worked around this by saving and restoring the old values
and by calling onM_signalok, but there's no point in keeping this nice
home for bugs around any longer.
This CL also changes funcline to return the file name as a result
instead of filling in a passed-in *string. (The *string signature is
left over from when the code was written in and called from C.)
That's arguably an unrelated change, except that once I had done
the ptrarg/scalararg/onM cleanup I started getting false positives
about the *string argument escaping (not allowed in package runtime).
The compiler is wrong, but the easiest fix is to write the code like
Go code instead of like C code. I am a bit worried that the compiler
is wrong because of some use of uninitialized memory in the escape
analysis. If that's the reason, it will go away when we convert the
compiler to Go. (And if not, we'll debug it the next time.)
LGTM=khr
R=r, khr
CC=austin, golang-codereviews, iant, rlh
https://golang.org/cl/174950043
2014-11-12 12:54:31 -07:00
	systemstack(func() {
2015-11-11 17:13:51 -07:00
		c.releaseAll()
2014-11-11 15:05:02 -07:00
		stackcache_clear(c)
2014-11-15 06:00:38 -07:00
		// NOTE(rsc,rlh): If gcworkbuffree comes back, we need to coordinate
		// with the stealing of gcworkbufs during garbage collection to avoid
		// a race where the workbuf is double-freed.
		// gcworkbuffree(c.gcworkbuf)
2014-11-11 15:05:02 -07:00
		lock(&mheap_.lock)
		purgecachedstats(c)
2015-11-11 17:13:51 -07:00
		mheap_.cachealloc.free(unsafe.Pointer(c))
2014-11-11 15:05:02 -07:00
		unlock(&mheap_.lock)
	})
}

// Gets a span that has a free object in it and assigns it
2016-03-01 16:21:55 -07:00
// to be the cached span for the given sizeclass. Returns this span.
2015-11-11 17:13:51 -07:00
func (c *mcache) refill(sizeclass int32) *mspan {
2014-11-11 15:05:02 -07:00
	_g_ := getg()

	_g_.m.locks++
	// Return the current cached span to the central lists.
	s := c.alloc[sizeclass]
2016-02-11 11:57:58 -07:00
2016-02-16 15:16:43 -07:00
	if uintptr(s.allocCount) != s.nelems {
2016-02-11 11:57:58 -07:00
		throw("refill of span with free space remaining")
2014-11-11 15:05:02 -07:00
	}
2016-02-11 11:57:58 -07:00
2014-11-11 15:05:02 -07:00
	if s != &emptymspan {
		s.incache = false
	}

	// Get a new cached span from the central lists.
2015-11-11 17:13:51 -07:00
	s = mheap_.central[sizeclass].mcentral.cacheSpan()
2014-11-11 15:05:02 -07:00
	if s == nil {
2014-12-27 21:58:00 -07:00
		throw("out of memory")
2014-11-11 15:05:02 -07:00
	}
2016-02-11 11:57:58 -07:00
2016-02-16 15:16:43 -07:00
	if uintptr(s.allocCount) == s.nelems {
2016-02-11 11:57:58 -07:00
		throw("span has no free space")
2014-11-11 15:05:02 -07:00
	}
2016-02-11 11:57:58 -07:00
2014-11-11 15:05:02 -07:00
	c.alloc[sizeclass] = s
	_g_.m.locks--
	return s
}
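The two invariants that refill enforces (the outgoing span must be fully used, the incoming span must have free space) can be modelled in ordinary Go. This is a simplified sketch, not the runtime's code: `span`, `central`, `cache`, and `newCache` are hypothetical stand-ins, and `panic` takes the place of `throw`.

```go
package main

import "fmt"

// span is a hypothetical stand-in for mspan: nelems slots, allocCount in use.
type span struct {
	allocCount, nelems int
}

// emptySpan plays the role of emptymspan: zero slots, so it always counts as full.
var emptySpan span

// central is a hypothetical per-size-class list of spans with free slots.
type central struct{ spans []*span }

// cacheSpan hands out the next span, or nil when none remain.
func (c *central) cacheSpan() *span {
	if len(c.spans) == 0 {
		return nil
	}
	s := c.spans[0]
	c.spans = c.spans[1:]
	return s
}

// cache holds one current span per size class, like mcache.alloc.
type cache struct {
	alloc   []*span
	central []central
}

// refill swaps a fully used span for one with free slots and checks
// the same two invariants as the runtime's refill.
func (c *cache) refill(class int) *span {
	if s := c.alloc[class]; s.allocCount != s.nelems {
		panic("refill of span with free space remaining")
	}
	s := c.central[class].cacheSpan()
	if s == nil {
		panic("out of memory")
	}
	if s.allocCount == s.nelems {
		panic("span has no free space")
	}
	c.alloc[class] = s
	return s
}

// newCache builds a cache with one size class whose central list
// holds a single span of four free slots.
func newCache() *cache {
	return &cache{
		alloc:   []*span{&emptySpan},
		central: []central{{spans: []*span{{nelems: 4}}}},
	}
}

func main() {
	c := newCache()
	s := c.refill(0)
	fmt.Println("free slots:", s.nelems-s.allocCount)
}
```

Because the dummy span has zero slots and zero allocations, it passes the "fully used" check, which is why a fresh cache can call refill without a special case.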
2015-11-11 17:13:51 -07:00
func (c *mcache) releaseAll() {
2014-11-11 15:05:02 -07:00
	for i := 0; i < _NumSizeClasses; i++ {
		s := c.alloc[i]
		if s != &emptymspan {
2015-11-11 17:13:51 -07:00
			mheap_.central[i].mcentral.uncacheSpan(s)
2014-11-11 15:05:02 -07:00
			c.alloc[i] = &emptymspan
		}
	}
2015-11-16 13:20:59 -07:00
	// Clear tinyalloc pool.
2015-11-16 13:31:50 -07:00
	c.tiny = 0
2015-11-16 13:20:59 -07:00
	c.tinyoffset = 0
2014-11-11 15:05:02 -07:00
}