2008-12-18 16:42:28 -07:00
|
|
|
// Copyright 2009 The Go Authors. All rights reserved.
|
|
|
|
// Use of this source code is governed by a BSD-style
|
|
|
|
// license that can be found in the LICENSE file.
|
|
|
|
|
|
|
|
// Memory allocator, based on tcmalloc.
|
|
|
|
// http://goog-perftools.sourceforge.net/doc/tcmalloc.html
|
|
|
|
|
|
|
|
// The main allocator works in runs of pages.
|
|
|
|
// Small allocation sizes (up to and including 32 kB) are
|
|
|
|
// rounded to one of about 60 size classes, each of which
|
|
|
|
// has its own free list of objects of exactly that size.
|
|
|
|
// Any free page of memory can be split into a set of objects
|
|
|
|
// of one size class, which are then managed using free list
|
|
|
|
// allocators.
|
|
|
|
//
|
|
|
|
// The allocator's data structures are:
|
|
|
|
//
|
|
|
|
// FixAlloc: a free-list allocator for fixed-size objects,
|
|
|
|
// used to manage storage used by the allocator.
|
|
|
|
// MHeap: the malloc heap, managed at page (4096-byte) granularity.
|
|
|
|
// MSpan: a run of pages managed by the MHeap.
|
|
|
|
// MCentral: a shared free list for a given size class.
|
|
|
|
// MCache: a per-thread (in Go, per-M) cache for small objects.
|
|
|
|
// MStats: allocation statistics.
|
|
|
|
//
|
|
|
|
// Allocating a small object proceeds up a hierarchy of caches:
|
|
|
|
//
|
|
|
|
// 1. Round the size up to one of the small size classes
|
|
|
|
// and look in the corresponding MCache free list.
|
|
|
|
// If the list is not empty, allocate an object from it.
|
|
|
|
// This can all be done without acquiring a lock.
|
|
|
|
//
|
|
|
|
// 2. If the MCache free list is empty, replenish it by
|
|
|
|
// taking a bunch of objects from the MCentral free list.
|
|
|
|
// Moving a bunch amortizes the cost of acquiring the MCentral lock.
|
|
|
|
//
|
|
|
|
// 3. If the MCentral free list is empty, replenish it by
|
|
|
|
// allocating a run of pages from the MHeap and then
|
|
|
|
// chopping that memory into objects of the given size.
|
|
|
|
// Allocating many objects amortizes the cost of locking
|
|
|
|
// the heap.
|
|
|
|
//
|
|
|
|
// 4. If the MHeap is empty or has no page runs large enough,
|
|
|
|
// allocate a new group of pages (at least 1MB) from the
|
|
|
|
// operating system. Allocating a large run of pages
|
|
|
|
// amortizes the cost of talking to the operating system.
|
|
|
|
//
|
|
|
|
// Freeing a small object proceeds up the same hierarchy:
|
|
|
|
//
|
|
|
|
// 1. Look up the size class for the object and add it to
|
|
|
|
// the MCache free list.
|
|
|
|
//
|
|
|
|
// 2. If the MCache free list is too long or the MCache has
|
|
|
|
// too much memory, return some to the MCentral free lists.
|
|
|
|
//
|
|
|
|
// 3. If all the objects in a given span have returned to
|
|
|
|
// the MCentral list, return that span to the page heap.
|
|
|
|
//
|
|
|
|
// 4. If the heap has too much memory, return some to the
|
|
|
|
// operating system.
|
|
|
|
//
|
2008-12-19 04:13:39 -07:00
|
|
|
// TODO(rsc): Step 4 is not implemented.
|
2008-12-18 16:42:28 -07:00
|
|
|
//
|
|
|
|
// Allocating and freeing a large object uses the page heap
|
|
|
|
// directly, bypassing the MCache and MCentral free lists.
|
|
|
|
//
|
2010-02-10 01:00:12 -07:00
|
|
|
// The small objects on the MCache and MCentral free lists
|
|
|
|
// may or may not be zeroed. They are zeroed if and only if
|
|
|
|
// the second word of the object is zero. The spans in the
|
|
|
|
// page heap are always zeroed. When a span full of objects
|
|
|
|
// is returned to the page heap, the objects that need to be
|
|
|
|
// are zeroed first. There are two main benefits to delaying the
|
|
|
|
// zeroing this way:
|
|
|
|
//
|
|
|
|
// 1. stack frames allocated from the small object lists
|
|
|
|
// can avoid zeroing altogether.
|
|
|
|
// 2. the cost of zeroing when reusing a small object is
|
|
|
|
// charged to the mutator, not the garbage collector.
|
|
|
|
//
|
2008-12-18 16:42:28 -07:00
|
|
|
// This C code was written with an eye toward translating to Go
|
|
|
|
// in the future. Methods have the form Type_Method(Type *t, ...).
|
|
|
|
|
|
|
|
typedef struct MCentral MCentral;
|
|
|
|
typedef struct MHeap MHeap;
|
|
|
|
typedef struct MSpan MSpan;
|
|
|
|
typedef struct MStats MStats;
|
2008-12-19 04:13:39 -07:00
|
|
|
typedef struct MLink MLink;
|
2008-12-18 16:42:28 -07:00
|
|
|
|
|
|
|
enum
|
|
|
|
{
|
|
|
|
PageShift = 12,
|
|
|
|
PageSize = 1<<PageShift,
|
|
|
|
PageMask = PageSize - 1,
|
|
|
|
};
|
|
|
|
typedef uintptr PageID; // address >> PageShift
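The page constants above reduce address arithmetic to shifts and masks. The following is a minimal sketch in standard C, not code from the runtime: the helper names `page_of`, `page_offset`, and `pages_for` are hypothetical, and `uintptr_t` stands in for the runtime's `uintptr`.

```c
#include <stdint.h>

enum {
	PageShift = 12,
	PageSize  = 1 << PageShift,
	PageMask  = PageSize - 1,
};

typedef uintptr_t PageID;	// address >> PageShift

// Page number that contains addr.
static PageID page_of(uintptr_t addr)
{
	return addr >> PageShift;
}

// Byte offset of addr within its page.
static uintptr_t page_offset(uintptr_t addr)
{
	return addr & PageMask;
}

// Number of whole pages needed to hold nbytes, rounding up.
static uintptr_t pages_for(uintptr_t nbytes)
{
	return (nbytes + PageMask) >> PageShift;
}
```

With 4096-byte pages, address 0x12345 lies on page 0x12 at offset 0x345.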
|
|
|
|
|
|
|
|
enum
|
|
|
|
{
|
2011-02-02 21:03:47 -07:00
|
|
|
// Computed constant. The definition of MaxSmallSize and the
|
|
|
|
// algorithm in msize.c produce some number of different allocation
|
|
|
|
// size classes. NumSizeClasses is that number. It's needed here
|
|
|
|
// because there are static arrays of this length; when msize runs its
|
|
|
|
// size choosing algorithm it double-checks that NumSizeClasses agrees.
|
|
|
|
NumSizeClasses = 61,
|
|
|
|
|
2008-12-18 16:42:28 -07:00
|
|
|
// Tunable constants.
|
|
|
|
MaxSmallSize = 32<<10,
|
|
|
|
|
|
|
|
FixAllocChunk = 128<<10, // Chunk size for FixAlloc
|
|
|
|
MaxMCacheListLen = 256, // Maximum objects on MCacheList
|
|
|
|
MaxMCacheSize = 2<<20, // Maximum bytes in one MCache
|
|
|
|
MaxMHeapList = 1<<(20 - PageShift), // Maximum page length for fixed-size list in MHeap.
|
|
|
|
HeapAllocChunk = 1<<20, // Chunk size for heap growth
|
|
|
|
|
2011-01-28 13:03:26 -07:00
|
|
|
// Number of bits in page to span calculations (4k pages).
|
|
|
|
// On 64-bit, we limit the arena to 16G, so 22 bits suffices.
|
|
|
|
// On 32-bit, we don't bother limiting anything: 20 bits for 4G.
|
2009-03-30 01:01:07 -06:00
|
|
|
#ifdef _64BIT
|
2011-01-28 13:03:26 -07:00
|
|
|
MHeapMap_Bits = 22,
|
2009-03-30 01:01:07 -06:00
|
|
|
#else
|
2011-01-28 13:03:26 -07:00
|
|
|
MHeapMap_Bits = 20,
|
2009-03-30 01:01:07 -06:00
|
|
|
#endif
|
runtime: parallelize garbage collector mark + sweep
Running test/garbage/parser.out.
On a 4-core Lenovo X201s (Linux):
31.12u 0.60s 31.74r 1 cpu, no atomics
32.27u 0.58s 32.86r 1 cpu, atomic instructions
33.04u 0.83s 27.47r 2 cpu
On a 16-core Xeon (Linux):
33.08u 0.65s 33.80r 1 cpu, no atomics
34.87u 1.12s 29.60r 2 cpu
36.00u 1.87s 28.43r 3 cpu
36.46u 2.34s 27.10r 4 cpu
38.28u 3.85s 26.92r 5 cpu
37.72u 5.25s 26.73r 6 cpu
39.63u 7.11s 26.95r 7 cpu
39.67u 8.10s 26.68r 8 cpu
On a 2-core MacBook Pro Core 2 Duo 2.26 (circa 2009, MacBookPro5,5):
39.43u 1.45s 41.27r 1 cpu, no atomics
43.98u 2.95s 38.69r 2 cpu
On a 2-core Mac Mini Core 2 Duo 1.83 (circa 2008; Macmini2,1):
48.81u 2.12s 51.76r 1 cpu, no atomics
57.15u 4.72s 51.54r 2 cpu
The handoff algorithm is really only good for two cores.
Beyond that we will need to do something more sophisticated,
like have each core hand off to the next one, around a circle.
Even so, the code is a good checkpoint; for now we'll limit the
number of gc procs to at most 2.
R=dvyukov
CC=golang-dev
https://golang.org/cl/4641082
2011-09-30 07:40:01 -06:00
|
|
|
|
|
|
|
// Max number of threads to run garbage collection.
|
|
|
|
// 2, 3, and 4 are all plausible maximums depending
|
2012-01-10 20:49:11 -07:00
|
|
|
// on the hardware details of the machine. The garbage
|
|
|
|
// collector scales well to 4 cpus.
|
|
|
|
MaxGcproc = 4,
|
2011-01-28 13:03:26 -07:00
|
|
|
};
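The bit counts in the MHeapMap_Bits comment follow directly from the 4k page size: a map indexed by n bits of page number covers 2^n pages. A small sketch of that arithmetic, assuming standard C; `map_coverage` is a hypothetical helper, not part of the runtime.

```c
#include <stdint.h>

enum {
	PageShift      = 12,
	MaxMHeapList   = 1 << (20 - PageShift),	// page count of a 1MB span
	HeapAllocChunk = 1 << 20,
};

// Bytes of address space covered by a page-to-span map
// that is indexed by the given number of page-number bits.
static uint64_t map_coverage(int bits)
{
	return ((uint64_t)1 << bits) << PageShift;
}
```

Under these constants, 22 bits cover 16G and 20 bits cover 4G, and both MaxMHeapList and one HeapAllocChunk correspond to 256 pages.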
|
2008-12-18 16:42:28 -07:00
|
|
|
|
2008-12-19 04:13:39 -07:00
|
|
|
// A generic linked list of blocks. (Typically the block is bigger than sizeof(MLink).)
|
|
|
|
struct MLink
|
|
|
|
{
|
|
|
|
MLink *next;
|
|
|
|
};
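An MLink is an intrusive link: the pointer is stored in the first word of the free block itself, so the list needs no separate node storage. A minimal sketch of push and pop in standard C; the function names `list_push` and `list_pop` are hypothetical.

```c
#include <stddef.h>

typedef struct MLink MLink;
struct MLink {
	MLink *next;
};

// Push a free block onto the list.
// The first word of the block is overwritten with the link.
static void list_push(MLink **list, void *block)
{
	MLink *l = block;

	l->next = *list;
	*list = l;
}

// Pop the most recently pushed block, or NULL if the list is empty.
static void *list_pop(MLink **list)
{
	MLink *l = *list;

	if(l != NULL)
		*list = l->next;
	return l;
}
```

Blocks come back in last-in, first-out order, which tends to hand out recently used memory first.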
|
|
|
|
|
2009-12-07 16:52:14 -07:00
|
|
|
// SysAlloc obtains a large chunk of zeroed memory from the
|
|
|
|
// operating system, typically on the order of a hundred kilobytes
|
2011-01-28 13:03:26 -07:00
|
|
|
// or a megabyte. If the pointer argument is non-nil, the caller
|
|
|
|
// wants a mapping there or nowhere.
|
2008-12-18 16:42:28 -07:00
|
|
|
//
|
|
|
|
// SysUnused notifies the operating system that the contents
|
|
|
|
// of the memory region are no longer needed and can be reused
|
|
|
|
// for other purposes. The program reserves the right to start
|
|
|
|
// accessing those pages in the future.
|
|
|
|
//
|
|
|
|
// SysFree returns the memory unconditionally; this is only used if
|
|
|
|
// an out-of-memory error has been detected midway through
|
|
|
|
// an allocation. It is okay if SysFree is a no-op.
|
2011-01-28 13:03:26 -07:00
|
|
|
//
|
|
|
|
// SysReserve reserves address space without allocating memory.
|
|
|
|
// If the pointer passed to it is non-nil, the caller wants the
|
|
|
|
// reservation there, but SysReserve can still choose another
|
|
|
|
// location if that one is unavailable.
|
|
|
|
//
|
|
|
|
// SysMap maps previously reserved address space for use.
|
2008-12-18 16:42:28 -07:00
|
|
|
|
runtime: ,s/[a-zA-Z0-9_]+/runtime·&/g, almost
Prefix all external symbols in runtime by runtime·,
to avoid conflicts with possible symbols of the same
name in linked-in C libraries. The obvious conflicts
are printf, malloc, and free, but hide everything to
avoid future pain.
The symbols left alone are:
** known to cgo **
_cgo_free
_cgo_malloc
libcgo_thread_start
initcgo
ncgocall
** known to linker **
_rt0_$GOARCH
_rt0_$GOARCH_$GOOS
text
etext
data
end
pclntab
epclntab
symtab
esymtab
** known to C compiler **
_divv
_modv
_div64by32
etc (arch specific)
Tested on darwin/386, darwin/amd64, linux/386, linux/amd64.
Built (but not tested) for freebsd/386, freebsd/amd64, linux/arm, windows/386.
R=r, PeterGo
CC=golang-dev
https://golang.org/cl/2899041
2010-11-04 12:00:19 -06:00
|
|
|
void* runtime·SysAlloc(uintptr nbytes);
|
|
|
|
void runtime·SysFree(void *v, uintptr nbytes);
|
|
|
|
void runtime·SysUnused(void *v, uintptr nbytes);
|
2011-01-28 13:03:26 -07:00
|
|
|
void runtime·SysMap(void *v, uintptr nbytes);
|
|
|
|
void* runtime·SysReserve(void *v, uintptr nbytes);
|
2008-12-18 16:42:28 -07:00
|
|
|
|
|
|
|
// FixAlloc is a simple free-list allocator for fixed size objects.
|
|
|
|
// Malloc uses a FixAlloc wrapped around SysAlloc to manage its
|
|
|
|
// MCache and MSpan objects.
|
|
|
|
//
|
|
|
|
// Memory returned by FixAlloc_Alloc is not zeroed.
|
|
|
|
// The caller is responsible for locking around FixAlloc calls.
|
2009-01-28 16:22:16 -07:00
|
|
|
// Callers can keep state in the object but the first word is
|
|
|
|
// smashed by freeing and reallocating.
|
2008-12-18 16:42:28 -07:00
|
|
|
struct FixAlloc
|
|
|
|
{
|
|
|
|
uintptr size;
|
|
|
|
void *(*alloc)(uintptr);
|
2009-01-28 16:22:16 -07:00
|
|
|
void (*first)(void *arg, byte *p); // called first time p is returned
|
|
|
|
void *arg;
|
2008-12-19 04:13:39 -07:00
|
|
|
MLink *list;
|
2008-12-18 16:42:28 -07:00
|
|
|
byte *chunk;
|
|
|
|
uint32 nchunk;
|
2010-03-29 14:06:26 -06:00
|
|
|
uintptr inuse; // in-use bytes now
|
|
|
|
uintptr sys; // bytes obtained from system
|
2008-12-18 16:42:28 -07:00
|
|
|
};
|
|
|
|
|
2010-11-04 12:00:19 -06:00
|
|
|
void runtime·FixAlloc_Init(FixAlloc *f, uintptr size, void *(*alloc)(uintptr), void (*first)(void*, byte*), void *arg);
|
|
|
|
void* runtime·FixAlloc_Alloc(FixAlloc *f);
|
|
|
|
void runtime·FixAlloc_Free(FixAlloc *f, void *p);
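The FixAlloc interface above can be illustrated with a much reduced sketch. It is an approximation under stated assumptions, not the runtime's implementation: it uses `malloc` where the runtime plugs in SysAlloc, omits the `first` callback and `arg`, and the names `Fix`, `fix_init`, `fix_alloc`, and `fix_free` are hypothetical. As in the description, returned memory is not zeroed, the caller must do any locking, and freeing smashes the first word of the object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { ChunkSize = 128 << 10 };	// mirrors FixAllocChunk

typedef struct Link Link;
struct Link { Link *next; };

typedef struct Fix Fix;
struct Fix {
	size_t size;	// object size, rounded up to hold a Link
	Link *list;	// freed objects available for reuse
	char *chunk;	// unused tail of the current chunk
	size_t nchunk;	// bytes left in chunk
	size_t inuse;	// bytes handed out and not yet freed
	size_t sys;	// bytes obtained from malloc
};

static void fix_init(Fix *f, size_t size)
{
	size_t align = sizeof(Link);

	assert(size > 0 && size <= (size_t)ChunkSize);
	f->size = (size + align - 1) / align * align;
	f->list = NULL;
	f->chunk = NULL;
	f->nchunk = 0;
	f->inuse = 0;
	f->sys = 0;
}

static void *fix_alloc(Fix *f)
{
	void *v;

	if(f->list != NULL) {	// reuse a freed object first
		v = f->list;
		f->list = f->list->next;
		f->inuse += f->size;
		return v;
	}
	if(f->nchunk < f->size) {	// chunk exhausted: get a new one
		f->chunk = malloc(ChunkSize);
		if(f->chunk == NULL)
			return NULL;
		f->nchunk = ChunkSize;
		f->sys += ChunkSize;
	}
	v = f->chunk;	// carve the next object off the chunk
	f->chunk += f->size;
	f->nchunk -= f->size;
	f->inuse += f->size;
	return v;
}

static void fix_free(Fix *f, void *p)
{
	Link *l = p;

	l->next = f->list;	// overwrites the first word of the object
	f->list = l;
	f->inuse -= f->size;
}
```

Chunks are never returned in this sketch; freed objects only go back onto the list for reuse.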
|
2008-12-18 16:42:28 -07:00
|
|
|
|
|
|
|
|
|
|
|
// Statistics.
|
2010-02-08 15:32:22 -07:00
|
|
|
// Shared with Go: if you edit this structure, also edit extern.go.
|
2008-12-18 16:42:28 -07:00
|
|
|
struct MStats
|
|
|
|
{
|
2011-07-18 12:52:57 -06:00
|
|
|
// General statistics.
|
2010-03-29 14:06:26 -06:00
|
|
|
uint64 alloc; // bytes allocated and still in use
|
|
|
|
uint64 total_alloc; // bytes allocated (even if freed)
|
2011-07-18 12:52:57 -06:00
|
|
|
uint64 sys; // bytes obtained from system (should be sum of xxx_sys below, no locking, approximate)
|
2010-03-29 14:06:26 -06:00
|
|
|
uint64 nlookup; // number of pointer lookups
|
|
|
|
uint64 nmalloc; // number of mallocs
|
2011-01-19 11:41:42 -07:00
|
|
|
uint64 nfree; // number of frees
|
2011-09-30 07:40:01 -06:00
|
|
|
|
2010-03-29 14:06:26 -06:00
|
|
|
// Statistics about malloc heap.
|
|
|
|
// protected by mheap.Lock
|
|
|
|
uint64 heap_alloc; // bytes allocated and still in use
|
|
|
|
uint64 heap_sys; // bytes obtained from system
|
|
|
|
uint64 heap_idle; // bytes in idle spans
|
|
|
|
uint64 heap_inuse; // bytes in non-idle spans
|
2012-02-16 11:30:04 -07:00
|
|
|
uint64 heap_released; // bytes released to the OS
|
2010-09-07 07:57:22 -06:00
|
|
|
uint64 heap_objects; // total number of allocated objects
|
2010-03-29 14:06:26 -06:00
|
|
|
|
|
|
|
// Statistics about allocation of low-level fixed-size structures.
|
|
|
|
// Protected by FixAlloc locks.
|
|
|
|
uint64 stacks_inuse; // bootstrap stacks
|
|
|
|
uint64 stacks_sys;
|
|
|
|
uint64 mspan_inuse; // MSpan structures
|
|
|
|
uint64 mspan_sys;
|
|
|
|
uint64 mcache_inuse; // MCache structures
|
|
|
|
uint64 mcache_sys;
|
2010-03-29 18:30:07 -06:00
|
|
|
uint64 buckhash_sys; // profiling bucket hash table
|
2011-09-30 07:40:01 -06:00
|
|
|
|
2010-03-29 14:06:26 -06:00
|
|
|
// Statistics about garbage collector.
|
|
|
|
// Protected by stopping the world during GC.
|
|
|
|
uint64 next_gc; // next GC (in heap_alloc time)
|
2012-02-16 11:30:04 -07:00
|
|
|
uint64 last_gc; // last GC (in absolute time)
|
2011-01-19 11:41:42 -07:00
|
|
|
uint64 pause_total_ns;
|
|
|
|
uint64 pause_ns[256];
|
2010-02-08 15:32:22 -07:00
|
|
|
uint32 numgc;
|
2009-01-26 18:37:05 -07:00
|
|
|
bool enablegc;
|
2010-02-08 15:32:22 -07:00
|
|
|
bool debuggc;
|
2011-09-30 07:40:01 -06:00
|
|
|
|
2010-03-29 14:06:26 -06:00
|
|
|
// Statistics about allocation size classes.
|
2010-02-08 15:32:22 -07:00
|
|
|
struct {
|
|
|
|
uint32 size;
|
|
|
|
uint64 nmalloc;
|
|
|
|
uint64 nfree;
|
|
|
|
} by_size[NumSizeClasses];
|
2008-12-18 16:42:28 -07:00
|
|
|
};
|
2010-02-03 17:31:34 -07:00
|
|
|
|
2012-02-06 11:16:26 -07:00
|
|
|
#define mstats runtime·memStats /* name shared with Go */
|
2008-12-18 16:42:28 -07:00
|
|
|
extern MStats mstats;
|
|
|
|
|
|
|
|
|
|
|
|
// Size classes. Computed and initialized by InitSizes.
|
|
|
|
//
|
|
|
|
// SizeToClass(0 <= n <= MaxSmallSize) returns the size class,
|
|
|
|
// 1 <= sizeclass < NumSizeClasses, for n.
|
|
|
|
// Size class 0 is reserved to mean "not small".
|
|
|
|
//
|
|
|
|
// class_to_size[i] = largest size in class i
|
|
|
|
// class_to_allocnpages[i] = number of pages to allocate when
|
2011-09-30 07:40:01 -06:00
|
|
|
// making new objects in class i
|
2008-12-18 16:42:28 -07:00
|
|
|
// class_to_transfercount[i] = number of objects to move when
|
|
|
|
// taking a bunch of objects out of the central lists
|
|
|
|
// and putting them in the thread free list.
|
|
|
|
|
2010-11-04 12:00:19 -06:00
|
|
|
int32 runtime·SizeToClass(int32);
|
|
|
|
extern int32 runtime·class_to_size[NumSizeClasses];
|
|
|
|
extern int32 runtime·class_to_allocnpages[NumSizeClasses];
|
|
|
|
extern int32 runtime·class_to_transfercount[NumSizeClasses];
|
|
|
|
extern void runtime·InitSizes(void);
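The contract described above for SizeToClass and class_to_size can be shown with a toy table. Everything concrete here is an assumption for illustration: the eight class sizes are invented, and the linear scan is a stand-in, since msize.c is not shown and may compute and look up classes differently.

```c
#include <stdint.h>

enum { NumClasses = 8, MaxSmall = 128 };	// toy values, not the runtime's

// class_to_size[i] is the largest request served by class i.
// Class 0 is reserved to mean "not small".
static const int32_t class_to_size[NumClasses] = {
	0, 8, 16, 32, 48, 64, 96, 128,
};

// Return the smallest class whose size is >= n,
// or 0 if n is outside the small-size range.
static int32_t size_to_class(int32_t n)
{
	int32_t i;

	if(n < 0 || n > MaxSmall)
		return 0;
	for(i = 1; i < NumClasses; i++)
		if(n <= class_to_size[i])
			return i;
	return 0;	// not reached: the last class equals MaxSmall
}
```

A 50-byte request in this toy table lands in the 64-byte class, so the rounding wastes 14 bytes for that object.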
|
2008-12-18 16:42:28 -07:00
|
|
|
|
|
|
|
|
|
|
|
// Per-thread (in Go, per-M) cache for small objects.
|
|
|
|
// No locking needed because it is per-thread (per-M).
|
|
|
|
typedef struct MCacheList MCacheList;
|
|
|
|
struct MCacheList
|
|
|
|
{
|
2008-12-19 04:13:39 -07:00
|
|
|
MLink *list;
|
2008-12-18 16:42:28 -07:00
|
|
|
uint32 nlist;
|
2008-12-19 04:13:39 -07:00
|
|
|
uint32 nlistmin;
|
2008-12-18 16:42:28 -07:00
|
|
|
};
|
|
|
|
|
|
|
|
struct MCache
|
|
|
|
{
|
|
|
|
MCacheList list[NumSizeClasses];
|
|
|
|
uint64 size;
|
2011-07-18 12:52:57 -06:00
|
|
|
int64 local_cachealloc; // bytes allocated (or freed) from cache since last lock of heap
|
|
|
|
int64 local_objects; // objects allocated (or freed) from cache since last lock of heap
|
2011-07-18 12:56:22 -06:00
|
|
|
int64 local_alloc; // bytes allocated (or freed) since last lock of heap
|
2011-07-18 12:52:57 -06:00
|
|
|
int64 local_total_alloc; // bytes allocated (even if freed) since last lock of heap
|
|
|
|
int64 local_nmalloc; // number of mallocs since last lock of heap
|
|
|
|
int64 local_nfree; // number of frees since last lock of heap
|
|
|
|
int64 local_nlookup; // number of pointer lookups since last lock of heap
|
2010-03-23 21:48:23 -06:00
|
|
|
int32 next_sample; // trigger heap sample after allocating this many bytes
|
2011-07-18 12:52:57 -06:00
|
|
|
// Statistics about allocation size classes since last lock of heap
|
|
|
|
struct {
|
|
|
|
int64 nmalloc;
|
|
|
|
int64 nfree;
|
|
|
|
} local_by_size[NumSizeClasses];
|
2011-09-30 07:40:01 -06:00
|
|
|
|
2008-12-18 16:42:28 -07:00
|
|
|
};
|
|
|
|
|
2010-11-04 12:00:19 -06:00
|
|
|
void* runtime·MCache_Alloc(MCache *c, int32 sizeclass, uintptr size, int32 zeroed);
|
|
|
|
void runtime·MCache_Free(MCache *c, void *p, int32 sizeclass, uintptr size);
|
|
|
|
void runtime·MCache_ReleaseAll(MCache *c);
|
2008-12-18 16:42:28 -07:00
|
|
|
|
|
|
|
// An MSpan is a run of pages.
|
|
|
|
enum
|
|
|
|
{
|
|
|
|
MSpanInUse = 0,
|
2009-01-28 16:22:16 -07:00
|
|
|
MSpanFree,
|
|
|
|
MSpanListHead,
|
|
|
|
MSpanDead,
|
2008-12-18 16:42:28 -07:00
|
|
|
};
|
|
|
|
struct MSpan
|
|
|
|
{
|
|
|
|
MSpan *next; // in a span linked list
|
|
|
|
MSpan *prev; // in a span linked list
|
2012-02-16 11:30:04 -07:00
|
|
|
MSpan *allnext; // in the list of all spans
|
2008-12-18 16:42:28 -07:00
|
|
|
PageID start; // starting page number
|
|
|
|
uintptr npages; // number of pages in span
|
2009-01-13 10:55:24 -07:00
|
|
|
MLink *freelist; // list of free objects
|
2008-12-18 16:42:28 -07:00
|
|
|
uint32 ref; // number of allocated objects in this span
|
|
|
|
uint32 sizeclass; // size class
|
2009-01-28 16:22:16 -07:00
|
|
|
uint32 state; // MSpanInUse etc
|
2012-02-16 11:30:04 -07:00
|
|
|
int64 unusedsince; // First time spotted by GC in MSpanFree state
|
|
|
|
uintptr npreleased; // number of pages released to the OS
|
|
|
|
byte *limit; // end of data in span
|
2008-12-18 16:42:28 -07:00
|
|
|
};
|
|
|
|
|
2010-11-04 12:00:19 -06:00
|
|
|
void runtime·MSpan_Init(MSpan *span, PageID start, uintptr npages);
|
2008-12-18 16:42:28 -07:00
|
|
|
|
|
|
|
// Every MSpan is in one doubly-linked list,
|
|
|
|
// either one of the MHeap's free lists or one of the
|
|
|
|
// MCentral's span lists. We use empty MSpan structures as list heads.
|
2010-11-04 12:00:19 -06:00
|
|
|
void runtime·MSpanList_Init(MSpan *list);
|
|
|
|
bool runtime·MSpanList_IsEmpty(MSpan *list);
|
|
|
|
void runtime·MSpanList_Insert(MSpan *list, MSpan *span);
|
|
|
|
void runtime·MSpanList_Remove(MSpan *span); // from whatever list it is in
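Using an empty structure as the list head makes the list circular, which is why removal needs no list argument. A minimal sketch of the four operations in standard C, assuming a reduced `Span` type with only the two link fields; the `spanlist_*` names are hypothetical and the runtime's own versions may differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct Span Span;
struct Span {
	Span *next;
	Span *prev;
};

// An empty list is a head whose links point back at itself.
static void spanlist_init(Span *list)
{
	list->next = list;
	list->prev = list;
}

static bool spanlist_isempty(Span *list)
{
	return list->next == list;
}

// Insert span at the front of list. The span must not be on any list.
static void spanlist_insert(Span *list, Span *span)
{
	span->next = list->next;
	span->prev = list;
	span->next->prev = span;
	span->prev->next = span;
}

// Unlink span from whatever list it is in.
static void spanlist_remove(Span *span)
{
	if(span->prev == NULL && span->next == NULL)
		return;	// not on a list
	span->prev->next = span->next;
	span->next->prev = span->prev;
	span->prev = NULL;
	span->next = NULL;
}
```

Because the head is itself a node, insert and remove have no special cases for the first or last element.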
|
2008-12-18 16:42:28 -07:00
|
|
|
|
|
|
|
|
|
|
|
// Central list of free objects of a given size.
|
|
|
|
struct MCentral
|
|
|
|
{
|
|
|
|
Lock;
|
|
|
|
int32 sizeclass;
|
|
|
|
MSpan nonempty;
|
|
|
|
MSpan empty;
|
|
|
|
int32 nfree;
|
|
|
|
};
|
|
|
|
|
2010-11-04 12:00:19 -06:00
|
|
|
void runtime·MCentral_Init(MCentral *c, int32 sizeclass);
|
|
|
|
int32 runtime·MCentral_AllocList(MCentral *c, int32 n, MLink **first);
|
|
|
|
void runtime·MCentral_FreeList(MCentral *c, int32 n, MLink *first);
|
2008-12-18 16:42:28 -07:00
|
|
|
|
|
|
|
// Main malloc heap.
|
|
|
|
// The heap itself is the "free[]" and "large" arrays,
|
|
|
|
// but all the other global data is here too.
|
|
|
|
struct MHeap
|
|
|
|
{
|
|
|
|
Lock;
|
|
|
|
MSpan free[MaxMHeapList]; // free lists of given length
|
|
|
|
MSpan large; // free list of spans with length >= MaxMHeapList
|
2009-01-28 16:22:16 -07:00
|
|
|
MSpan *allspans;
|
2008-12-18 16:42:28 -07:00
|
|
|
|
|
|
|
// span lookup
|
2011-01-28 13:03:26 -07:00
|
|
|
MSpan *map[1<<MHeapMap_Bits];
|
2010-02-10 01:00:12 -07:00
|
|
|
|
2009-12-03 18:22:23 -07:00
|
|
|
// range of addresses we might see in the heap
|
2011-01-28 13:03:26 -07:00
|
|
|
byte *bitmap;
|
2011-02-02 21:03:47 -07:00
|
|
|
uintptr bitmap_mapped;
|
2011-01-28 13:03:26 -07:00
|
|
|
byte *arena_start;
|
|
|
|
byte *arena_used;
|
|
|
|
byte *arena_end;
|
2011-09-30 07:40:01 -06:00
|
|
|
|
2008-12-18 16:42:28 -07:00
|
|
|
// central free lists for small size classes.
|
|
|
|
// the union makes sure that the MCentrals are
|
2011-10-06 09:42:51 -06:00
|
|
|
// spaced CacheLineSize bytes apart, so that each MCentral.Lock
|
2008-12-18 16:42:28 -07:00
|
|
|
// gets its own cache line.
|
|
|
|
union {
|
|
|
|
MCentral;
|
2011-10-06 09:42:51 -06:00
|
|
|
byte pad[CacheLineSize];
|
2008-12-18 16:42:28 -07:00
|
|
|
} central[NumSizeClasses];
|
|
|
|
|
|
|
|
FixAlloc spanalloc; // allocator for MSpan*
|
|
|
|
FixAlloc cachealloc; // allocator for MCache*
|
|
|
|
};
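The union trick used for central[] can be checked in isolation. This sketch assumes a 64-byte CacheLineSize and a small stand-in `Central` struct, both hypothetical; it names the union member `c` because standard C does not accept the unnamed `MCentral;` member form used above.

```c
#include <stddef.h>

enum { CacheLineSize = 64, NumClasses = 4 };	// assumed values

typedef struct Central Central;
struct Central {
	int lock;	// stand-in for the real Lock
	int sizeclass;
	int nfree;
};

// The pad member forces each array element to occupy
// at least CacheLineSize bytes.
typedef union PaddedCentral PaddedCentral;
union PaddedCentral {
	Central c;
	char pad[CacheLineSize];
};

static PaddedCentral central[NumClasses];

// Distance in bytes between the locks of two adjacent entries.
static size_t lock_stride(void)
{
	return (size_t)((char*)&central[1].c.lock - (char*)&central[0].c.lock);
}
```

The padding only fixes the spacing between entries; it does not by itself align the start of the array to a cache line, and it has no effect if the struct is already larger than the pad.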
|
2010-11-04 12:00:19 -06:00
|
|
|
extern MHeap runtime·mheap;
|
2008-12-18 16:42:28 -07:00
|
|
|
|
2010-11-04 12:00:19 -06:00
|
|
|
void runtime·MHeap_Init(MHeap *h, void *(*allocator)(uintptr));
|
|
|
|
MSpan* runtime·MHeap_Alloc(MHeap *h, uintptr npage, int32 sizeclass, int32 acct);
|
|
|
|
void runtime·MHeap_Free(MHeap *h, MSpan *s, int32 acct);
|
2011-01-28 13:03:26 -07:00
|
|
|
MSpan* runtime·MHeap_Lookup(MHeap *h, void *v);
|
|
|
|
MSpan* runtime·MHeap_LookupMaybe(MHeap *h, void *v);
|
2011-02-02 21:03:47 -07:00
|
|
|
void runtime·MGetSizeClassInfo(int32 sizeclass, uintptr *size, int32 *npages, int32 *nobj);
|
2011-01-28 13:03:26 -07:00
|
|
|
void* runtime·MHeap_SysAlloc(MHeap *h, uintptr n);
|
2011-02-02 21:03:47 -07:00
|
|
|
void runtime·MHeap_MapBits(MHeap *h);
|
2012-02-16 11:30:04 -07:00
|
|
|
void runtime·MHeap_Scavenger(void);
|
2009-01-26 18:37:05 -07:00
|
|
|
|
2010-11-04 12:00:19 -06:00
|
|
|
void* runtime·mallocgc(uintptr size, uint32 flag, int32 dogc, int32 zeroed);
|
2011-02-02 21:03:47 -07:00
|
|
|
int32 runtime·mlookup(void *v, byte **base, uintptr *size, MSpan **s);
|
2010-11-04 12:00:19 -06:00
|
|
|
void runtime·gc(int32 force);
|
2011-02-02 21:03:47 -07:00
|
|
|
void runtime·markallocated(void *v, uintptr n, bool noptr);
|
|
|
|
void runtime·checkallocated(void *v, uintptr n);
|
|
|
|
void runtime·markfreed(void *v, uintptr n);
|
|
|
|
void runtime·checkfreed(void *v, uintptr n);
|
|
|
|
int32 runtime·checking;
|
|
|
|
void runtime·markspan(void *v, uintptr size, uintptr n, bool leftover);
|
|
|
|
void runtime·unmarkspan(void *v, uintptr size);
|
|
|
|
bool runtime·blockspecial(void*);
|
2011-10-06 09:42:51 -06:00
|
|
|
void runtime·setblockspecial(void*, bool);
|
2011-07-18 12:52:57 -06:00
|
|
|
void runtime·purgecachedstats(M*);
|
2009-01-26 18:37:05 -07:00
|
|
|
|
|
|
|
enum
|
|
|
|
{
|
2011-02-02 21:03:47 -07:00
|
|
|
// flags to malloc
|
|
|
|
FlagNoPointers = 1<<0, // no pointers here
|
|
|
|
FlagNoProfiling = 1<<1, // must not profile
|
|
|
|
FlagNoGC = 1<<2, // must not free or scan for pointers
|
2009-01-26 18:37:05 -07:00
|
|
|
};
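Because each malloc flag occupies its own bit, callers can OR several together into the `flag` argument and the allocator can test them independently. The sketch below assumes only the three enum values and their comments shown above; `has_flag`, `needs_scan`, and `demo` are hypothetical helpers written for illustration and are not runtime functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit assignments as the enum above. */
enum {
	FlagNoPointers  = 1<<0, /* no pointers here */
	FlagNoProfiling = 1<<1, /* must not profile */
	FlagNoGC        = 1<<2, /* must not free or scan for pointers */
};

/* Reports whether a given bit is set in a combined flag word. */
static bool has_flag(uint32_t flag, uint32_t bit)
{
	return (flag & bit) != 0;
}

/*
 * Hypothetical policy derived from the flag comments: a block needs
 * scanning unless it holds no pointers or is exempt from the collector.
 */
static bool needs_scan(uint32_t flag)
{
	return !has_flag(flag, FlagNoPointers) && !has_flag(flag, FlagNoGC);
}

/* Flags combine with bitwise OR and are tested one bit at a time. */
static void demo(void)
{
	uint32_t flag = FlagNoPointers | FlagNoProfiling;

	assert(has_flag(flag, FlagNoProfiling));
	assert(!has_flag(flag, FlagNoGC));
	assert(!needs_scan(flag));
}
```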
|
2010-03-23 21:48:23 -06:00
|
|
|
|
2010-11-04 12:00:19 -06:00
|
|
|
void runtime·MProf_Malloc(void*, uintptr);
|
|
|
|
void runtime·MProf_Free(void*, uintptr);
|
2012-02-22 19:45:01 -07:00
|
|
|
void runtime·MProf_GC(void);
|
runtime: fix spurious deadlock reporting
Fixes #2337.
The unfortunate sequence of events is:
1. maxcpu=2, mcpu=1, grunning=1
2. starttheworld creates an extra M:
   maxcpu=2, mcpu=2, grunning=1
4. the goroutine calls runtime.GOMAXPROCS(1):
   maxcpu=1, mcpu=2, grunning=1
5. since it sees mcpu>maxcpu, it calls gosched()
6. schedule() deschedules the goroutine:
   maxcpu=1, mcpu=1, grunning=0
7. schedule() calls getnextandunlock(), which
   fails to pick up the goroutine again,
   because canaddcpu() fails, because mcpu==maxcpu
8. then it sees that grunning==0,
   reports a deadlock, and terminates
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5191044
2011-10-06 09:10:14 -06:00
|
|
|
int32 runtime·helpgc(bool*);
|
2011-09-30 07:40:01 -06:00
|
|
|
void runtime·gchelper(void);
|
2010-03-23 21:48:23 -06:00
|
|
|
|
2011-10-06 09:42:51 -06:00
|
|
|
bool runtime·getfinalizer(void *p, bool del, void (**fn)(void*), int32 *nret);
|
|
|
|
void runtime·walkfintab(void (*fn)(void*));
|