1
0
mirror of https://github.com/golang/go synced 2024-10-04 10:21:21 -06:00
Commit Graph

270 Commits

Author SHA1 Message Date
Carl Shapiro
f056daf075 cmd/5g, cmd/5l, cmd/6g, cmd/6l, cmd/8g, cmd/8l, cmd/gc, runtime: generate pointer maps by liveness analysis
This change allows the garbage collector to examine stack
slots that are determined as live and containing a pointer
value by the garbage collector.  This results in a mean
reduction of 65% in the number of stack slots scanned during
an invocation of "GOGC=1 all.bash".

Unfortunately, this does not yet allow garbage collection to
be precise for the stack slots computed as live.  Pointers
confound the determination of what definitions reach a given
instruction.  In general, this problem is not solvable without
runtime cost but some advanced cooperation from the compiler
might mitigate common cases.

R=golang-dev, rsc, cshapiro
CC=golang-dev
https://golang.org/cl/14430048
2013-12-05 17:35:22 -08:00
Carl Shapiro
0368a7ceb6 runtime: move stack scanning into the parallel mark phase
This change reduces the cost of the stack scanning by frames.
It moves the stack scanning from the serial root enumeration
phase to the parallel tracing phase.  The output that follows
are timings for the issue 6482 benchmark

Baseline

BenchmarkGoroutineSelect	      50	 108027405 ns/op
BenchmarkGoroutineBlocking	      50	  89573332 ns/op
BenchmarkGoroutineForRange	      20	  95614116 ns/op
BenchmarkGoroutineIdle		      20	 122809512 ns/op

Stack scan by frames, non-parallel

BenchmarkGoroutineSelect	      20	 297138929 ns/op
BenchmarkGoroutineBlocking	      20	 301137599 ns/op
BenchmarkGoroutineForRange	      10	 312499469 ns/op
BenchmarkGoroutineIdle		      10	 209428876 ns/op

Stack scan by frames, parallel

BenchmarkGoroutineSelect	      20	 183938431 ns/op
BenchmarkGoroutineBlocking	      20	 170109999 ns/op
BenchmarkGoroutineForRange	      20	 179628882 ns/op
BenchmarkGoroutineIdle		      20	 157541498 ns/op

The remaining performance disparity is due to inefficiencies
in gentraceback and its callees.  The effect was isolated by
using a parallel stack scan where scanstack was modified to do
a conservative scan of the stack segments without gentraceback
followed by a call of gentrackback with a no-op callback.

The output that follows are the top-10 most frequent tops of
stacks as determined by the Linux perf record facility.

Baseline

+  25.19%  gc.test  gc.test            [.] runtime.xchg
+  19.00%  gc.test  gc.test            [.] scanblock
+   8.53%  gc.test  gc.test            [.] scanstack
+   8.46%  gc.test  gc.test            [.] flushptrbuf
+   5.08%  gc.test  gc.test            [.] procresize
+   3.57%  gc.test  gc.test            [.] runtime.chanrecv
+   2.94%  gc.test  gc.test            [.] dequeue
+   2.74%  gc.test  gc.test            [.] addroots
+   2.25%  gc.test  gc.test            [.] runtime.ready
+   1.33%  gc.test  gc.test            [.] runtime.cas64

Gentraceback

+  18.12%  gc.test  gc.test             [.] runtime.xchg
+  14.68%  gc.test  gc.test             [.] scanblock
+   8.20%  gc.test  gc.test             [.] runtime.gentraceback
+   7.38%  gc.test  gc.test             [.] flushptrbuf
+   6.84%  gc.test  gc.test             [.] scanstack
+   5.92%  gc.test  gc.test             [.] runtime.findfunc
+   3.62%  gc.test  gc.test             [.] procresize
+   3.15%  gc.test  gc.test             [.] readvarint
+   1.92%  gc.test  gc.test             [.] addroots
+   1.87%  gc.test  gc.test             [.] runtime.chanrecv

R=golang-dev, dvyukov, rsc
CC=golang-dev
https://golang.org/cl/17410043
2013-12-03 14:12:55 -08:00
Keith Randall
139cc96a57 runtime: markfreed's error reports should be prefixed with "markfreed", not "markallocated".
R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/14441055
2013-10-09 13:28:47 -07:00
Russ Cox
c3dadca977 runtime: do not scan stack by frames during garbage collection
Walking the stack by frames is ~3x more expensive
than not, and since it didn't end up being precise,
there is not enough benefit to outweigh the cost.

This is the conservative choice: this CL makes the
stack scanning behavior the same as it was in Go 1.1.

Add benchmarks to package runtime so that we have
them when we re-enable this feature during the
Go 1.3 development.

benchmark                     old ns/op    new ns/op    delta
BenchmarkGoroutineSelect        3194909      1272092  -60.18%
BenchmarkGoroutineBlocking      3120282       866366  -72.23%
BenchmarkGoroutineForRange      3256179       939902  -71.13%
BenchmarkGoroutineIdle          2005571       482982  -75.92%

The Go 1 benchmarks, just to add more data.
As far as I can tell the changes are mainly noise.

benchmark                         old ns/op    new ns/op    delta
BenchmarkBinaryTree17            4409403046   4414734932   +0.12%
BenchmarkFannkuch11              3407708965   3378306120   -0.86%
BenchmarkFmtFprintfEmpty                100           99   -0.60%
BenchmarkFmtFprintfString               242          239   -1.24%
BenchmarkFmtFprintfInt                  204          206   +0.98%
BenchmarkFmtFprintfIntInt               320          316   -1.25%
BenchmarkFmtFprintfPrefixedInt          295          299   +1.36%
BenchmarkFmtFprintfFloat                442          435   -1.58%
BenchmarkFmtManyArgs                   1246         1216   -2.41%
BenchmarkGobDecode                 10186951     10051210   -1.33%
BenchmarkGobEncode                 16504381     16445650   -0.36%
BenchmarkGzip                     447030885    447056865   +0.01%
BenchmarkGunzip                   111056154    111696305   +0.58%
BenchmarkHTTPClientServer             89973        93040   +3.41%
BenchmarkJSONEncode                28174182     27933893   -0.85%
BenchmarkJSONDecode               106353777    110443817   +3.85%
BenchmarkMandelbrot200              4822289      4806083   -0.34%
BenchmarkGoParse                    6102436      6142734   +0.66%
BenchmarkRegexpMatchEasy0_32            133          132   -0.75%
BenchmarkRegexpMatchEasy0_1K            372          373   +0.27%
BenchmarkRegexpMatchEasy1_32            113          111   -1.77%
BenchmarkRegexpMatchEasy1_1K            964          940   -2.49%
BenchmarkRegexpMatchMedium_32           202          205   +1.49%
BenchmarkRegexpMatchMedium_1K         68862        68858   -0.01%
BenchmarkRegexpMatchHard_32            3480         3407   -2.10%
BenchmarkRegexpMatchHard_1K          108255       112614   +4.03%
BenchmarkRevcomp                  751393035    743929976   -0.99%
BenchmarkTemplate                 139637041    135402220   -3.03%
BenchmarkTimeParse                      479          475   -0.84%
BenchmarkTimeFormat                     460          466   +1.30%

benchmark                          old MB/s     new MB/s  speedup
BenchmarkGobDecode                    75.34        76.36    1.01x
BenchmarkGobEncode                    46.50        46.67    1.00x
BenchmarkGzip                         43.41        43.41    1.00x
BenchmarkGunzip                      174.73       173.73    0.99x
BenchmarkJSONEncode                   68.87        69.47    1.01x
BenchmarkJSONDecode                   18.25        17.57    0.96x
BenchmarkGoParse                       9.49         9.43    0.99x
BenchmarkRegexpMatchEasy0_32         239.58       241.74    1.01x
BenchmarkRegexpMatchEasy0_1K        2749.74      2738.00    1.00x
BenchmarkRegexpMatchEasy1_32         282.49       286.32    1.01x
BenchmarkRegexpMatchEasy1_1K        1062.00      1088.96    1.03x
BenchmarkRegexpMatchMedium_32          4.93         4.86    0.99x
BenchmarkRegexpMatchMedium_1K         14.87        14.87    1.00x
BenchmarkRegexpMatchHard_32            9.19         9.39    1.02x
BenchmarkRegexpMatchHard_1K            9.46         9.09    0.96x
BenchmarkRevcomp                     338.26       341.65    1.01x
BenchmarkTemplate                     13.90        14.33    1.03x

Fixes #6482.

R=golang-dev, dave, r
CC=golang-dev
https://golang.org/cl/14257043
2013-10-02 11:59:53 -04:00
Russ Cox
30ecb4cd05 build: disable precise collection of stack frames
The code for call site-specific pointer bitmaps was not ready in time,
but the zeroing required without it is too expensive to use by default.
We will have to wait for precise collection of stack frames until Go 1.3.

The precise collection can be re-enabled by

        GOEXPERIMENT=precisestack ./all.bash

but that will not be the default for a Go 1.2 build.

Fixes #6087.

R=golang-dev, jeremyjackins, dan.kortschak, r
CC=golang-dev
https://golang.org/cl/13677045
2013-09-16 20:26:10 -04:00
Dmitriy Vyukov
a33ef8d11b runtime: account for all sys memory in MemStats
Currently lots of sys allocations are not accounted in any of XxxSys,
including GC bitmap, spans table, GC roots blocks, GC finalizer blocks,
iface table, netpoll descriptors and more. Up to ~20% can unaccounted.
This change introduces 2 new stats: GCSys and OtherSys for GC metadata
and all other misc allocations, respectively.
Also ensures that all XxxSys indeed sum up to Sys. All sys memory allocation
functions require the stat for accounting, so that it's impossible to miss something.
Also fix updating of mcache_sys/inuse, they were not updated after deallocation.

test/bench/garbage/parser before:
Sys		670064344
HeapSys		610271232
StackSys	65536
MSpanSys	14204928
MCacheSys	16384
BuckHashSys	1439992

after:
Sys		670064344
HeapSys		610271232
StackSys	65536
MSpanSys	14188544
MCacheSys	16384
BuckHashSys	3194304
GCSys		39198688
OtherSys	3129656

Fixes #5799.

R=rsc, dave, alex.brainman
CC=golang-dev
https://golang.org/cl/12946043
2013-09-06 16:55:40 -04:00
Keith Randall
23f9751e83 runtime: clean up map code. Remove hashmap.h.
Use cnew/cnewarray instead of mallocgc.

R=golang-dev, dvyukov
CC=golang-dev
https://golang.org/cl/13396045
2013-08-31 14:09:34 -07:00
Keith Randall
fb376021be runtime: record type information for hashtable internal structures.
Remove all hashtable-specific GC code.

Fixes bug 6119.

R=cshapiro, dvyukov, khr
CC=golang-dev
https://golang.org/cl/13078044
2013-08-31 09:09:50 -07:00
Carl Shapiro
c51152f438 runtime: check bitmap word for allocated bit in markonly
When searching for an allocated bit, flushptrbuf would search
backward in the bitmap word containing the bit of pointer
being looked-up before searching the span.  This extra check
was not replicated in markonly which, instead, after not
finding an allocated bit for a pointer would directly look in
the span.

Using statistics generated from godoc, before this change span
lookups were, on average, more common than word lookups.  It
was common for markonly to consult spans for one third of its
pointer lookups.  With this change in place, what were
previously span lookups are overwhelmingly become by the word
lookups making the total number of span lookups a relatively
small fraction of the whole.

This change also introduces some statistics gathering about
lookups guarded by the CollectStats enum.

R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/13311043
2013-08-29 13:52:38 -07:00
Keith Randall
ed467db6d8 cmd/cc,runtime: change preprocessor to expand macros inside of
#pragma textflag and #pragma dataflag directives.
Update dataflag directives to use symbols instead of integer constants.

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/13310043
2013-08-29 12:36:59 -07:00
Keith Randall
d0dd420a24 runtime: rename FlagNoPointers to FlagNoScan. Better represents
the use of the flag, especially for objects which actually do have
pointers but we don't want the GC to scan them.

R=golang-dev, cshapiro
CC=golang-dev
https://golang.org/cl/13181045
2013-08-23 17:28:47 -07:00
Dmitriy Vyukov
dfdd1ba028 runtime: do not trigger GC on g0
GC acquires worldsema, which is a goroutine-level semaphore
which parks goroutines. g0 can not be parked.
Fixes #6193.

R=khr, khr
CC=golang-dev
https://golang.org/cl/12880045
2013-08-22 02:17:45 +04:00
Carl Shapiro
87fdb8fb9a undo CL 13010045 / 04f8101b46dd
Update the original change but do not read interface types in
the arguments area.  Once the arguments area is zeroed as the
locals area is we can safely read interface type values there
too.

««« original CL description
undo CL 12785045 / 71ce80dc4195

This has broken the 32-bit builds.

««« original CL description
cmd/gc, runtime: use type information to scan interface values

R=golang-dev, rsc, dvyukov
CC=golang-dev
https://golang.org/cl/12785045
»»»

R=khr, golang-dev, khr
CC=golang-dev
https://golang.org/cl/13010045
»»»

R=khr, khr
CC=golang-dev
https://golang.org/cl/13073045
2013-08-21 13:51:00 -07:00
Carl Shapiro
ca2d32b46d undo CL 12785045 / 71ce80dc4195
This has broken the 32-bit builds.

««« original CL description
cmd/gc, runtime: use type information to scan interface values

R=golang-dev, rsc, dvyukov
CC=golang-dev
https://golang.org/cl/12785045
»»»

R=khr, golang-dev, khr
CC=golang-dev
https://golang.org/cl/13010045
2013-08-19 14:16:55 -07:00
Keith Randall
6401e0f83f runtime: don't run finalizers if we're still on the g0 stack.
R=golang-dev, rsc, dvyukov, khr
CC=golang-dev
https://golang.org/cl/11386044
2013-08-19 12:20:50 -07:00
Carl Shapiro
21ea5103a4 cmd/gc, runtime: use type information to scan interface values
R=golang-dev, rsc, dvyukov
CC=golang-dev
https://golang.org/cl/12785045
2013-08-19 10:19:59 -07:00
Russ Cox
5822e7848a runtime: make SetFinalizer(x, f) accept any f for which f(x) is valid
Originally the requirement was f(x) where f's argument is
exactly x's type.

CL 11858043 relaxed the requirement in a non-standard
way: f's argument must be exactly x's type or interface{}.

If we're going to relax the requirement, it should be done
in a way consistent with the rest of Go. This CL allows f's
argument to have any type for which x is assignable;
that's the same requirement the compiler would impose
if compiling f(x) directly.

Fixes #5368.

R=dvyukov, bradfitz, pieter
CC=golang-dev
https://golang.org/cl/12895043
2013-08-14 14:54:31 -04:00
Carl Shapiro
abc516e420 cmd/cc, cmd/gc, runtime: Uniquely encode iface and eface pointers in the pointer map.
Prior to this change, pointer maps encoded the disposition of
a word using a single bit.  A zero signaled a non-pointer
value and a one signaled a pointer value.  Interface values,
which are a effectively a union type, were conservatively
labeled as a pointer.

This change widens the logical element size of the pointer map
to two bits per word.  As before, zero signals a non-pointer
value and one signals a pointer value.  Additionally, a two
signals an iface pointer and a three signals an eface pointer.

Following other changes to the runtime, values two and three
will allow a type information to drive interpretation of the
subsequent word so only those interface values containing a
pointer value will be scanned.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12689046
2013-08-09 16:48:12 -07:00
Dmitriy Vyukov
23e15f7253 net: add special netFD mutex
The mutex, fdMutex, handles locking and lifetime of sysfd,
and serializes Read and Write methods.
This allows to strip 2 sync.Mutex.Lock calls,
2 sync.Mutex.Unlock calls, 1 defer and some amount
of misc overhead from every network operation.

On linux/amd64, Intel E5-2690:
benchmark                             old ns/op    new ns/op    delta
BenchmarkTCP4Persistent                    9595         9454   -1.47%
BenchmarkTCP4Persistent-2                  8978         8772   -2.29%
BenchmarkTCP4ConcurrentReadWrite           4900         4625   -5.61%
BenchmarkTCP4ConcurrentReadWrite-2         2603         2500   -3.96%

In general it strips 70-500 ns from every network operation depending
on processor model. On my relatively new E5-2690 it accounts to ~5%
of network op cost.

Fixes #6074.

R=golang-dev, bradfitz, alex.brainman, iant, mikioh.mikioh
CC=golang-dev
https://golang.org/cl/12418043
2013-08-09 21:43:00 +04:00
Carl Shapiro
73c93a404c cmd/cc, cmd/gc, runtime: emit bitmaps for scanning locals.
Previously, all word aligned locations in the local variables
area were scanned as conservative roots.  With this change, a
bitmap is generated describing the locations of pointer values
in local variables.

With this change the argument bitmap information has been
changed to only store information about arguments.  The locals
member, has been removed.  In its place, the bitmap data for
local variables is now used to store the size of locals.  If
the size is negative, the magnitude indicates the size of the
local variables area.

R=rsc
CC=golang-dev
https://golang.org/cl/12328044
2013-08-07 12:47:01 -07:00
Dmitriy Vyukov
9c0500b466 runtime: use gcpc/gcsp during traceback of goroutines in syscalls
gcpc/gcsp are used by GC in similar situation.
gcpc/gcsp are also more stable than gp->sched,
because gp->sched is mutated by entersyscall/exitsyscall
in morestack and mcall. So it has higher chances of being inconsistent.
Also, rename gcpc/gcsp to syscallpc/syscallsp.

This is the same as reverted change 12250043
with save marked as textflag 7.
The problem was that if save calls morestack,
then subsequent lessstack spoils g->sched.pc/sp.
And that bad values were remembered in g->syscallpc/sp.
Entersyscallblock had the same problem,
but it was never triggered to date.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12478043
2013-08-06 13:38:44 +04:00
Dmitriy Vyukov
f38ff9e5ea undo CL 12250043 / e911f94c4902
Break all 386 builders.

««« original CL description
runtime: use gcpc/gcsp during traceback of goroutines in syscalls
gcpc/gcsp are used by GC in similar situation.
gcpc/gcsp are also more stable than gp->sched,
because gp->sched is mutated by entersyscall/exitsyscall
in morestack and mcall. So it has higher chances of being inconsistent.
Also, rename gcpc/gcsp to syscallpc/syscallsp.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12250043
»»»

R=rsc
CC=golang-dev
https://golang.org/cl/12424045
2013-08-05 23:33:50 +04:00
Dmitriy Vyukov
d5ab784611 runtime: remove singleproc var
It was needed for the old scheduler,
because there temporary could be more threads than gomaxprocs.
In the new scheduler gomaxprocs is always respected.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12438043
2013-08-05 22:58:02 +04:00
Dmitriy Vyukov
f73972fa33 runtime: use gcpc/gcsp during traceback of goroutines in syscalls
gcpc/gcsp are used by GC in similar situation.
gcpc/gcsp are also more stable than gp->sched,
because gp->sched is mutated by entersyscall/exitsyscall
in morestack and mcall. So it has higher chances of being inconsistent.
Also, rename gcpc/gcsp to syscallpc/syscallsp.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12250043
2013-08-05 22:55:54 +04:00
Dmitriy Vyukov
0a904a3f2e runtime: remove dead code
Remove dead code related to allocation of type metadata with SysAlloc.

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/12311045
2013-08-04 23:32:06 +04:00
Pieter Droogendijk
6350e45892 runtime: allow SetFinalizer with a func(interface{})
Fixes #5368.

R=golang-dev, dvyukov
CC=golang-dev, rsc
https://golang.org/cl/11858043
2013-07-29 19:43:08 +04:00
Dmitriy Vyukov
f8a850b250 runtime: refactor mallocgc
Make it accept type, combine flags.
Several reasons for the change:
1. mallocgc and settype must be atomic wrt GC
2. settype is called from only one place now
3. it will help performance (eventually settype
functionality must be combined with markallocated)
4. flags are easier to read now (no mallocgc(sz, 0, 1, 0) anymore)

R=golang-dev, iant, nightlyone, rsc, dave, khr, bradfitz, r
CC=golang-dev
https://golang.org/cl/10136043
2013-07-26 21:17:24 +04:00
Russ Cox
48769bf546 runtime: use funcdata to supply garbage collection information
This CL introduces a FUNCDATA number for runtime-specific
garbage collection metadata, changes the C and Go compilers
to emit that metadata, and changes the runtime to expect it.

The old pseudo-instructions that carried this information
are gone, as is the linker code to process them.

R=golang-dev, dvyukov, cshapiro
CC=golang-dev
https://golang.org/cl/11406044
2013-07-19 16:04:09 -04:00
Dmitriy Vyukov
eb04df75cd runtime: prevent GC from seeing the contents of a frame in runfinq
This holds the last finalized object and arguments to its finalizer.
Fixes #5348.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/11454044
2013-07-19 18:01:33 +04:00
Russ Cox
c3de91bb15 cmd/ld, runtime: use new contiguous pcln table
R=golang-dev, r, dave
CC=golang-dev
https://golang.org/cl/11494043
2013-07-18 10:43:22 -04:00
Dmitriy Vyukov
5887f142a3 runtime: more reliable preemption
Currently preemption signal g->stackguard0==StackPreempt
can be lost if it is received when preemption is disabled
(e.g. m->lock!=0). This change duplicates the preemption
signal in g->preempt and restores g->stackguard0
when preemption is enabled.
Update #543.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/10792043
2013-07-17 12:52:37 -04:00
Russ Cox
5d363c6357 cmd/ld, runtime: new in-memory symbol table format
Design at http://golang.org/s/go12symtab.

This enables some cleanup of the garbage collector metadata
that will be done in future CLs.

This CL does not move the old symtab and pclntab back into
an unmapped section of the file. That's a bit tricky and will be
done separately.

Fixes #4020.

R=golang-dev, dave, cshapiro, iant, r
CC=golang-dev, nigeltao
https://golang.org/cl/11085043
2013-07-16 09:41:38 -04:00
Dmitriy Vyukov
4b536a550f runtime: introduce GODEBUG env var
Currently it replaces GOGCTRACE env var (GODEBUG=gctrace=1).
The plan is to extend it with other type of debug tracing,
e.g. GODEBUG=gctrace=1,schedtrace=100.

R=rsc
CC=bradfitz, daniel.morsing, gobot, golang-dev
https://golang.org/cl/10026045
2013-06-28 18:37:06 +04:00
Russ Cox
6fa3c89b77 runtime: record proper goroutine state during stack split
Until now, the goroutine state has been scattered during the
execution of newstack and oldstack. It's all there, and those routines
know how to get back to a working goroutine, but other pieces of
the system, like stack traces, do not. If something does interrupt
the newstack or oldstack execution, the rest of the system can't
understand the goroutine. For example, if newstack decides there
is an overflow and calls throw, the stack tracer wouldn't dump the
goroutine correctly.

For newstack to save a useful state snapshot, it needs to be able
to rewind the PC in the function that triggered the split back to
the beginning of the function. (The PC is a few instructions in, just
after the call to morestack.) To make that possible, we change the
prologues to insert a jmp back to the beginning of the function
after the call to morestack. That is, the prologue used to be roughly:

        TEXT myfunc
                check for split
                jmpcond nosplit
                call morestack
        nosplit:
                sub $xxx, sp

Now an extra instruction is inserted after the call:

        TEXT myfunc
        start:
                check for split
                jmpcond nosplit
                call morestack
                jmp start
        nosplit:
                sub $xxx, sp

The jmp is not executed directly. It is decoded and simulated by
runtime.rewindmorestack to discover the beginning of the function,
and then the call to morestack returns directly to the start label
instead of to the jump instruction. So logically the jmp is still
executed, just not by the cpu.

The prologue thus repeats in the case of a function that needs a
stack split, but against the cost of the split itself, the extra few
instructions are noise. The repeated prologue has the nice effect of
making a stack split double-check that the new stack is big enough:
if morestack happens to return on a too-small stack, we'll now notice
before corruption happens.

The ability for newstack to rewind to the beginning of the function
should help preemption too. If newstack decides that it was called
for preemption instead of a stack split, it now has the goroutine state
correctly paused if rescheduling is needed, and when the goroutine
can run again, it can return to the start label on its original stack
and re-execute the split check.

Here is an example of a split stack overflow showing the full
trace, without any special cases in the stack printer.
(This one was triggered by making the split check incorrect.)

runtime: newstack framesize=0x0 argsize=0x18 sp=0x6aebd0 stack=[0x6b0000, 0x6b0fa0]
        morebuf={pc:0x69f5b sp:0x6aebd8 lr:0x0}
        sched={pc:0x68880 sp:0x6aebd0 lr:0x0 ctxt:0x34e700}
runtime: split stack overflow: 0x6aebd0 < 0x6b0000
fatal error: runtime: split stack overflow

goroutine 1 [stack split]:
runtime.mallocgc(0x290, 0x100000000, 0x1)
        /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:21 fp=0x6aebd8
runtime.new()
        /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:682 +0x5b fp=0x6aec08
go/build.(*Context).Import(0x5ae340, 0xc210030c71, 0xa, 0xc2100b4380, 0x1b, ...)
        /Users/rsc/g/go/src/pkg/go/build/build.go:424 +0x3a fp=0x6b00a0
main.loadImport(0xc210030c71, 0xa, 0xc2100b4380, 0x1b, 0xc2100b42c0, ...)
        /Users/rsc/g/go/src/cmd/go/pkg.go:249 +0x371 fp=0x6b01a8
main.(*Package).load(0xc21017c800, 0xc2100b42c0, 0xc2101828c0, 0x0, 0x0, ...)
        /Users/rsc/g/go/src/cmd/go/pkg.go:431 +0x2801 fp=0x6b0c98
main.loadPackage(0x369040, 0x7, 0xc2100b42c0, 0x0)
        /Users/rsc/g/go/src/cmd/go/pkg.go:709 +0x857 fp=0x6b0f80
----- stack segment boundary -----
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc2100e6c00, 0xc2100e5750, ...)
        /Users/rsc/g/go/src/cmd/go/build.go:539 +0x437 fp=0x6b14a0
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc21015b400, 0x2, ...)
        /Users/rsc/g/go/src/cmd/go/build.go:528 +0x1d2 fp=0x6b1658
main.(*builder).test(0xc2100902a0, 0xc210092000, 0x0, 0x0, 0xc21008ff60, ...)
        /Users/rsc/g/go/src/cmd/go/test.go:622 +0x1b53 fp=0x6b1f68
----- stack segment boundary -----
main.runTest(0x5a6b20, 0xc21000a020, 0x2, 0x2)
        /Users/rsc/g/go/src/cmd/go/test.go:366 +0xd09 fp=0x6a5cf0
main.main()
        /Users/rsc/g/go/src/cmd/go/main.go:161 +0x4f9 fp=0x6a5f78
runtime.main()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:183 +0x92 fp=0x6a5fa0
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1266 fp=0x6a5fa8

And here is a seg fault during oldstack:

SIGSEGV: segmentation violation
PC=0x1b2a6

runtime.oldstack()
        /Users/rsc/g/go/src/pkg/runtime/stack.c:159 +0x76
runtime.lessstack()
        /Users/rsc/g/go/src/pkg/runtime/asm_amd64.s:270 +0x22

goroutine 1 [stack unsplit]:
fmt.(*pp).printArg(0x2102e64e0, 0xe5c80, 0x2102c9220, 0x73, 0x0, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:818 +0x3d3 fp=0x221031e6f8
fmt.(*pp).doPrintf(0x2102e64e0, 0x12fb20, 0x2, 0x221031eb98, 0x1, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:1183 +0x15cb fp=0x221031eaf0
fmt.Sprintf(0x12fb20, 0x2, 0x221031eb98, 0x1, 0x1, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:234 +0x67 fp=0x221031eb40
flag.(*stringValue).String(0x2102c9210, 0x1, 0x0)
        /Users/rsc/g/go/src/pkg/flag/flag.go:180 +0xb3 fp=0x221031ebb0
flag.(*FlagSet).Var(0x2102f6000, 0x293d38, 0x2102c9210, 0x143490, 0xa, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:633 +0x40 fp=0x221031eca0
flag.(*FlagSet).StringVar(0x2102f6000, 0x2102c9210, 0x143490, 0xa, 0x12fa60, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:550 +0x91 fp=0x221031ece8
flag.(*FlagSet).String(0x2102f6000, 0x143490, 0xa, 0x12fa60, 0x0, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:563 +0x87 fp=0x221031ed38
flag.String(0x143490, 0xa, 0x12fa60, 0x0, 0x161950, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:570 +0x6b fp=0x221031ed80
testing.init()
        /Users/rsc/g/go/src/pkg/testing/testing.go:-531 +0xbb fp=0x221031edc0
strings_test.init()
        /Users/rsc/g/go/src/pkg/strings/strings_test.go:1115 +0x62 fp=0x221031ef70
main.init()
        strings/_test/_testmain.go:90 +0x3d fp=0x221031ef78
runtime.main()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:180 +0x8a fp=0x221031efa0
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1269 fp=0x221031efa8

goroutine 2 [runnable]:
runtime.MHeap_Scavenger()
        /Users/rsc/g/go/src/pkg/runtime/mheap.c:438
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1269
created by runtime.main
        /Users/rsc/g/go/src/pkg/runtime/proc.c:166

rax     0x23ccc0
rbx     0x23ccc0
rcx     0x0
rdx     0x38
rdi     0x2102c0170
rsi     0x221032cfe0
rbp     0x221032cfa0
rsp     0x7fff5fbff5b0
r8      0x2102c0120
r9      0x221032cfa0
r10     0x221032c000
r11     0x104ce8
r12     0xe5c80
r13     0x1be82baac718
r14     0x13091135f7d69200
r15     0x0
rip     0x1b2a6
rflags  0x10246
cs      0x2b
fs      0x0
gs      0x0

Fixes #5723.

R=r, dvyukov, go.peter.90, dave, iant
CC=golang-dev
https://golang.org/cl/10360048
2013-06-27 11:32:01 -04:00
Dmitriy Vyukov
94dc963b55 runtime: fix race condition between GC and setGCPercent
If first GC runs concurrently with setGCPercent,
it can overwrite gcpercent value with default.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/10242047
2013-06-15 16:07:06 +04:00
Keith Randall
de316388a7 runtime: garbage collector runs on g0 now.
No need to change to Grunnable state.
Add some more checks for Grunning state.

R=golang-dev, rsc, khr, dvyukov
CC=golang-dev
https://golang.org/cl/10186045
2013-06-14 11:42:51 -07:00
Russ Cox
d67e7e3acf runtime: add lr, ctxt, ret to Gobuf
Add gostartcall and gostartcallfn.
The old gogocall = gostartcall + gogo.
The old gogocallfn = gostartcallfn + gogo.

R=dvyukov, minux.ma
CC=golang-dev
https://golang.org/cl/10036044
2013-06-12 15:22:26 -04:00
Russ Cox
e58f798c0c runtime: adjust traceback / garbage collector boundary
The garbage collection routine addframeroots is duplicating
logic in the traceback routine that calls it, sometimes correctly,
sometimes incorrectly, sometimes incompletely.
Pass necessary information to addframeroots instead of
deriving it anew.

Should make addframeroots significantly more robust.
It's certainly smaller.

Also try to standardize on uintptr for saved pc, sp values.

Will make CL 10036044 trivial.

R=golang-dev, dave, dvyukov
CC=golang-dev
https://golang.org/cl/10169045
2013-06-12 08:49:38 -04:00
Dmitriy Vyukov
99922aba8b runtime: use persistentalloc instead of SysAlloc in GC
Especially important for Windows because it reserves VM
only in multiple of 64k.

R=golang-dev, alex.brainman
CC=golang-dev
https://golang.org/cl/10082048
2013-06-10 09:16:06 +04:00
Dmitriy Vyukov
5d637b83a9 runtime: speedup malloc stats collection
Count only number of frees, everything else is derivable
and does not need to be counted on every malloc.
benchmark                    old ns/op    new ns/op    delta
BenchmarkMalloc8                    68           66   -3.07%
BenchmarkMalloc16                   75           70   -6.48%
BenchmarkMallocTypeInfo8           102           97   -4.80%
BenchmarkMallocTypeInfo16          108          105   -2.78%

R=golang-dev, dave, rsc
CC=golang-dev
https://golang.org/cl/9776043
2013-06-06 14:56:50 +04:00
Keith Randall
71f061043d runtime/gc: Run garbage collector on g0 stack
instead of regular g stack. We do this so that the g stack
we're currently running on is no longer changing.  Cuts
the root set down a bit (g0 stacks are not scanned, and
we don't need to scan gc's internal state).  Also an
enabler for copyable stacks.

R=golang-dev, cshapiro, khr, 0xe2.0x9a.0x9b, dvyukov, rsc, iant
CC=golang-dev
https://golang.org/cl/9754044
2013-05-31 20:43:33 -07:00
Keith Randall
d6f89d735e runtime: set MSpan.limit properly for large spans.
Then use the limit to make sure MHeap_LookupMaybe & inlined
copies don't return a span if the pointer is beyond the limit.
Use this fact to optimize all call sites.

R=golang-dev, dvyukov
CC=golang-dev
https://golang.org/cl/9869045
2013-05-30 21:32:20 -07:00
Dmitriy Vyukov
e17281b397 runtime: rename mheap.maps to mheap.spans
as was dicussed in cl/9791044

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/9853046
2013-05-30 17:09:58 +04:00
Carl Shapiro
4e0a51c210 cmd/5l, cmd/6l, cmd/8l, cmd/gc, runtime: generate and use bitmaps of argument pointer locations
With this change the compiler emits a bitmap for each function
covering its stack frame arguments area.  If an argument word
is known to contain a pointer, a bit is set.  The garbage
collector reads this information when scanning the stack by
frames and uses it to ignores locations known to not contain a
pointer.

R=golang-dev, bradfitz, daniel.morsing, dvyukov, khr, khr, iant, cshapiro
CC=golang-dev
https://golang.org/cl/9223046
2013-05-28 17:59:10 -07:00
Dmitriy Vyukov
8bbb08533d runtime: make mheap statically allocated again
This depends on: 9791044: runtime: allocate page table lazily
Once page table is moved out of heap, the heap becomes small.
This removes unnecessary dereferences during heap access.
No logical changes.

R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/9802043
2013-05-28 22:14:47 +04:00
Dmitriy Vyukov
671814b904 runtime: allocate page table lazily
This removes the 256MB memory allocation at startup,
which conflicts with ulimit.
Also will allow to eliminate an unnecessary memory dereference in GC,
because the page table is usually mapped at known address.
Update #5049.
Update #5236.

R=golang-dev, khr, r, khr, rsc
CC=golang-dev
https://golang.org/cl/9791044
2013-05-28 22:04:34 +04:00
Dmitriy Vyukov
2f5825d427 runtime: fix heap corruption during GC
The 'n' variable is used during rescan initiation in GC_END case,
but it's overwritten with chan capacity in GC_CHAN case.
As the result rescan is done with the wrong object size.
Fixes #5554.

R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/9831043
2013-05-28 19:17:47 +04:00
Dmitriy Vyukov
72c4ee1a9d runtime: properly synchronize GC and finalizer goroutine
This is needed for preemptive scheduler, because the goroutine
can be preempted at surprising points.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/9376043
2013-05-22 23:04:46 +04:00
Dmitriy Vyukov
c075d82cca runtime: fix and speedup malloc stats
Currently per-sizeclass stats are lost for destroyed MCache's. This patch fixes this.
Also, only update mstats.heap_alloc on heap operations, because that's the only
stat that needs to be promptly updated. Everything else needs to be up-to-date only in ReadMemStats().

R=golang-dev, remyoudompheng, dave, iant
CC=golang-dev
https://golang.org/cl/9207047
2013-05-22 22:22:57 +04:00
Carl Shapiro
50ba6e13b4 runtime: fix scanning of not started goroutines
The stack scanner for not started goroutines ignored the arguments
area when its size was unknown.  With this change, the distance
between the stack pointer and the stack base will be used instead.

Fixes #5486

R=golang-dev, bradfitz, iant, dvyukov
CC=golang-dev
https://golang.org/cl/9440043
2013-05-16 10:42:39 -07:00
Dmitriy Vyukov
c6293d2106 runtime: fix GC scanning of slices
If a slice points to an array embedded in a struct,
the whole struct can be incorrectly scanned as the slice buffer.
Fixes #5443.

R=cshapiro, iant, r, cshapiro, minux.ma
CC=bradfitz, gobot, golang-dev
https://golang.org/cl/9372044
2013-05-15 23:50:32 +04:00
Carl Shapiro
0e6007e4f9 runtime: enable stack scanning by frames
Update #5134

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/9406046
2013-05-14 16:38:12 -07:00
Jan Ziak
db1c218d4f undo CL 8954044 / ad3c2ffb16d7
It works on i386, but fails on amd64 and arm.

««« original CL description
runtime: prevent the GC from seeing the content of a frame in runfinq()

Fixes #5348.

R=golang-dev, dvyukov
CC=golang-dev
https://golang.org/cl/8954044
»»»

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/8695051
2013-04-25 18:12:09 +02:00
Jan Ziak
e9bbe3a8da runtime: prevent the GC from seeing the content of a frame in runfinq()
Fixes #5348.

R=golang-dev, dvyukov
CC=golang-dev
https://golang.org/cl/8954044
2013-04-25 13:39:09 +02:00
Ian Lance Taylor
709e03f43d runtime: add a hook to disable precise GC
This will let us ask people to rebuild the Go system without
precise GC, and then rebuild and retest their program, to see
if precise GC is causing whatever problem they are having.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/8700043
2013-04-12 05:23:38 -07:00
Dmitriy Vyukov
bd1cd1ddea runtime: poor man's heap type info checker
It's not trivial to make a comprehensive check
due to inferior pointers, reflect, gob, etc.
But this is essentially what I've used to debug
the GC issues.
Update #5193.

R=golang-dev, iant, 0xe2.0x9a.0x9b, r
CC=golang-dev
https://golang.org/cl/8455043
2013-04-08 13:36:35 -07:00
Carl Shapiro
c80f5b9bf7 runtime: use a distinct pattern to mark free blocks in need of zeroing
R=golang-dev, dvyukov, khr, cshapiro
CC=golang-dev
https://golang.org/cl/8392043
2013-04-04 14:18:52 -07:00
Carl Shapiro
c676b8b27b cmd/ld, runtime: restrict stack root scan to locals and arguments
Updates #5134

R=golang-dev, bradfitz, cshapiro, daniel.morsing, ality, iant
CC=golang-dev
https://golang.org/cl/8022044
2013-03-28 14:36:23 -07:00
Jan Ziak
ec8caf696a runtime: mark strings without going through an intermediate buffer
R=rsc
CC=golang-dev
https://golang.org/cl/7949043
2013-03-21 19:00:02 +01:00
Dmitriy Vyukov
d4c80d19a8 runtime: faster parallel GC
Use per-thread work buffers instead of global mutex-protected pool. This eliminates contention from parallel scan phase.

benchmark                             old ns/op    new ns/op    delta
garbage.BenchmarkTree2-8               97100768     71417553  -26.45%
garbage.BenchmarkTree2LastPause-8     970931485    714103692  -26.45%
garbage.BenchmarkTree2Pause-8         469127802    345029253  -26.45%
garbage.BenchmarkParser-8            2880950854   2715456901   -5.74%
garbage.BenchmarkParserLastPause-8    137047399    103336476  -24.60%
garbage.BenchmarkParserPause-8         80686028     58922680  -26.97%

R=golang-dev, 0xe2.0x9a.0x9b, dave, adg, rsc, iant
CC=golang-dev
https://golang.org/cl/7816044
2013-03-21 12:48:02 +04:00
Jan Ziak
54dffda2b6 runtime: prevent garbage collection during hashmap insertion
Inserting a key-value pair into a hashmap storing keys or values
indirectly can cause the garbage collector to find the hashmap in
an inconsistent	state.

Fixes #5074.

R=golang-dev, minux.ma, rsc
CC=golang-dev
https://golang.org/cl/7913043
2013-03-19 22:17:39 +01:00
Jan Ziak
ab08eac5b7 runtime: optimize calls to addroot()
R=golang-dev, rsc
CC=dvyukov, golang-dev
https://golang.org/cl/7879043
2013-03-19 19:57:15 +01:00
Jan Ziak
6e69df6102 cmd/gc: support channel types in the garbage collector
R=golang-dev, dvyukov, rsc
CC=golang-dev
https://golang.org/cl/7473044
2013-03-19 19:51:03 +01:00
Jan Ziak
92c153d5f4 runtime: scan the type of an interface value
R=golang-dev, rsc, bradfitz
CC=golang-dev
https://golang.org/cl/7744047
2013-03-15 16:07:52 -04:00
Jan Ziak
ee3c88482b runtime: remove struct BitTarget
R=golang-dev
CC=dvyukov, golang-dev, rsc
https://golang.org/cl/7845043
2013-03-15 12:37:40 -04:00
Jan Ziak
2924638d26 runtime: replace lock() with casp() in the GC
Note: BitTarget will be removed by a forthcoming changeset.

R=golang-dev, dvyukov
CC=golang-dev, rsc
https://golang.org/cl/7837044
2013-03-15 09:02:36 +01:00
Dmitriy Vyukov
433824d808 runtime: fix misaligned 64-bit atomic
Fixes #4869.
Fixes #5007.
Update #5005.

R=golang-dev, 0xe2.0x9a.0x9b, bradfitz
CC=golang-dev
https://golang.org/cl/7534044
2013-03-10 20:46:11 +04:00
Russ Cox
e0deb2ef7f undo CL 7301062 / 9742f722b558
broke arm garbage collector

traceback_arm fails with a missing pc. It needs CL 7494043.
But that only makes the build break later, this time with
"invalid freelist". Roll back until it can be fixed correctly.

««« original CL description
runtime: restrict stack root scan to locals and arguments

R=rsc
CC=golang-dev
https://golang.org/cl/7301062
»»»

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/7493044
2013-03-05 15:36:40 -05:00
Carl Shapiro
db018bf77c runtime: restrict stack root scan to locals and arguments
R=rsc
CC=golang-dev
https://golang.org/cl/7301062
2013-03-04 19:48:50 -08:00
Jan Ziak
cea46387b9 runtime: add garbage collector statistics
If the constant CollectStats is non-zero and GOGCTRACE=1
the garbage collector will print basic statistics about executed
GC instructions.

R=golang-dev, dvyukov
CC=golang-dev, rsc
https://golang.org/cl/7413049
2013-03-04 16:54:37 +01:00
Dmitriy Vyukov
779c45a507 runtime: improved scheduler
Distribute runnable queues, memory cache
and cache of dead G's per processor.
Faster non-blocking syscall enter/exit.
More conservative worker thread blocking/unblocking.

R=dave, bradfitz, remyoudompheng, rsc
CC=golang-dev
https://golang.org/cl/7314062
2013-03-01 13:49:16 +02:00
Jan Ziak
94955f9b40 runtime: check the value returned by runtime·SysAlloc
R=golang-dev, rsc
CC=golang-dev, minux.ma
https://golang.org/cl/7424047
2013-03-01 00:21:08 -05:00
Jan Ziak
01ab9a012a runtime: improve precision of GC_REGION
R=rsc
CC=golang-dev
https://golang.org/cl/7383054
2013-02-27 08:28:53 -08:00
Russ Cox
56a06db360 cmd/ld: change GC_CALL to 32-bit relative address
The current code uses 64-bit pc-relative on 64-bit systems,
but in ELF linkers there is no such thing, so we cannot
express this in a .o file. Change to 32-bit.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/7383055
2013-02-26 19:42:56 -08:00
Jan Ziak
a656f82071 runtime: precise garbage collection of channels
This changeset adds a mostly-precise garbage collection of channels.
The garbage collection support code in the linker isn't recognizing
channel types yet.

Fixes issue http://stackoverflow.com/questions/14712586/memory-consumption-skyrocket

R=dvyukov, rsc, bradfitz
CC=dave, golang-dev, minux.ma, remyoudompheng
https://golang.org/cl/7307086
2013-02-25 15:58:23 -05:00
Russ Cox
1903ad7189 cmd/gc, reflect, runtime: switch to indirect func value representation
Step 1 of http://golang.org/s/go11func.

R=golang-dev, r, daniel.morsing, remyoudompheng
CC=golang-dev
https://golang.org/cl/7393045
2013-02-21 17:01:13 -05:00
Russ Cox
8a6ff3ab34 runtime: allocate heap metadata at run time
Before, the mheap structure was in the bss,
but it's quite large (today, 256 MB, much of
which is never actually paged in), and it makes
Go binaries run afoul of exec-time bss size
limits on some BSD systems.

Fixes #4447.

R=golang-dev, dave, minux.ma, remyoudompheng, iant
CC=golang-dev
https://golang.org/cl/7307122
2013-02-15 14:27:03 -05:00
Russ Cox
e5dae3baaa runtime: tweak addfinroots to preserve original pointer
Use local variable so that stack trace will show value of v.

Fixes #4790.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/7300106
2013-02-13 22:31:45 -05:00
Jan Ziak
1e01fba2fc runtime: precise garbage collection of hashmaps
R=golang-dev, rsc
CC=dave, dvyukov, golang-dev, minux.ma, remyoudompheng
https://golang.org/cl/7252047
2013-02-08 16:00:33 -05:00
Russ Cox
472354f81e runtime/debug: add controls for garbage collector
Fixes #4090.

R=golang-dev, iant, bradfitz, dsymonds
CC=golang-dev
https://golang.org/cl/7229070
2013-02-04 00:00:55 -05:00
Elias Naur
44cf814d50 runtime, cmd/ld: make code more position-independent
Change the stack unwinding code to compensate for the dynamic
relocation of symbols.
Change the gc instruction GC_CALL to use a relative offset instead of
an absolute address.

R=golang-dev
CC=golang-dev
https://golang.org/cl/7248048
2013-02-01 11:24:49 -08:00
Sébastien Paolacci
09ea3b518e runtime: earlier detection of unused spans.
Mark candidate spans one GC pass earlier.

Move scavenger's code out from mgc0 and constrain it into mheap (where it belongs).

R=rsc, dvyukov, minux.ma
CC=golang-dev
https://golang.org/cl/7002049
2013-01-28 12:53:35 -05:00
Dmitriy Vyukov
fd32ac4bae runtime: account stop-the-world time in the "other" GOGCTRACE section
Currently it's summed to mark phase.
The change makes it easier to diagnose long stop-the-world phases.

R=golang-dev, bradfitz, dave
CC=golang-dev
https://golang.org/cl/7182043
2013-01-22 13:44:49 +04:00
Jan Ziak
059fed3dfb runtime: try to determine the actual type during garbage collection
If the scanned block has no typeinfo the garbage collector will attempt
to get the actual type of the block.

R=golang-dev, bradfitz, rsc
CC=golang-dev
https://golang.org/cl/7093045
2013-01-18 16:56:17 -05:00
Rémy Oudompheng
c13866db7f cmd/5c: fix handling of side effects when assigning a struct literal.
Also undo revision a5b96b602690 used to workaround the bug.

Fixes #4643.

R=rsc, golang-dev, dave, minux.ma, lucio.dere, bradfitz
CC=golang-dev
https://golang.org/cl/7090043
2013-01-12 09:16:50 +01:00
Rémy Oudompheng
6e981c181c runtime: work around 5c bug in GC code.
5c miscompiles *p++ = struct_literal.

R=dave, golang-dev
CC=golang-dev
https://golang.org/cl/7065069
2013-01-11 00:59:44 +01:00
Jan Ziak
9204eb4d3c runtime: interpret type information during garbage collection
R=rsc, dvyukov, remyoudompheng, dave, minux.ma, bradfitz
CC=golang-dev
https://golang.org/cl/6945069
2013-01-10 15:45:46 -05:00
Dmitriy Vyukov
f82db7d9e4 runtime: less aggressive per-thread stack segment caching
Introduce global stack segment cache and limit per-thread cache size.
This greatly reduces StackSys memory on workloads that create lots of threads.

benchmark                      old ns/op    new ns/op    delta
BenchmarkStackGrowth                 665          656   -1.35%
BenchmarkStackGrowth-2               333          328   -1.50%
BenchmarkStackGrowth-4               224          172  -23.21%
BenchmarkStackGrowth-8               124           91  -26.13%
BenchmarkStackGrowth-16               82           47  -41.94%
BenchmarkStackGrowth-32               73           40  -44.79%

BenchmarkStackGrowthDeep           97231        94391   -2.92%
BenchmarkStackGrowthDeep-2         47230        58562  +23.99%
BenchmarkStackGrowthDeep-4         24993        49356  +97.48%
BenchmarkStackGrowthDeep-8         15105        30072  +99.09%
BenchmarkStackGrowthDeep-16        10005        15623  +56.15%
BenchmarkStackGrowthDeep-32        12517        13069   +4.41%

TestStackMem#1,MB                  310          12       -96.13%
TestStackMem#2,MB                  296          14       -95.27%
TestStackMem#3,MB                  479          14       -97.08%

TestStackMem#1,sec                 3.22         2.26     -29.81%
TestStackMem#2,sec                 2.43         2.15     -11.52%
TestStackMem#3,sec                 2.50         2.38      -4.80%

R=sougou, no.smile.face, rsc
CC=golang-dev, msolomon
https://golang.org/cl/7029044
2013-01-10 09:57:06 +04:00
Jan Ziak
89ec208ee8 runtime: introduce typedefs and delete struct keywords in mgc0.c
R=rsc
CC=golang-dev
https://golang.org/cl/7029055
2013-01-04 10:20:50 -05:00
Jingcheng Zhang
70e967b7bc runtime: use "mp" and "gp" instead of "m" and "g" for local variable name to avoid confusion with the global "m" and "g".
R=golang-dev, minux.ma, rsc
CC=bradfitz, golang-dev
https://golang.org/cl/6939064
2012-12-19 00:30:29 +08:00
Jan Ziak
013fa63c90 runtime: struct Obj in mgc0.c and buffers in scanblock()
Details:

- This CL is the conceptual skeleton of code found in CL 6114046

- The garbage collector uses struct Obj to specify memory blocks

- scanblock() is putting found memory blocks into an intermediate buffer
  (xbuf) before adding/flushing them to the main work buffer (wbuf)

- The main loop in scanblock() is replaced with a skeleton code that
  in the future will be able to recognize the type of objects and
  thus will improve the garbage collector's precision.
  For now, all objects are simply sequences of pointers so
  the precision of the garbage collector remains unchanged.

- The code plugs .gcdata and .gcbss sections into the garbage collector.
  scanblock() in this CL is unable to make any use of this.

R=rsc, dvyukov, remyoudompheng
CC=dave, golang-dev, minux.ma
https://golang.org/cl/6856121
2012-12-16 19:32:12 -05:00
Jan Ziak
51b8edcb37 runtime: use reflect·call() to enter the function gc()
Garbage collection code (to be merged later) is calling functions
which have many local variables. This increases the probability that
the stack capacity won't be big enough to hold the local variables.
So, start gc() on a bigger stack to eliminate a potentially large number
of calls to runtime·morestack().

R=rsc, remyoudompheng, dsymonds, minux.ma, iant, iant
CC=golang-dev
https://golang.org/cl/6846044
2012-11-27 13:04:59 -05:00
Dmitriy Vyukov
063c13a34c runtime/race: more precise handling of finalizers
Currently race detector runtime just disables race detection in the finalizer goroutine.
It has false positives when a finalizer writes to shared memory -- the race with finalizer is reported in a normal goroutine that accesses the same memory.
After this change I am going to synchronize the finalizer goroutine with the rest of the world in racefingo(). This is closer to what happens in reality and so
does not have false positives.
And also add README file with instructions how to build the runtime.

R=golang-dev, minux.ma, rsc
CC=golang-dev
https://golang.org/cl/6810095
2012-11-14 16:58:10 +04:00
Jan Ziak
e0c9d04aec runtime: add memorydump() debugging function
R=golang-dev
CC=golang-dev, remyoudompheng, rsc
https://golang.org/cl/6780059
2012-11-01 12:56:25 -04:00
Dmitriy Vyukov
320df44f04 runtime: switch to 64-bit goroutine ids
Fixes #4275.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6759053
2012-10-26 10:13:06 +04:00
Russ Cox
f76f120324 cmd/ld: use 64-bit alignment for large data and bss objects
Check for specific, important misalignment in garbage collector.
Not a complete fix for issue 599 but an important workaround.

Update #599.

R=golang-dev, iant, dvyukov
CC=golang-dev
https://golang.org/cl/6641049
2012-10-09 12:50:06 -04:00
Dmitriy Vyukov
2f6cbc74f1 race: runtime changes
This is a part of a bigger change that adds data race detection feature:
https://golang.org/cl/6456044

R=rsc
CC=gobot, golang-dev
https://golang.org/cl/6535050
2012-10-07 22:05:32 +04:00
Jan Ziak
f8c58373e5 runtime: add types to MSpan
R=rsc
CC=golang-dev
https://golang.org/cl/6554060
2012-09-24 20:08:05 -04:00
Russ Cox
0b08c9483f runtime: prepare for 64-bit ints
This CL makes the runtime understand that the type of
the len or cap of a map, slice, or string is 'int', not 'int32',
and it is also careful to distinguish between function arguments
and results of type 'int' vs type 'int32'.

In the runtime, the new typedefs 'intgo' and 'uintgo' refer
to Go int and uint. The C types int and uint continue to be
unavailable (cause intentional compile errors).

This CL does not change the meaning of int, but it should make
the eventual change of the meaning of int on amd64 a bit
smoother.

Update #2188.

R=iant, r, dave, remyoudompheng
CC=golang-dev
https://golang.org/cl/6551067
2012-09-24 14:58:34 -04:00
Dmitriy Vyukov
f20fd87384 runtime: refactor goroutine blocking
The change is a preparation for the new scheduler.
It introduces runtime.park() function,
that will atomically unlock the mutex and park the goroutine.
It will allow to remove the racy readyonstop flag
that is difficult to implement w/o the global scheduler mutex.

R=rsc, remyoudompheng, dave
CC=golang-dev
https://golang.org/cl/6501077
2012-09-18 21:15:46 +04:00
Jan Ziak
54193689cc cmd/ld: fix compilation when GOARCH != GOHOSTARCH
R=rsc, dave, minux.ma
CC=golang-dev
https://golang.org/cl/6493123
2012-09-17 17:18:21 -04:00
Dmitriy Vyukov
ed516df4e4 runtime: add freemcache() function
It will be required for scheduler that maintains
GOMAXPROCS MCache's.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6350062
2012-07-01 13:10:01 +04:00
Dmitriy Vyukov
902911bcff runtime: fix potential GC deadlock
The issue seems to not be triggered right now,
but I've seen the deadlock after some other legal
modifications to runtime.
So I think we are safer this way.

R=rsc, r
CC=golang-dev
https://golang.org/cl/6339051
2012-06-25 11:08:09 +04:00
Dave Cheney
09f48db3e1 runtime: use uintptr for block length in scanblock
Using an int64 for a block size doesn't make
sense on 32bit platforms but extracts a performance
penalty dealing with double word quantities on Arm.

linux/arm

benchmark                 old ns/op    new ns/op    delta
BenchmarkGobDecode        155401600    144589300   -6.96%
BenchmarkGobEncode         72772220     62460940  -14.17%
BenchmarkGzip               5822632      2604797  -55.26%
BenchmarkGunzip              326321       151721  -53.51%

benchmark                  old MB/s     new MB/s  speedup
BenchmarkGobDecode             4.94         5.31    1.07x
BenchmarkGobEncode            10.55        12.29    1.16x

R=golang-dev, rsc, bradfitz
CC=golang-dev
https://golang.org/cl/6272047
2012-06-05 18:55:14 +10:00
Jan Ziak
334bf95f9e runtime: update field types in preparation for GC changes
R=rsc, remyoudompheng, minux.ma, ality
CC=golang-dev
https://golang.org/cl/6242061
2012-05-30 13:07:52 -04:00
Rémy Oudompheng
348087877c runtime: do not unset the special bit after finalization.
A block with finalizer might also be profiled. The special bit
is needed to unregister the block from the profile. It will be
unset only when the block is freed.

Fixes #3668.

R=golang-dev, rsc
CC=golang-dev, remy
https://golang.org/cl/6249066
2012-05-30 08:04:11 +02:00
Dmitriy Vyukov
b0702bd0db runtime: faster GC mark phase
Also bump MaxGcproc to 8.

benchmark             old ns/op    new ns/op    delta
Parser               3796323000   3763880000   -0.85%
Parser-2             3591752500   3518560250   -2.04%
Parser-4             3423825250   3334955250   -2.60%
Parser-8             3304585500   3267014750   -1.14%
Parser-16            3313615750   3286160500   -0.83%

Tree                  984128500    942501166   -4.23%
Tree-2                932564444    883266222   -5.29%
Tree-4                835831000    799912777   -4.30%
Tree-8                819238500    789717333   -3.73%
Tree-16               880837833    837840055   -5.13%

Tree2                 604698100    579716900   -4.13%
Tree2-2               372414500    356765200   -4.20%
Tree2-4               187488100    177455900   -5.56%
Tree2-8               136315300    102086700  -25.11%
Tree2-16               93725900     76705800  -22.18%

ParserPause           157441210    166202783   +5.56%
ParserPause-2          93842650     85199900   -9.21%
ParserPause-4          56844404     53535684   -5.82%
ParserPause-8          35739446     30767613  -16.15%
ParserPause-16         32718255     27212441  -16.83%

TreePause              29610557     29787725   +0.60%
TreePause-2            24001659     20674421  -13.86%
TreePause-4            15114887     12842781  -15.03%
TreePause-8            13128725     10741747  -22.22%
TreePause-16           16131360     12506901  -22.47%

Tree2Pause           2673350920   2651045280   -0.83%
Tree2Pause-2         1796999200   1709350040   -4.88%
Tree2Pause-4         1163553320   1090706480   -6.67%
Tree2Pause-8          987032520    858916360  -25.11%
Tree2Pause-16         864758560    809567480   -6.81%

ParserLastPause       280537000    289047000   +3.03%
ParserLastPause-2     183030000    166748000   -8.90%
ParserLastPause-4     105817000     91552000  -13.48%
ParserLastPause-8      65127000     53288000  -18.18%
ParserLastPause-16     45258000     38334000  -15.30%

TreeLastPause          45072000     51449000  +12.39%
TreeLastPause-2        39269000     37866000   -3.57%
TreeLastPause-4        23564000     20649000  -12.37%
TreeLastPause-8        20881000     15807000  -24.30%
TreeLastPause-16       23297000     17309000  -25.70%

Tree2LastPause       6046912000   5797120000   -4.13%
Tree2LastPause-2     3724034000   3567592000   -4.20%
Tree2LastPause-4     1874831000   1774524000   -5.65%
Tree2LastPause-8     1363108000   1020809000  -12.79%
Tree2LastPause-16     937208000    767019000  -22.18%

R=rsc, 0xe2.0x9a.0x9b
CC=golang-dev
https://golang.org/cl/6223050
2012-05-24 10:55:50 +04:00
Dmitriy Vyukov
845aa1fc2c runtime: faster GC sweep phase
benchmark                              old ns/op    new ns/op    delta

garbage.BenchmarkParser               3731065750   3715543750   -0.41%
garbage.BenchmarkParser-2             3631299750   3495248500   -3.75%
garbage.BenchmarkParser-4             3386486000   3339353000   -1.39%
garbage.BenchmarkParser-8             3267632000   3286422500   +0.58%
garbage.BenchmarkParser-16            3299203000   3316081750   +0.51%

garbage.BenchmarkTree                  977532888    919453833   -5.94%
garbage.BenchmarkTree-2                919948555    853478000   -7.23%
garbage.BenchmarkTree-4                841329000    790207000   -6.08%
garbage.BenchmarkTree-8                787792777    740380666   -6.01%
garbage.BenchmarkTree-16               899257166    846594555   -5.86%

garbage.BenchmarkTree2                 574876300    571885800   -0.52%
garbage.BenchmarkTree2-2               348162700    345888900   -0.65%
garbage.BenchmarkTree2-4               184912500    179137000   -3.22%
garbage.BenchmarkTree2-8               104243900    103485600   -0.73%
garbage.BenchmarkTree2-16               97269500     85137100  -14.25%

garbage.BenchmarkParserPause           141101976    157746974  +11.80%
garbage.BenchmarkParserPause-2         103096051     83043048  -19.45%
garbage.BenchmarkParserPause-4          52153133     45951111  -11.89%
garbage.BenchmarkParserPause-8          36730190     38901024   +5.91%
garbage.BenchmarkParserPause-16         32678875     29578585   -9.49%

garbage.BenchmarkTreePause              29487065     29648439   +0.55%
garbage.BenchmarkTreePause-2            22443494     21306159   -5.07%
garbage.BenchmarkTreePause-4            15799691     14985647   -5.15%
garbage.BenchmarkTreePause-8            10768112     9531420   -12.97%
garbage.BenchmarkTreePause-16           16329891     15205158   -6.89%

garbage.BenchmarkTree2Pause           2586957240   2577533200   -0.36%
garbage.BenchmarkTree2Pause-2         1683383760   1673923800   -0.56%
garbage.BenchmarkTree2Pause-4         1102860320   1074040280   -2.68%
garbage.BenchmarkTree2Pause-8          902627920    886122400   -1.86%
garbage.BenchmarkTree2Pause-16         856470920    804152320   -6.50%

garbage.BenchmarkParserLastPause       277316000    280839000   +1.25%
garbage.BenchmarkParserLastPause-2     179446000    163687000   -8.78%
garbage.BenchmarkParserLastPause-4     106752000     94144000  -11.81%
garbage.BenchmarkParserLastPause-8      57758000     61640000   +6.72%
garbage.BenchmarkParserLastPause-16     51235000     42552000  -16.95%

garbage.BenchmarkTreeLastPause          45244000     50786000  +12.25%
garbage.BenchmarkTreeLastPause-2        37163000     34654000   -6.75%
garbage.BenchmarkTreeLastPause-4        24178000     21967000   -9.14%
garbage.BenchmarkTreeLastPause-8        20390000     15648000  -30.30%
garbage.BenchmarkTreeLastPause-16       22398000     20180000   -9.90%

garbage.BenchmarkTree2LastPause       5748706000   5718809000   -0.52%
garbage.BenchmarkTree2LastPause-2     3481570000   3458844000   -0.65%
garbage.BenchmarkTree2LastPause-4     1849073000   1791330000   -3.22%
garbage.BenchmarkTree2LastPause-8     1042375000   1034811000   -0.73%
garbage.BenchmarkTree2LastPause-16     972637000    851323000  -14.25%

There is also visible improvement in consumed CPU time:
tree2 -heapsize=8000000000 -cpus=12
before: 248.74user 6.36system 0:52.74elapsed 483%CPU
after:  229.86user 6.33system 0:51.08elapsed 462%CPU
-1.66s of real time, but -18.91s of consumed CPU time

R=golang-dev
CC=golang-dev
https://golang.org/cl/6215065
2012-05-22 13:35:52 -04:00
Dmitriy Vyukov
01826280eb runtime: refactor helpgc functionality in preparation for parallel GC
Parallel GC needs to know in advance how many helper threads will be there.
Hopefully it's the last patch before I can tackle parallel sweep phase.
The benchmarks are unaffected.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6200064
2012-05-15 19:10:16 +04:00
Dmitriy Vyukov
4945fc8e40 runtime: speedup GC sweep phase (batch free)
benchmark                             old ns/op    new ns/op    delta
garbage.BenchmarkParser              4370050250   3779668750  -13.51%
garbage.BenchmarkParser-2            3713087000   3628771500   -2.27%
garbage.BenchmarkParser-4            3519755250   3406349750   -3.22%
garbage.BenchmarkParser-8            3386627750   3319144000   -1.99%

garbage.BenchmarkTree                 493585529    408102411  -17.32%
garbage.BenchmarkTree-2               500487176    402285176  -19.62%
garbage.BenchmarkTree-4               473238882    361484058  -23.61%
garbage.BenchmarkTree-8               486977823    368334823  -24.36%

garbage.BenchmarkTree2                 31446600     31203200   -0.77%
garbage.BenchmarkTree2-2               21469000     21077900   -1.82%
garbage.BenchmarkTree2-4               11007600     10899100   -0.99%
garbage.BenchmarkTree2-8                7692400      7032600   -8.58%

garbage.BenchmarkParserPause          241863263    163249450  -32.50%
garbage.BenchmarkParserPause-2        120135418    112981575   -5.95%
garbage.BenchmarkParserPause-4         83411552     64580700  -22.58%
garbage.BenchmarkParserPause-8         51870697     42207244  -18.63%

garbage.BenchmarkTreePause             20940474     13147011  -37.22%
garbage.BenchmarkTreePause-2           20115124     11146715  -44.59%
garbage.BenchmarkTreePause-4           17217584      7486327  -56.52%
garbage.BenchmarkTreePause-8           18258845      7400871  -59.47%

garbage.BenchmarkTree2Pause           174067190    172674190   -0.80%
garbage.BenchmarkTree2Pause-2         131175809    130615761   -0.43%
garbage.BenchmarkTree2Pause-4          95406666     93972047   -1.50%
garbage.BenchmarkTree2Pause-8          86056095     85334952   -0.84%

garbage.BenchmarkParserLastPause      329932000    324790000   -1.56%
garbage.BenchmarkParserLastPause-2    209383000    210456000   +0.51%
garbage.BenchmarkParserLastPause-4    113981000    112921000   -0.93%
garbage.BenchmarkParserLastPause-8     77967000     76625000   -1.72%

garbage.BenchmarkTreeLastPause         29752000     18444000  -38.01%
garbage.BenchmarkTreeLastPause-2       24274000     14766000  -39.17%
garbage.BenchmarkTreeLastPause-4       19565000      8726000  -55.40%
garbage.BenchmarkTreeLastPause-8       21956000     10530000  -52.04%

garbage.BenchmarkTree2LastPause       314411000    311945000   -0.78%
garbage.BenchmarkTree2LastPause-2     214641000    210836000   -1.77%
garbage.BenchmarkTree2LastPause-4     110024000    108943000   -0.98%
garbage.BenchmarkTree2LastPause-8      76873000     70263000   -8.60%

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5991049
2012-04-12 12:01:24 +04:00
Dmitriy Vyukov
342658bbb6 runtime: preparation for parallel GC
make MHeap.allspans an array instead on a linked-list,
it's required for parallel for

benchmark                              old ns/op    new ns/op    delta

garbage.BenchmarkTree                  494435529    487962705   -1.31%
garbage.BenchmarkTree-2                499652705    485358000   -2.86%
garbage.BenchmarkTree-4                468482117    454093117   -3.07%
garbage.BenchmarkTree-8                488533235    471872470   -3.41%
garbage.BenchmarkTree-16               507835176    492558470   -3.01%

garbage.BenchmarkTree2                  31453900     31404300   -0.16%
garbage.BenchmarkTree2-2                21440600     21477000   +0.17%
garbage.BenchmarkTree2-4                10982000     11117400   +1.23%
garbage.BenchmarkTree2-8                 7544700      7456700   -1.17%
garbage.BenchmarkTree2-16                7049500      6805700   -3.46%

garbage.BenchmarkParser               4448988000   4453264000   +0.10%
garbage.BenchmarkParser-2             4086045000   4057948000   -0.69%
garbage.BenchmarkParser-4             3677365000   3661246000   -0.44%
garbage.BenchmarkParser-8             3517253000   3540190000   +0.65%
garbage.BenchmarkParser-16            3506562000   3463478000   -1.23%

garbage.BenchmarkTreePause              20969784     21100238   +0.62%
garbage.BenchmarkTreePause-2            20215875     20139572   -0.38%
garbage.BenchmarkTreePause-4            17240709     16683624   -3.23%
garbage.BenchmarkTreePause-8            18196386     17639306   -3.06%
garbage.BenchmarkTreePause-16           20621158     20215056   -1.97%

garbage.BenchmarkTree2Pause            173992142    173872380   -0.07%
garbage.BenchmarkTree2Pause-2          131281904    131366666   +0.06%
garbage.BenchmarkTree2Pause-4           93484952     95109619   +1.74%
garbage.BenchmarkTree2Pause-8           88950523     86533333   -2.72%
garbage.BenchmarkTree2Pause-16          86071238     84089190   -2.30%

garbage.BenchmarkParserPause           135815000    135255952   -0.41%
garbage.BenchmarkParserPause-2          92691523     91451428   -1.34%
garbage.BenchmarkParserPause-4          53392190     51611904   -3.33%
garbage.BenchmarkParserPause-8          36059523     35116666   -2.61%
garbage.BenchmarkParserPause-16         30174300     27340600   -9.39%

garbage.BenchmarkTreeLastPause          28420000     29142000   +2.54%
garbage.BenchmarkTreeLastPause-2        23514000     26779000  +13.89%
garbage.BenchmarkTreeLastPause-4        21773000     18660000  -14.30%
garbage.BenchmarkTreeLastPause-8        24072000     21276000  -11.62%
garbage.BenchmarkTreeLastPause-16       25149000     28541000  +13.49%

garbage.BenchmarkTree2LastPause        314491000    313982000   -0.16%
garbage.BenchmarkTree2LastPause-2      214363000    214715000   +0.16%
garbage.BenchmarkTree2LastPause-4      109778000    111115000   +1.22%
garbage.BenchmarkTree2LastPause-8       75390000     74522000   -1.15%
garbage.BenchmarkTree2LastPause-16      70333000     67880000   -3.49%

garbage.BenchmarkParserLastPause       327247000    326815000   -0.13%
garbage.BenchmarkParserLastPause-2     217039000    212529000   -2.08%
garbage.BenchmarkParserLastPause-4     119722000    111535000   -6.84%
garbage.BenchmarkParserLastPause-8      70806000     69613000   -1.68%
garbage.BenchmarkParserLastPause-16     62813000     48009000  -23.57%

R=rsc, r
CC=golang-dev
https://golang.org/cl/5992055
2012-04-09 13:05:43 +04:00
Dmitriy Vyukov
f09e63a2a0 runtime: add memory prefetching to GC
benchmark                              old ns/op    new ns/op    delta

garbage.BenchmarkParser               4448988000   4370531000   -1.76%
garbage.BenchmarkParser-2             4086045000   4023083000   -1.54%
garbage.BenchmarkParser-4             3677365000   3667020000   -0.28%
garbage.BenchmarkParser-8             3517253000   3543946000   +0.76%
garbage.BenchmarkParser-16            3506562000   3512518000   +0.17%

garbage.BenchmarkTree                  494435529    505784058   +2.30%
garbage.BenchmarkTree-2                499652705    502774823   +0.62%
garbage.BenchmarkTree-4                468482117    465713352   -0.59%
garbage.BenchmarkTree-8                488533235    482287000   -1.28%
garbage.BenchmarkTree-16               507835176    500654882   -1.41%

garbage.BenchmarkTree2                  31453900     28804600   -8.42%
garbage.BenchmarkTree2-2                21440600     19065800  -11.08%
garbage.BenchmarkTree2-4                10982000     10009100   -8.86%
garbage.BenchmarkTree2-8                 7544700      6479800  -14.11%
garbage.BenchmarkTree2-16                7049500      6163200  -12.57%

garbage.BenchmarkParserPause           135815000    125360666   -7.70%
garbage.BenchmarkParserPause-2          92691523     84365476   -8.98%
garbage.BenchmarkParserPause-4          53392190     46995809  -11.98%
garbage.BenchmarkParserPause-8          36059523     30998900  -14.03%
garbage.BenchmarkParserPause-16         30174300     27613350   -8.49%

garbage.BenchmarkTreePause              20969784     22568102   +7.62%
garbage.BenchmarkTreePause-2            20215875     20975130   +3.76%
garbage.BenchmarkTreePause-4            17240709     17180666   -0.35%
garbage.BenchmarkTreePause-8            18196386     18205870   +0.05%
garbage.BenchmarkTreePause-16           20621158     20486867   -0.65%

garbage.BenchmarkTree2Pause            173992142    159995285   -8.04%
garbage.BenchmarkTree2Pause-2          131281904    118013714  -10.11%
garbage.BenchmarkTree2Pause-4           93484952     85092666   -8.98%
garbage.BenchmarkTree2Pause-8           88950523     77340809  -13.05%
garbage.BenchmarkTree2Pause-16          86071238     76557952  -11.05%

garbage.BenchmarkParserLastPause       327247000    288205000  -11.93%
garbage.BenchmarkParserLastPause-2     217039000    187336000  -13.69%
garbage.BenchmarkParserLastPause-4     119722000    105069000  -12.24%
garbage.BenchmarkParserLastPause-8      70806000     64755000   -8.55%
garbage.BenchmarkParserLastPause-16     62813000     53486000  -14.85%

garbage.BenchmarkTreeLastPause          28420000     29735000   +4.63%
garbage.BenchmarkTreeLastPause-2        23514000     25427000   +8.14%
garbage.BenchmarkTreeLastPause-4        21773000     19548000  -10.22%
garbage.BenchmarkTreeLastPause-8        24072000     24046000   -0.11%
garbage.BenchmarkTreeLastPause-16       25149000     25291000   +0.56%

garbage.BenchmarkTree2LastPause        314491000    287988000   -8.43%
garbage.BenchmarkTree2LastPause-2      214363000    190616000  -11.08%
garbage.BenchmarkTree2LastPause-4      109778000    100052000   -8.86%
garbage.BenchmarkTree2LastPause-8       75390000     64753000  -14.11%
garbage.BenchmarkTree2LastPause-16      70333000     61484000  -12.58%

FTR, below are result with the empty prefetch function,
that is, single RET but no real prefetching.
It suggests that inlinable PREFETCH is worth pursuing.

benchmark                              old ns/op    new ns/op    delta

garbage.BenchmarkParser               4448988000   4560488000   +2.51%
garbage.BenchmarkParser-2             4086045000   4129728000   +1.07%
garbage.BenchmarkParser-4             3677365000   3728672000   +1.40%
garbage.BenchmarkParser-8             3517253000   3583968000   +1.90%
garbage.BenchmarkParser-16            3506562000   3591414000   +2.42%

garbage.BenchmarkTree                  494435529    499580882   +1.04%
garbage.BenchmarkTree-4                468482117    467387294   -0.23%
garbage.BenchmarkTree-8                488533235    478311117   -2.09%
garbage.BenchmarkTree-2                499652705    499324235   -0.07%
garbage.BenchmarkTree-16               507835176    502005705   -1.15%

garbage.BenchmarkTree2                  31453900     33296800   +5.86%
garbage.BenchmarkTree2-2                21440600     22466400   +4.78%
garbage.BenchmarkTree2-4                10982000     11402700   +3.83%
garbage.BenchmarkTree2-8                 7544700      7476500   -0.90%
garbage.BenchmarkTree2-16                7049500      7338200   +4.10%

garbage.BenchmarkParserPause           135815000    139529142   +2.73%
garbage.BenchmarkParserPause-2          92691523     95229190   +2.74%
garbage.BenchmarkParserPause-4          53392190     53083476   -0.58%
garbage.BenchmarkParserPause-8          36059523     34594800   -4.06%
garbage.BenchmarkParserPause-16         30174300     30063300   -0.37%

garbage.BenchmarkTreePause              20969784     21866920   +4.28%
garbage.BenchmarkTreePause-2            20215875     20731125   +2.55%
garbage.BenchmarkTreePause-4            17240709     17275837   +0.20%
garbage.BenchmarkTreePause-8            18196386     17898777   -1.64%
garbage.BenchmarkTreePause-16           20621158     20662772   +0.20%

garbage.BenchmarkTree2Pause            173992142    184336857   +5.95%
garbage.BenchmarkTree2Pause-2          131281904    138005714   +5.12%
garbage.BenchmarkTree2Pause-4           93484952     98449238   +5.31%
garbage.BenchmarkTree2Pause-8           88950523     89286095   +0.38%
garbage.BenchmarkTree2Pause-16          86071238     89568666   +4.06%

garbage.BenchmarkParserLastPause       327247000    342189000   +4.57%
garbage.BenchmarkParserLastPause-2     217039000    217224000   +0.09%
garbage.BenchmarkParserLastPause-4     119722000    121327000   +1.34%
garbage.BenchmarkParserLastPause-8      70806000     71941000   +1.60%
garbage.BenchmarkParserLastPause-16     62813000     60166000   -4.21%

garbage.BenchmarkTreeLastPause          28420000     27840000   -2.04%
garbage.BenchmarkTreeLastPause-2        23514000     27390000  +16.48%
garbage.BenchmarkTreeLastPause-4        21773000     21414000   -1.65%
garbage.BenchmarkTreeLastPause-8        24072000     21705000   -9.83%
garbage.BenchmarkTreeLastPause-16       25149000     23932000   -4.84%

garbage.BenchmarkTree2LastPause        314491000    332894000   +5.85%
garbage.BenchmarkTree2LastPause-2      214363000    224611000   +4.78%
garbage.BenchmarkTree2LastPause-4      109778000    113976000   +3.82%
garbage.BenchmarkTree2LastPause-8       75390000     67223000  -10.83%
garbage.BenchmarkTree2LastPause-16      70333000     73216000   +4.10%

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5991057
2012-04-07 17:02:44 +04:00
Dmitriy Vyukov
9903d6870f runtime: minor refactoring in preparation for parallel GC
factor sweepspan() out of sweep(), no logical changes

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5991047
2012-04-05 21:02:20 +04:00
Dmitriy Vyukov
d839a809b2 runtime: make GC stats per-M
This is factored out part of:
https://golang.org/cl/5279048/
(Parallel GC)

benchmark                             old ns/op    new ns/op    delta
garbage.BenchmarkParser              3999106750   3975026500   -0.60%
garbage.BenchmarkParser-2            3720553750   3719196500   -0.04%
garbage.BenchmarkParser-4            3502857000   3474980500   -0.80%
garbage.BenchmarkParser-8            3375448000   3341310500   -1.01%
garbage.BenchmarkParserLastPause      329401000    324097000   -1.61%
garbage.BenchmarkParserLastPause-2    208953000    214222000   +2.52%
garbage.BenchmarkParserLastPause-4    110933000    111656000   +0.65%
garbage.BenchmarkParserLastPause-8     71969000     78230000   +8.70%
garbage.BenchmarkParserPause          230808842    197237400  -14.55%
garbage.BenchmarkParserPause-2        123674365    125197595   +1.23%
garbage.BenchmarkParserPause-4         80518525     85710333   +6.45%
garbage.BenchmarkParserPause-8         58310243     56940512   -2.35%
garbage.BenchmarkTree2                 31471700     31289400   -0.58%
garbage.BenchmarkTree2-2               21536800     21086300   -2.09%
garbage.BenchmarkTree2-4               11074700     10880000   -1.76%
garbage.BenchmarkTree2-8                7568600      7351400   -2.87%
garbage.BenchmarkTree2LastPause       314664000    312840000   -0.58%
garbage.BenchmarkTree2LastPause-2     215319000    210815000   -2.09%
garbage.BenchmarkTree2LastPause-4     110698000    108751000   -1.76%
garbage.BenchmarkTree2LastPause-8      75635000     73463000   -2.87%
garbage.BenchmarkTree2Pause           174280857    173147571   -0.65%
garbage.BenchmarkTree2Pause-2         131332714    129665761   -1.27%
garbage.BenchmarkTree2Pause-4          93803095     93422904   -0.41%
garbage.BenchmarkTree2Pause-8          86242333     85146761   -1.27%

R=rsc
CC=golang-dev
https://golang.org/cl/5987045
2012-04-05 20:48:28 +04:00
Russ Cox
e4b02bfdc0 runtime: goroutine profile, stack dumps
R=golang-dev, r, r
CC=golang-dev
https://golang.org/cl/5687076
2012-02-22 21:45:01 -05:00
Russ Cox
5bcad92f07 ld: add NOPTRBSS for large, pointer-free uninitialized data
cc: add #pragma textflag to set it
runtime: mark mheap to go into noptr-bss.
        remove special case in garbage collector

Remove the ARM from.flag field created by CL 5687044.
The DUPOK flag was already in p->reg, so keep using that.

Otherwise test/nilptr.go creates a very large binary.
Should fix the arm build.
Diagnosed by minux.ma; replacement for CL 5690044.

R=golang-dev, minux.ma, r
CC=golang-dev
https://golang.org/cl/5686060
2012-02-21 22:08:42 -05:00
Sébastien Paolacci
5c598d3c9f runtime: release unused memory to the OS.
Periodically browse MHeap's freelists for long unused spans and release them if any.

Current hardcoded settings:
        - GC is forced if none occured over the last 2 minutes.
        - spans are handed back after 5 minutes of uselessness.

SysUnused (for Unix) is a wrapper on madvise MADV_DONTNEED on Linux and MADV_FREE on BSDs.

R=rsc, dvyukov, remyoudompheng
CC=golang-dev
https://golang.org/cl/5451057
2012-02-16 13:30:04 -05:00
Rémy Oudompheng
842c906e2e runtime: delete UpdateMemStats, replace with ReadMemStats(&stats).
Unexports runtime.MemStats and rename MemStatsType to MemStats.
The new accessor requires passing a pointer to a user-allocated
MemStats structure.

Fixes #2572.

R=bradfitz, rsc, bradfitz, gustavo
CC=golang-dev, remy
https://golang.org/cl/5616072
2012-02-06 19:16:26 +01:00
Russ Cox
a6d8b483b6 runtime: make garbage collector faster by deleting code
Suggested by Sanjay Ghemawat.  5-20% faster depending
on the benchmark.

Add tree2 garbage benchmark.
Update other garbage benchmarks to build again.

R=golang-dev, r, adg
CC=golang-dev
https://golang.org/cl/5530074
2012-01-10 19:49:11 -08:00
Russ Cox
851f30136d runtime: make more build-friendly
Collapse the arch,os-specific directories into the main directory
by renaming xxx/foo.c to foo_xxx.c, and so on.

There are no substantial edits here, except to the Makefile.
The assumption is that the Go tool will #define GOOS_darwin
and GOARCH_amd64 and will make any file named something
like signals_darwin.h available as signals_GOOS.h during the
build.  This replaces what used to be done with -I$(GOOS).

There is still work to be done to make runtime build with
standard tools, but this is a big step.  After this we will have
to write a script to generate all the generated files so they
can be checked in (instead of generated during the build).

R=r, iant, r, lucio.dere
CC=golang-dev
https://golang.org/cl/5490053
2011-12-16 15:33:58 -05:00
Russ Cox
8219cc9af8 runtime: fix memory leak in parallel garbage collector
The work buffer management used by the garbage
collector during parallel collections leaks buffers.
This CL tests for and fixes the leak.

R=golang-dev, dvyukov, r
CC=golang-dev
https://golang.org/cl/5254059
2011-10-12 13:23:34 -04:00
Hector Chu
8584445289 runtime: fix crash when returning from syscall during gc
gp->m can go from non-nil to nil when it re-enters schedule().

R=golang-dev
CC=golang-dev, rsc
https://golang.org/cl/5245042
2011-10-11 12:57:16 -04:00
Dmitriy Vyukov
c14b2689f0 runtime: faster finalizers
Linux/amd64, 2 x Intel Xeon E5620, 8 HT cores, 2.40GHz
benchmark                    old ns/op    new ns/op    delta
BenchmarkFinalizer              420.00       261.00  -37.86%
BenchmarkFinalizer-2            985.00       201.00  -79.59%
BenchmarkFinalizer-4           1077.00       244.00  -77.34%
BenchmarkFinalizer-8           1155.00       180.00  -84.42%
BenchmarkFinalizer-16          1182.00       184.00  -84.43%

BenchmarkFinalizerRun          2128.00      1378.00  -35.24%
BenchmarkFinalizerRun-2        1655.00      1418.00  -14.32%
BenchmarkFinalizerRun-4        1634.00      1522.00   -6.85%
BenchmarkFinalizerRun-8        2213.00      1581.00  -28.56%
BenchmarkFinalizerRun-16       2424.00      1599.00  -34.03%

Darwin/amd64, Intel L9600, 2 cores, 2.13GHz
benchmark                    old ns/op    new ns/op    delta
BenchmarkChanCreation          1451.00       926.00  -36.18%
BenchmarkChanCreation-2        3124.00      1412.00  -54.80%
BenchmarkChanCreation-4        6121.00      2628.00  -57.07%

BenchmarkFinalizer              684.00       420.00  -38.60%
BenchmarkFinalizer-2          11195.00       398.00  -96.44%
BenchmarkFinalizer-4          15862.00       654.00  -95.88%

BenchmarkFinalizerRun          2025.00      1397.00  -31.01%
BenchmarkFinalizerRun-2        3920.00      1447.00  -63.09%
BenchmarkFinalizerRun-4        9471.00      1545.00  -83.69%

R=golang-dev, cw, rsc
CC=golang-dev
https://golang.org/cl/4963057
2011-10-06 18:42:51 +03:00
Dmitriy Vyukov
5695915833 runtime: fix spurious deadlock reporting
Fixes #2337.
Unfortunate sequence of events is:
1. maxcpu=2, mcpu=1, grunning=1
2. starttheworld creates an extra M:
   maxcpu=2, mcpu=2, grunning=1
4. the goroutine calls runtime.GOMAXPROCS(1)
   maxcpu=1, mcpu=2, grunning=1
5. since it sees mcpu>maxcpu, it calls gosched()
6. schedule() deschedules the goroutine:
   maxcpu=1, mcpu=1, grunning=0
7. schedule() call getnextandunlock() which
   fails to pick up the goroutine again,
   because canaddcpu() fails, because mcpu==maxcpu
8. then it sees that grunning==0,
   reports deadlock and terminates

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5191044
2011-10-06 18:10:14 +03:00
Joel Sing
a5f064a3e1 gc: limit helper threads based on ncpu
When ncpu < 2, work.nproc is always 1 which results in infinite helper
threads being created if gomaxprocs > 1 and MaxGcproc > 1. Avoid this
by using the same limits as imposed helpgc().

R=golang-dev, rsc, dvyukov
CC=golang-dev
https://golang.org/cl/5176044
2011-10-05 12:08:28 -04:00
Russ Cox
d324f2143b runtime: parallelize garbage collector mark + sweep
Running test/garbage/parser.out.

On a 4-core Lenovo X201s (Linux):
31.12u 0.60s 31.74r 	 1 cpu, no atomics
32.27u 0.58s 32.86r 	 1 cpu, atomic instructions
33.04u 0.83s 27.47r 	 2 cpu

On a 16-core Xeon (Linux):
33.08u 0.65s 33.80r 	 1 cpu, no atomics
34.87u 1.12s 29.60r 	 2 cpu
36.00u 1.87s 28.43r 	 3 cpu
36.46u 2.34s 27.10r 	 4 cpu
38.28u 3.85s 26.92r 	 5 cpu
37.72u 5.25s 26.73r	 6 cpu
39.63u 7.11s 26.95r	 7 cpu
39.67u 8.10s 26.68r	 8 cpu

On a 2-core MacBook Pro Core 2 Duo 2.26 (circa 2009, MacBookPro5,5):
39.43u 1.45s 41.27r 	 1 cpu, no atomics
43.98u 2.95s 38.69r 	 2 cpu

On a 2-core Mac Mini Core 2 Duo 1.83 (circa 2008; Macmini2,1):
48.81u 2.12s 51.76r 	 1 cpu, no atomics
57.15u 4.72s 51.54r 	 2 cpu

The handoff algorithm is really only good for two cores.
Beyond that we will need to so something more sophisticated,
like have each core hand off to the next one, around a circle.
Even so, the code is a good checkpoint; for now we'll limit the
number of gc procs to at most 2.

R=dvyukov
CC=golang-dev
https://golang.org/cl/4641082
2011-09-30 09:40:01 -04:00
Hector Chu
5c30325983 runtime: eliminate handle churn when churning channels on Windows
The Windows implementation of the net package churns through a couple of channels for every read/write operation.  This translates into a lot of time spent in the kernel creating and deleting event objects.

R=rsc, dvyukov, alex.brainman, jp
CC=golang-dev
https://golang.org/cl/4997044
2011-09-14 20:23:21 -04:00
Russ Cox
03e9ea5b74 runtime: simplify stack traces
Make the stack traces more readable for new
Go programmers while preserving their utility for old hands.

- Change status number [4] to string.
- Elide frames in runtime package (internal details).
- Swap file:line and arguments.
- Drop 'created by' for main goroutine.
- Show goroutines in order of allocation:
  implies main goroutine first if nothing else.

There is no option to get the extra frames back.
Uncomment 'return 1' at the bottom of symtab.c.

$ 6.out
throw: all goroutines are asleep - deadlock!

goroutine 1 [chan send]:
main.main()
       /Users/rsc/g/go/src/pkg/runtime/x.go:22 +0x8a

goroutine 2 [select (no cases)]:
main.sel()
       /Users/rsc/g/go/src/pkg/runtime/x.go:11 +0x18
created by main.main
       /Users/rsc/g/go/src/pkg/runtime/x.go:19 +0x23

goroutine 3 [chan receive]:
main.recv(0xf8400010a0, 0x0)
       /Users/rsc/g/go/src/pkg/runtime/x.go:15 +0x2e
created by main.main
       /Users/rsc/g/go/src/pkg/runtime/x.go:20 +0x50

goroutine 4 [chan receive (nil chan)]:
main.recv(0x0, 0x0)
       /Users/rsc/g/go/src/pkg/runtime/x.go:15 +0x2e
created by main.main
       /Users/rsc/g/go/src/pkg/runtime/x.go:21 +0x66
$

$ 6.out index
panic: runtime error: index out of range

goroutine 1 [running]:
main.main()
        /Users/rsc/g/go/src/pkg/runtime/x.go:25 +0xb9
$

$ 6.out nil
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x0 pc=0x22ca]

goroutine 1 [running]:
main.main()
        /Users/rsc/g/go/src/pkg/runtime/x.go:28 +0x211
$

$ 6.out panic
panic: panic

goroutine 1 [running]:
main.main()
        /Users/rsc/g/go/src/pkg/runtime/x.go:30 +0x101
$

R=golang-dev, qyzhai, n13m3y3r, r
CC=golang-dev
https://golang.org/cl/4907048
2011-08-22 23:26:39 -04:00
Dmitriy Vyukov
a2677cf363 runtime: fix GC bitmap corruption
The corruption can occur when GOMAXPROCS
is changed from >1 to 1, since GOMAXPROCS=1
does not imply there is only 1 goroutine running,
other goroutines can still be not parked after
the change.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4873050
2011-08-16 16:53:02 -04:00
Russ Cox
226fb099d9 runtime: add UpdateMemStats, use in tests
Drops mallocrep1.go back to a reasonable
amount of time.  (154 -> 0.8 seconds on my Mac)

Fixes #2085.

R=golang-dev, dvyukov, r
CC=golang-dev
https://golang.org/cl/4811045
2011-07-22 00:55:01 -04:00
Dmitriy Vyukov
66d5c9b1e9 runtime: add per-M caches for MemStats
Avoid touching centralized state during
memory manager opreations.

R=rsc
CC=golang-dev
https://golang.org/cl/4766042
2011-07-18 14:52:57 -04:00
Russ Cox
37b3494026 runtime: fix typo in gc bug fix
This time for sure.

R=golang-dev, dsymonds
CC=golang-dev
https://golang.org/cl/4437078
2011-04-28 00:20:37 -04:00
Russ Cox
370276a3e5 runtime: stack split + garbage collection bug
The g->sched.sp saved stack pointer and the
g->stackbase and g->stackguard stack bounds
can change even while "the world is stopped",
because a goroutine has to call functions (and
therefore might split its stack) when exiting a
system call to check whether the world is stopped
(and if so, wait until the world continues).

That means the garbage collector cannot access
those values safely (without a race) for goroutines
executing system calls.  Instead, save a consistent
triple in g->gcsp, g->gcstack, g->gcguard during
entersyscall and have the garbage collector refer
to those.

The old code was occasionally seeing (because of
the race) an sp and stk that did not correspond to
each other, so that stk - sp was not the number of
stack bytes following sp.  In that case, if sp < stk
then the call scanblock(sp, stk - sp) scanned too
many bytes (anything between the two pointers,
which pointed into different allocation blocks).
If sp > stk then stk - sp wrapped around.
On 32-bit, stk - sp is a uintptr (uint32) converted
to int64 in the call to scanblock, so a large (~4G)
but positive number.  Scanblock would try to scan
that many bytes and eventually fault accessing
unmapped memory.  On 64-bit, stk - sp is a uintptr (uint64)
promoted to int64 in the call to scanblock, so a negative
number.  Scanblock would not scan anything, possibly
causing in-use blocks to be freed.

In short, 32-bit platforms would have seen either
ineffective garbage collection or crashes during garbage
collection, while 64-bit platforms would have seen
either ineffective or incorrect garbage collection.
You can see the invalid arguments to scanblock in the
stack traces in issue 1620.

Fixes #1620.
Fixes #1746.

R=iant, r
CC=golang-dev
https://golang.org/cl/4437075
2011-04-27 23:21:12 -04:00
Ian Lance Taylor
6892155ded runtime: remove unused declarations from mgc0.c.
R=rsc
CC=golang-dev
https://golang.org/cl/4252063
2011-03-07 15:30:25 -08:00
Russ Cox
f9ca3b5d5b runtime: scheduler, cgo reorganization
* Change use of m->g0 stack (aka scheduler stack).
* Provide runtime.mcall(f) to invoke f() on m->g0 stack.
* Replace scheduler loop entry with runtime.mcall(schedule).

Runtime.mcall eliminates the need for fake scheduler states that
exist just to run a bit of code on the m->g0 stack
(Grecovery, Gstackalloc).

The elimination of the scheduler as a loop that stops and
starts using gosave and gogo fixes a bad interaction with the
way cgo uses the m->g0 stack.  Cgo runs external (gcc-compiled)
C functions on that stack, and then when calling back into Go,
it sets m->g0->sched.sp below the added call frames, so that
other uses of m->g0's stack will not interfere with those frames.
Unfortunately, gogo (longjmp) back to the scheduler loop at
this point would end up running scheduler with the lower
sp, which no longer points at a valid stack frame for
a call to scheduler.  If scheduler then wrote any function call
arguments or local variables to where it expected the stack
frame to be, it would overwrite other data on the stack.
I realized this possibility while debugging a problem with
calling complex Go code in a Go -> C -> Go cgo callback.
This wasn't the bug I was looking for, it turns out, but I believe
it is a real bug nonetheless.  Switching to runtime.mcall, which
only adds new frames to the stack and never jumps into
functions running in existing ones, fixes this bug.

* Move cgo-related code out of proc.c into cgocall.c.
* Add very large comment describing cgo call sequences.
* Simpilify, regularize cgo function implementations and names.
* Add test suite as misc/cgo/test.

Now the Go -> C path calls cgocall, which calls asmcgocall,
and the C -> Go path calls cgocallback, which calls cgocallbackg.

The shuffling, which affects mainly the callback case, moves
most of the callback implementation to cgocallback running
on the m->curg stack (not the m->g0 scheduler stack) and
only while accounted for with $GOMAXPROCS (between calls
to exitsyscall and entersyscall).

The previous callback code did not block in startcgocallback's
approximation to exitsyscall, so if, say, the garbage collector
were running, it would still barge in and start doing things
like call malloc.  Similarly endcgocallback's approximation of
entersyscall did not call matchmg to kick off new OS threads
when necessary, which caused the bug in issue 1560.

Fixes #1560.

R=iant
CC=golang-dev
https://golang.org/cl/4253054
2011-03-07 10:37:42 -05:00
Russ Cox
324cc3d040 runtime: record goroutine creation pc and display in traceback
package main

func main() {
        go func() { *(*int)(nil) = 0 }()
        select{}
}

panic: runtime error: invalid memory address or nil pointer dereference

[signal 0xb code=0x1 addr=0x0 pc=0x1c96]

runtime.panic+0xac /Users/rsc/g/go/src/pkg/runtime/proc.c:1083
        runtime.panic(0x11bf0, 0xf8400011f0)
runtime.panicstring+0xa3 /Users/rsc/g/go/src/pkg/runtime/runtime.c:116
        runtime.panicstring(0x29a57, 0x0)
runtime.sigpanic+0x144 /Users/rsc/g/go/src/pkg/runtime/darwin/thread.c:470
        runtime.sigpanic()
main._func_001+0x16 /Users/rsc/g/go/src/pkg/runtime/x.go:188
        main._func_001()
runtime.goexit /Users/rsc/g/go/src/pkg/runtime/proc.c:150
        runtime.goexit()
----- goroutine created by -----
main.main+0x3d /Users/rsc/g/go/src/pkg/runtime/x.go:4

goroutine 1 [4]:
runtime.gosched+0x77 /Users/rsc/g/go/src/pkg/runtime/proc.c:598
        runtime.gosched()
runtime.block+0x27 /Users/rsc/g/go/src/pkg/runtime/chan.c:680
        runtime.block()
main.main+0x44 /Users/rsc/g/go/src/pkg/runtime/x.go:5
        main.main()
runtime.mainstart+0xf /Users/rsc/g/go/src/pkg/runtime/amd64/asm.s:77
        runtime.mainstart()
runtime.goexit /Users/rsc/g/go/src/pkg/runtime/proc.c:150
        runtime.goexit()
----- goroutine created by -----
_rt0_amd64+0x8e /Users/rsc/g/go/src/pkg/runtime/amd64/asm.s:64

Fixes #1563.

R=r
CC=golang-dev
https://golang.org/cl/4243046
2011-03-02 13:42:02 -05:00
Russ Cox
b5dfac45ba runtime: always run stackalloc on scheduler stack
Avoids deadlocks like the one below, in which a stack split happened
in order to call lock(&stacks), but then the stack unsplit cannot run
because stacks is now locked.

The only code calling stackalloc that wasn't on a scheduler
stack already was malg, which creates a new goroutine.

runtime.futex+0x23 /home/rsc/g/go/src/pkg/runtime/linux/amd64/sys.s:139
       runtime.futex()
futexsleep+0x50 /home/rsc/g/go/src/pkg/runtime/linux/thread.c:51
       futexsleep(0x5b0188, 0x300000003, 0x100020000, 0x4159e2)
futexlock+0x85 /home/rsc/g/go/src/pkg/runtime/linux/thread.c:119
       futexlock(0x5b0188, 0x5b0188)
runtime.lock+0x56 /home/rsc/g/go/src/pkg/runtime/linux/thread.c:158
       runtime.lock(0x5b0188, 0x7f0d27b4a000)
runtime.stackfree+0x4d /home/rsc/g/go/src/pkg/runtime/malloc.goc:336
       runtime.stackfree(0x7f0d27b4a000, 0x1000, 0x8, 0x7fff37e1e218)
runtime.oldstack+0xa6 /home/rsc/g/go/src/pkg/runtime/proc.c:705
       runtime.oldstack()
runtime.lessstack+0x22 /home/rsc/g/go/src/pkg/runtime/amd64/asm.s:224
       runtime.lessstack()
----- lessstack called from goroutine 2 -----
runtime.lock+0x56 /home/rsc/g/go/src/pkg/runtime/linux/thread.c:158
       runtime.lock(0x5b0188, 0x40a5e2)
runtime.stackalloc+0x55 /home/rsc/g/go/src/pkg/runtime/malloc.c:316
       runtime.stackalloc(0x1000, 0x4055b0)
runtime.malg+0x3d /home/rsc/g/go/src/pkg/runtime/proc.c:803
       runtime.malg(0x1000, 0x40add9)
runtime.newproc1+0x12b /home/rsc/g/go/src/pkg/runtime/proc.c:854
       runtime.newproc1(0xf840027440, 0x7f0d27b49230, 0x0, 0x49f238, 0x40, ...)
runtime.newproc+0x2f /home/rsc/g/go/src/pkg/runtime/proc.c:831
       runtime.newproc(0x0, 0xf840027440, 0xf800000010, 0x44b059)
...

R=r, r2
CC=golang-dev
https://golang.org/cl/4216045
2011-02-23 15:51:20 -05:00
Russ Cox
250977690b runtime: fix memory allocator for GOMAXPROCS > 1
Bitmaps were not being updated safely.
Depends on 4188053.

Fixes #1504.
May fix issue 1479.

R=r, r2
CC=golang-dev
https://golang.org/cl/4184048
2011-02-16 13:21:20 -05:00
Russ Cox
1e063b32cd runtime: faster allocator, garbage collector
GC is still single-threaded.
Multiple threads will happen in another CL.

Garbage collection pauses are typically
about half as long as they were before this CL.

R=brainman, iant, r
CC=golang-dev
https://golang.org/cl/3975046
2011-02-02 23:03:47 -05:00
Russ Cox
4608feb18b runtime: simpler heap map, memory allocation
The old heap maps used a multilevel table, but that
was overkill: there are only 1M entries on a 32-bit
machine and we can arrange to use a dense address
range on a 64-bit machine.

The heap map is in bss.  The assumption is that if
we don't touch the pages they won't be mapped in.

Also moved some duplicated memory allocation
code out of the OS-specific files.

R=r
CC=golang-dev
https://golang.org/cl/4118042
2011-01-28 15:03:26 -05:00
Russ Cox
bcd910cfe2 runtime: add per-pause gc stats
R=r, r2
CC=golang-dev
https://golang.org/cl/3980042
2011-01-19 13:41:42 -05:00
Russ Cox
0c54225b51 remove nacl
The recent linker changes broke NaCl support
a month ago, and there are no known users of it.

The NaCl code can always be recovered from the
repository history.

R=adg, r
CC=golang-dev
https://golang.org/cl/3671042
2010-12-15 11:49:23 -05:00
Russ Cox
68b4255a96 runtime: ,s/[a-zA-Z0-9_]+/runtime·&/g, almost
Prefix all external symbols in runtime by runtime·,
to avoid conflicts with possible symbols of the same
name in linked-in C libraries.  The obvious conflicts
are printf, malloc, and free, but hide everything to
avoid future pain.

The symbols left alone are:

	** known to cgo **
	_cgo_free
	_cgo_malloc
	libcgo_thread_start
	initcgo
	ncgocall

	** known to linker **
	_rt0_$GOARCH
	_rt0_$GOARCH_$GOOS
	text
	etext
	data
	end
	pclntab
	epclntab
	symtab
	esymtab

	** known to C compiler **
	_divv
	_modv
	_div64by32
	etc (arch specific)

Tested on darwin/386, darwin/amd64, linux/386, linux/amd64.

Built (but not tested) for freebsd/386, freebsd/amd64, linux/arm, windows/386.

R=r, PeterGo
CC=golang-dev
https://golang.org/cl/2899041
2010-11-04 14:00:19 -04:00
Russ Cox
d4cc557b0d runtime: use manual stack for garbage collection
Old code was using recursion to traverse object graph.
New code uses an explicit stack, cutting the per-pointer
footprint to two words during the recursion and avoiding
the standard allocator and stack splitting code.

in test/garbage:

Reduces parser runtime by 2-3%
Reduces Peano runtime by 40%
Increases tree runtime by 4-5%

R=r
CC=golang-dev
https://golang.org/cl/2150042
2010-09-07 09:57:22 -04:00
Ian Lance Taylor
385bfd4ca0 Remove unused declaration.
R=rsc
CC=golang-dev
https://golang.org/cl/1686054
2010-07-16 11:05:38 -07:00
Russ Cox
c6138efbcb runtime: closures, defer bug fix for Native Client
Enable package tests for Native Client build.

R=r
CC=golang-dev
https://golang.org/cl/957042
2010-04-22 17:52:22 -07:00
Russ Cox
214a55b06a runtime: switch state back to Grunning after recovery
Fixes #733.

R=r
CC=golang-dev
https://golang.org/cl/958041
2010-04-21 16:27:41 -07:00
Russ Cox
72157c300b runtime: fix bad status throw
when garbage collector sees recovering goroutine

Fixes #711.

R=r
CC=golang-dev
https://golang.org/cl/869045
2010-04-08 13:24:53 -07:00
Russ Cox
24c58174b2 runtime: use explicit flag when finalizer goroutine is waiting
Avoids spurious wakeups during other sleeping by that goroutine.
Fixes #711.

R=r
CC=golang-dev
https://golang.org/cl/902041
2010-04-07 20:38:02 -07:00
Russ Cox
4e28cfe970 runtime: run all finalizers in a single goroutine.
eliminate second pass of mark+sweep
by scanning finalizer table specially.

R=r
CC=golang-dev
https://golang.org/cl/782041
2010-03-26 14:15:30 -07:00