1
0
mirror of https://github.com/golang/go synced 2024-10-04 10:21:21 -06:00
Commit Graph

280 Commits

Author SHA1 Message Date
Anthony Martin
64e041652a runtime: call symtabinit earlier
Otherwise, we won't get a stack trace in some of the early init.

Here's one example:

        http://build.golang.org/log/a96d10f6aee1fa3e3ae51f41da46d414a7ab02de

After walking the stack by hand in acid, I was able to determine
that the stackalloc inside mpreinit was causing the throw.

LGTM=rsc
R=rsc, dvyukov
CC=golang-codereviews
https://golang.org/cl/72450044
2014-03-12 19:42:58 -07:00
Shenghou Ma
84570aa9a1 runtime: round stack size to power of 2.
Fixes build on windows/386 and plan9/386.
Fixes #7487.

LGTM=mattn.jp, dvyukov, rsc
R=golang-codereviews, mattn.jp, dvyukov, 0intro, rsc
CC=golang-codereviews
https://golang.org/cl/72360043
2014-03-07 15:11:16 -05:00
Dmitriy Vyukov
1a89e6388c runtime: refactor and fix stack management code
There are at least 3 bugs:
1. g->stacksize accounting is broken during copystack/shrinkstack
2. stktop->free is not properly maintained during copystack/shrinkstack
3. stktop->free logic is broken:
        we can have stktop->free==FixedStack,
        and we will free it into stack cache,
        but it actually comes from heap as the result of non-copying segment shrink
This shows as at least spurious races on race builders (maybe something else as well I don't know).

The idea behind the refactoring is to consolidate stacksize and
segment origin logic in stackalloc/stackfree.

Fixes #7490.

LGTM=rsc, khr
R=golang-codereviews, rsc, khr
CC=golang-codereviews
https://golang.org/cl/72440043
2014-03-07 20:52:29 +04:00
Dmitriy Vyukov
a1695d2ea3 runtime: use custom thunks for race calls instead of cgo
Implement custom assembly thunks for hot race calls (memory accesses and function entry/exit).
The thunks extract caller pc, verify that the address is in heap or global and switch to g0 stack.

Before:
ok  	regexp	3.692s
ok  	compress/bzip2	9.461s
ok  	encoding/json	6.380s
After:
ok  	regexp	2.229s (-40%)
ok  	compress/bzip2	4.703s (-50%)
ok  	encoding/json	3.629s (-43%)

For comparison, normal non-race build:
ok  	regexp	0.348s
ok  	compress/bzip2	0.304s
ok  	encoding/json	0.661s
Race build:
ok  	regexp	2.229s (+540%)
ok  	compress/bzip2	4.703s (+1447%)
ok  	encoding/json	3.629s (+449%)

Also removes some race-related special cases from cgocall and scheduler.
In long-term it will allow to remove cyclic runtime/race dependency on cmd/cgo.

Fixes #4249.
Fixes #7460.
Update #6508
Update #6688

R=iant, rsc, bradfitz
CC=golang-codereviews
https://golang.org/cl/55100044
2014-03-06 23:48:30 +04:00
Dmitriy Vyukov
8ca3372d7b runtime: fix bad g status after copystack
LGTM=khr
R=khr
CC=golang-codereviews, rsc
https://golang.org/cl/69870054
2014-03-06 21:33:19 +04:00
Russ Cox
1249d3a518 runtime: handle Go calls C calls Go panic correctly on windows/386
32-bit Windows uses "structured exception handling" (SEH) to
handle hardware faults: that there is a per-thread linked list
of fault handlers maintained in user space instead of
something like Unix's signal handlers. The structures in the
linked list are required to live on the OS stack, and the
usual discipline is that the function that pushes a record
(allocated from the current stack frame) onto the list pops
that record before returning. Not to pop the entry before
returning creates a dangling pointer error: the list head
points to a stack frame that no longer exists.

Go pushes an SEH record in the top frame of every OS thread,
and that record suffices for all Go execution on that thread,
at least until cgo gets involved.

If we call into C using cgo, that called C code may push its
own SEH records, but by the convention it must pop them before
returning back to the Go code. We assume it does, and that's
fine.

If the C code calls back into Go, we want the Go SEH handler
to become active again, not whatever C has set up. So
runtime.callbackasm1, which handles a call from C back into
Go, pushes a new SEH record before calling the Go code and
pops it when the Go code returns. That's also fine.

It can happen that when Go calls C calls Go like this, the
inner Go code panics. We allow a defer in the outer Go to
recover the panic, effectively wiping not only the inner Go
frames but also the C calls. This sequence was not popping the
SEH stack up to what it was before the cgo calls, so it was
creating the dangling pointer warned about above. When
eventually the m stack was used enough to overwrite the
dangling SEH records, the SEH chain was lost, and any future
panic would not end up in Go's handler.

The bug in TestCallbackPanic and friends was thus creating a
situation where TestSetPanicOnFault - which causes a hardware
fault - would not find the Go fault handler and instead crash
the binary.

Add checks to TestCallbackPanicLocked to diagnose the mistake
in that test instead of leaving a bad state for another test
case to stumble over.

Fix bug by restoring SEH chain during deferred "endcgo"
cleanup.

This bug is likely present in Go 1.2.1, but since it depends
on Go calling C calling Go, with the inner Go panicking and
the outer Go recovering the panic, it seems not important
enough to bother fixing before Go 1.3. Certainly no one has
complained.

Fixes #7470.

LGTM=alex.brainman
R=golang-codereviews, alex.brainman
CC=golang-codereviews, iant, khr
https://golang.org/cl/71440043
2014-03-05 11:10:40 -05:00
Keith Randall
e9445547b6 runtime: move stack shrinking until after sweepgen is incremented.
Before GC, we flush all the per-P allocation caches.  Doing
stack shrinking mid-GC causes these caches to fill up.  At the
end of gc, the sweepgen is incremented which causes all of the
data in these caches to be in a bad state (cached but not yet
swept).

Move the stack shrinking until after sweepgen is incremented,
so any caching that happens as part of shrinking is done with
already-swept data.

Reenable stack copying.

LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/69620043
2014-02-27 14:20:15 -08:00
Keith Randall
f50a87058b runtime: disable stack copying
TBR=dvyukov

TBR=dvyukov
CC=golang-codereviews
https://golang.org/cl/69080045
2014-02-27 01:45:22 -08:00
Keith Randall
1665b006a5 runtime: grow stack by copying
On stack overflow, if all frames on the stack are
copyable, we copy the frames to a new stack twice
as large as the old one.  During GC, if a G is using
less than 1/4 of its stack, copy the stack to a stack
half its size.

TODO
- Do something about C frames.  When a C frame is in the
  stack segment, it isn't copyable.  We allocate a new segment
  in this case.
  - For idempotent C code, we can abort it, copy the stack,
    then retry.  I'm working on a separate CL for this.
  - For other C code, we can raise the stackguard
    to the lowest Go frame so the next call that Go frame
    makes triggers a copy, which will then succeed.
- Pick a starting stack size?

The plan is that eventually we reach a point where the
stack contains only copyable frames.

LGTM=rsc
R=dvyukov, rsc
CC=golang-codereviews
https://golang.org/cl/54650044
2014-02-26 23:28:44 -08:00
Russ Cox
e56c6e7535 runtime/debug: add SetPanicOnFault
SetPanicOnFault allows recovery from unexpected memory faults.
This can be useful if you are using a memory-mapped file
or probing the address space of the current program.

LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/66590044
2014-02-20 16:18:05 -05:00
Russ Cox
67c83db60d runtime: use goc2c as much as possible
Package runtime's C functions written to be called from Go
started out written in C using carefully constructed argument
lists and the FLUSH macro to write a result back to memory.

For some functions, the appropriate parameter list ended up
being architecture-dependent due to differences in alignment,
so we added 'goc2c', which takes a .goc file containing Go func
declarations but C bodies, rewrites the Go func declaration to
equivalent C declarations for the target architecture, adds the
needed FLUSH statements, and writes out an equivalent C file.
That C file is compiled as part of package runtime.

Native Client's x86-64 support introduces the most complex
alignment rules yet, breaking many functions that could until
now be portably written in C. Using goc2c for those avoids the
breakage.

Separately, Keith's work on emitting stack information from
the C compiler would require the hand-written functions
to add #pragmas specifying how many arguments are result
parameters. Using goc2c for those avoids maintaining #pragmas.

For both reasons, use goc2c for as many Go-called C functions
as possible.

This CL is a replay of the bulk of CL 15400047 and CL 15790043,
both of which were reviewed as part of the NaCl port and are
checked in to the NaCl branch. This CL is part of bringing the
NaCl code into the main tree.

No new code here, just reformatting and occasional movement
into .h files.

LGTM=r
R=dave, alex.brainman, r
CC=golang-codereviews
https://golang.org/cl/65220044
2014-02-20 15:58:47 -05:00
Russ Cox
53061193f1 cmd/gc, runtime: enable precisestack by default
[Repeat of CL 64100044, after 32-bit fix in CL 66170043.]

Precisestack makes stack collection completely precise,
in the sense that there are no "used and not set" errors
in the collection of stack frames, no times where the collector
reads a pointer from a stack word that has not actually been
initialized with a pointer (possibly a nil pointer) in that function.

The most important part is interfaces: precisestack means
that if reading an interface value, the interface value is guaranteed
to be initialized, meaning that the type word can be relied
upon to be either nil or a valid interface type word describing
the data word.

This requires additional zeroing of certain values on the stack
on entry, which right now costs about 5% overall execution
time in all.bash. That cost will come down before Go 1.3
(issue 7345).

There are at least two known garbage collector bugs right now,
issues 7343 and 7344. The first happens even without precisestack.
The second I have only seen with precisestack, but that does not
mean that precisestack is what causes it. In fact it is very difficult
to explain by what precisestack does directly. Precisestack may
be exacerbating an existing problem. Both of those issues are
marked for Go 1.3 as well.

The reasons for enabling precisestack now are to give it more
time to soak and because the copying stack work depends on it.

LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/65820044
2014-02-19 17:09:08 -05:00
Russ Cox
aad23e708c undo CL 64100044 / 04d062c2e81c
broke 32-bit builds

««« original CL description
cmd/gc, runtime: enable precisestack by default

Precisestack makes stack collection completely precise,
in the sense that there are no "used and not set" errors
in the collection of stack frames, no times where the collector
reads a pointer from a stack word that has not actually been
initialized with a pointer (possibly a nil pointer) in that function.

The most important part is interfaces: precisestack means
that if reading an interface value, the interface value is guaranteed
to be initialized, meaning that the type word can be relied
upon to be either nil or a valid interface type word describing
the data word.

This requires additional zeroing of certain values on the stack
on entry, which right now costs about 5% overall execution
time in all.bash. That cost will come down before Go 1.3
(issue 7345).

There are at least two known garbage collector bugs right now,
issues 7343 and 7344. The first happens even without precisestack.
The second I have only seen with precisestack, but that does not
mean that precisestack is what causes it. In fact it is very difficult
to explain by what precisestack does directly. Precisestack may
be exacerbating an existing problem. Both of those issues are
marked for Go 1.3 as well.

The reasons for enabling precisestack now are to give it more
time to soak and because the copying stack work depends on it.

LGTM=r
R=r
CC=golang-codereviews, iant, khr
https://golang.org/cl/64100044
»»»

TBR=r
CC=golang-codereviews
https://golang.org/cl/65230043
2014-02-17 21:34:58 -05:00
Russ Cox
ecf700b5ee cmd/gc, runtime: enable precisestack by default
Precisestack makes stack collection completely precise,
in the sense that there are no "used and not set" errors
in the collection of stack frames, no times where the collector
reads a pointer from a stack word that has not actually been
initialized with a pointer (possibly a nil pointer) in that function.

The most important part is interfaces: precisestack means
that if reading an interface value, the interface value is guaranteed
to be initialized, meaning that the type word can be relied
upon to be either nil or a valid interface type word describing
the data word.

This requires additional zeroing of certain values on the stack
on entry, which right now costs about 5% overall execution
time in all.bash. That cost will come down before Go 1.3
(issue 7345).

There are at least two known garbage collector bugs right now,
issues 7343 and 7344. The first happens even without precisestack.
The second I have only seen with precisestack, but that does not
mean that precisestack is what causes it. In fact it is very difficult
to explain by what precisestack does directly. Precisestack may
be exacerbating an existing problem. Both of those issues are
marked for Go 1.3 as well.

The reasons for enabling precisestack now are to give it more
time to soak and because the copying stack work depends on it.

LGTM=r
R=r
CC=golang-codereviews, iant, khr
https://golang.org/cl/64100044
2014-02-17 20:12:40 -05:00
Dmitriy Vyukov
47534ddc68 runtime: remove misleading message during crash
The following checkdead message is false positive:

$ go test -race -c runtime
$ ./runtime.test -test.cpu=2 -test.run=TestSmhasherWindowed -test.v
=== RUN TestSmhasherWindowed-2
checkdead: find g 18 in status 1
SIGABRT: abort
PC=0x42bff1

LGTM=rsc
R=golang-codereviews, gobot, rsc
CC=golang-codereviews, iant, khr
https://golang.org/cl/59490046
2014-02-14 13:24:48 +04:00
Dmitriy Vyukov
eca55f5ac0 runtime: fix windows cpu profiler
Currently it periodically fails with the following message.
The immediate cause is the wrong base register when obtaining g
in sys_windows_amd64/386.s.
But there are several secondary problems as well.

runtime: unknown pc 0x0 after stack split
panic: invalid memory address or nil pointer dereference
fatal error: panic during malloc
[signal 0xc0000005 code=0x0 addr=0x60 pc=0x42267a]

runtime stack:
runtime.panic(0x7914c0, 0xc862af)
        c:/src/perfer/work/windows-amd64-a15f344a9efa/go/src/pkg/runtime/panic.c:217 +0x2c
runtime: unexpected return pc for runtime.externalthreadhandler called from 0x0

R=rsc, alex.brainman
CC=golang-codereviews
https://golang.org/cl/63310043
2014-02-14 09:20:51 +04:00
Dmitriy Vyukov
5e72fae9b2 runtime: improve cpu profiles for GC/syscalls/cgo
Current "System->etext" is not very informative.
Add parent "GC" frame.
Replace un-unwindable syscall/cgo frames with Go stack that leads to the call.

LGTM=rsc
R=rsc, alex.brainman, ality
CC=golang-codereviews
https://golang.org/cl/61270043
2014-02-12 22:31:36 +04:00
Dmitriy Vyukov
373e1e94d8 runtime: fix crash during cpu profiling
mp->mcache can be concurrently modified by runtime·helpgc.
In such case sigprof can remember mcache=nil, then helpgc sets it to non-nil,
then sigprof restores it back to nil, GC crashes with nil mcache.

R=rsc
CC=golang-codereviews
https://golang.org/cl/58860044
2014-02-10 20:24:47 +04:00
Dmitriy Vyukov
0229dc6dbe runtime: do not cpu profile idle threads on windows
Currently this leads to a significant skew towards 'etext' entry,
since all idle threads are profiled every tick.
Before:
Total: 66608 samples
   63188  94.9%  94.9%    63188  94.9% etext
     278   0.4%  95.3%      278   0.4% sweepspan
     216   0.3%  95.6%      448   0.7% runtime.mallocgc
     122   0.2%  95.8%      122   0.2% scanblock
     113   0.2%  96.0%      113   0.2% net/textproto.canonicalMIMEHeaderKey
After:
Total: 8008 samples
    3949  49.3%  49.3%     3949  49.3% etext
     231   2.9%  52.2%      231   2.9% scanblock
     211   2.6%  54.8%      211   2.6% runtime.cas64
     182   2.3%  57.1%      408   5.1% runtime.mallocgc
     178   2.2%  59.3%      178   2.2% runtime.atomicload64

LGTM=alex.brainman
R=golang-codereviews, alex.brainman
CC=golang-codereviews
https://golang.org/cl/61250043
2014-02-10 15:40:55 +04:00
Dmitriy Vyukov
179d41fecc runtime: tune P retake logic
When GOMAXPROCS>1 the last P in syscall is never retaken
(because there are already idle P's -- npidle>0).
This prevents sysmon thread from sleeping.
On a darwin machine the program from issue 6673 constantly
consumes ~0.2% CPU. With this change it stably consumes 0.0% CPU.
Fixes #6673.

R=golang-codereviews, r
CC=bradfitz, golang-codereviews, iant, khr
https://golang.org/cl/56990045
2014-01-27 23:17:46 +04:00
Dmitriy Vyukov
f8e0057bb7 sync: scalable Pool
Introduce fixed-size P-local caches.
When local caches overflow/underflow a batch of items
is transferred to/from global mutex-protected cache.

benchmark                    old ns/op    new ns/op    delta
BenchmarkPool                    50554        22423  -55.65%
BenchmarkPool-4                 400359         5904  -98.53%
BenchmarkPool-16                403311         1598  -99.60%
BenchmarkPool-32                367310         1526  -99.58%

BenchmarkPoolOverlflow            5214         3633  -30.32%
BenchmarkPoolOverlflow-4         42663         9539  -77.64%
BenchmarkPoolOverlflow-8         46919        11385  -75.73%
BenchmarkPoolOverlflow-16        39454        13048  -66.93%

BenchmarkSprintfEmpty                    84           63  -25.68%
BenchmarkSprintfEmpty-2                 371           32  -91.13%
BenchmarkSprintfEmpty-4                 465           22  -95.25%
BenchmarkSprintfEmpty-8                 565           12  -97.77%
BenchmarkSprintfEmpty-16                498            5  -98.87%
BenchmarkSprintfEmpty-32                492            4  -99.04%

BenchmarkSprintfString                  259          229  -11.58%
BenchmarkSprintfString-2                574          144  -74.91%
BenchmarkSprintfString-4                651           77  -88.05%
BenchmarkSprintfString-8                868           47  -94.48%
BenchmarkSprintfString-16               825           33  -95.96%
BenchmarkSprintfString-32               825           30  -96.28%

BenchmarkSprintfInt                     213          188  -11.74%
BenchmarkSprintfInt-2                   448          138  -69.20%
BenchmarkSprintfInt-4                   624           52  -91.63%
BenchmarkSprintfInt-8                   691           31  -95.43%
BenchmarkSprintfInt-16                  724           18  -97.46%
BenchmarkSprintfInt-32                  718           16  -97.70%

BenchmarkSprintfIntInt                  311          282   -9.32%
BenchmarkSprintfIntInt-2                333          145  -56.46%
BenchmarkSprintfIntInt-4                642          110  -82.87%
BenchmarkSprintfIntInt-8                832           42  -94.90%
BenchmarkSprintfIntInt-16               817           24  -97.00%
BenchmarkSprintfIntInt-32               805           22  -97.17%

BenchmarkSprintfPrefixedInt             309          269  -12.94%
BenchmarkSprintfPrefixedInt-2           245          168  -31.43%
BenchmarkSprintfPrefixedInt-4           598           99  -83.36%
BenchmarkSprintfPrefixedInt-8           770           67  -91.23%
BenchmarkSprintfPrefixedInt-16          829           54  -93.49%
BenchmarkSprintfPrefixedInt-32          824           50  -93.83%

BenchmarkSprintfFloat                   418          398   -4.78%
BenchmarkSprintfFloat-2                 295          203  -31.19%
BenchmarkSprintfFloat-4                 585          128  -78.12%
BenchmarkSprintfFloat-8                 873           60  -93.13%
BenchmarkSprintfFloat-16                884           33  -96.24%
BenchmarkSprintfFloat-32                881           29  -96.62%

BenchmarkManyArgs                      1097         1069   -2.55%
BenchmarkManyArgs-2                     705          567  -19.57%
BenchmarkManyArgs-4                     792          319  -59.72%
BenchmarkManyArgs-8                     963          172  -82.14%
BenchmarkManyArgs-16                   1115          103  -90.76%
BenchmarkManyArgs-32                   1133           90  -92.03%

LGTM=rsc
R=golang-codereviews, bradfitz, minux.ma, gobot, rsc
CC=golang-codereviews
https://golang.org/cl/46010043
2014-01-24 22:29:53 +04:00
Dmitriy Vyukov
9cbd2fb1aa runtime: remove locks from netpoll hotpaths
Introduces two-phase goroutine parking mechanism -- prepare to park, commit park.
This mechanism does not require backing mutex to protect wait predicate.
Use it in netpoll. See comment in netpoll.goc for details.
This slightly reduces contention between reader, writer and read/write io notifications;
and just eliminates a bunch of mutex operations from hotpaths, thus making then faster.

benchmark                             old ns/op    new ns/op    delta
BenchmarkTCP4ConcurrentReadWrite           2109         1945   -7.78%
BenchmarkTCP4ConcurrentReadWrite-2         1162         1113   -4.22%
BenchmarkTCP4ConcurrentReadWrite-4          798          755   -5.39%
BenchmarkTCP4ConcurrentReadWrite-8          803          748   -6.85%
BenchmarkTCP4Persistent                    9411         9240   -1.82%
BenchmarkTCP4Persistent-2                  5888         5813   -1.27%
BenchmarkTCP4Persistent-4                  4016         3968   -1.20%
BenchmarkTCP4Persistent-8                  3943         3857   -2.18%

R=golang-codereviews, mikioh.mikioh, gobot, iant, rsc
CC=golang-codereviews, khr
https://golang.org/cl/45700043
2014-01-22 11:27:16 +04:00
Dmitriy Vyukov
98b50b89a8 runtime: allocate goroutine ids in batches
Helps reduce contention on sched.goidgen.

benchmark                               old ns/op    new ns/op    delta
BenchmarkCreateGoroutines-16                  259          237   -8.49%
BenchmarkCreateGoroutinesParallel-16          127           43  -66.06%

R=golang-codereviews, dave, bradfitz, khr
CC=golang-codereviews, rsc
https://golang.org/cl/46970043
2014-01-22 10:34:36 +04:00
Dmitriy Vyukov
8a3c587dc1 runtime: fix and improve CPU profiling
- do not lose profiling signals when we have no mcache (possible for syscalls/cgo)
- do not lose any profiling signals on windows
- fix profiling of cgo programs on windows (they had no m->thread setup)
- properly setup tls in cgo programs on windows
- check _beginthread return value

Fixes #6417.
Fixes #6986.

R=alex.brainman, rsc
CC=golang-codereviews
https://golang.org/cl/44820047
2014-01-22 10:30:10 +04:00
Dmitriy Vyukov
cb133c6607 runtime: do not collect GC roots explicitly
Currently we collect (add) all roots into a global array in a single-threaded GC phase.
This hinders parallelism.
With this change we just kick off parallel for for number_of_goroutines+5 iterations.
Then parallel for callback decides whether it needs to scan stack of a goroutine
scan data segment, scan finalizers, etc. This eliminates the single-threaded phase entirely.
This requires to store all goroutines in an array instead of a linked list
(to allow direct indexing).
This CL also removes DebugScan functionality. It is broken because it uses
unbounded stack, so it can not run on g0. When it was working, I've found
it helpless for debugging issues because the two algorithms are too different now.
This change would require updating the DebugScan, so it's simpler to just delete it.

With 8 threads this change reduces GC pause by ~6%, while keeping cputime roughly the same.

garbage-8
allocated                 2987886      2989221      +0.04%
allocs                      62885        62887      +0.00%
cputime                  21286000     21272000      -0.07%
gc-pause-one             26633247     24885421      -6.56%
gc-pause-total             873570       811264      -7.13%
rss                     242089984    242515968      +0.18%
sys-gc                   13934336     13869056      -0.47%
sys-heap                205062144    205062144      +0.00%
sys-other                12628288     12628288      +0.00%
sys-stack                11534336     11927552      +3.41%
sys-total               243159104    243487040      +0.13%
time                      2809477      2740795      -2.44%

R=golang-codereviews, rsc
CC=cshapiro, golang-codereviews, khr
https://golang.org/cl/46860043
2014-01-21 13:06:57 +04:00
Dmitriy Vyukov
1ba04c171a runtime: per-P defer pool
Instead of a per-goroutine stack of defers for all sizes,
introduce per-P defer pool for argument sizes 8, 24, 40, 56, 72 bytes.

For a program that starts 1e6 goroutines and then joins then:
old: rss=6.6g virtmem=10.2g time=4.85s
new: rss=4.5g virtmem= 8.2g time=3.48s

R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/42750044
2014-01-21 11:20:23 +04:00
Dmitriy Vyukov
90eca36a23 runtime: ensure fair scheduling during frequent GCs
What was happenning is as follows:
Each writer goroutine always triggers GC during its scheduling quntum.
After GC goroutines are shuffled so that the timer goroutine is always second in the queue.
This repeats infinitely, causing timer goroutine starvation.
Fixes #7126.

R=golang-codereviews, shanemhansen, khr, khr
CC=golang-codereviews
https://golang.org/cl/53080043
2014-01-21 10:24:42 +04:00
Aram Hăvărneanu
a46b434931 runtime: add support for GOOS=solaris
R=alex.brainman, dave, jsing, gobot, rsc
CC=golang-codereviews
https://golang.org/cl/35990043
2014-01-17 17:58:10 +13:00
Dmitriy Vyukov
c0b9e6218c runtime: output how long goroutines are blocked
Example of output:

goroutine 4 [sleep for 3 min]:
time.Sleep(0x34630b8a000)
        src/pkg/runtime/time.goc:31 +0x31
main.func·002()
        block.go:16 +0x2c
created by main.main
        block.go:17 +0x33

Full program and output are here:
http://play.golang.org/p/NEZdADI3Td

Fixes #6809.

R=golang-codereviews, khr, kamil.kisiel, bradfitz, rsc
CC=golang-codereviews
https://golang.org/cl/50420043
2014-01-16 12:54:46 +04:00
Dmitriy Vyukov
4722b1cbd3 runtime: use lock-free ring for work queues
Use lock-free fixed-size ring for work queues
instead of an unbounded mutex-protected array.
The ring has single producer and multiple consumers.
If the ring overflows, work is put onto global queue.

benchmark              old ns/op    new ns/op    delta
BenchmarkMatmult               7            5  -18.12%
BenchmarkMatmult-4             2            2  -18.98%
BenchmarkMatmult-16            1            0  -12.84%

BenchmarkCreateGoroutines                     105           88  -16.10%
BenchmarkCreateGoroutines-4                   376          219  -41.76%
BenchmarkCreateGoroutines-16                  241          174  -27.80%
BenchmarkCreateGoroutinesParallel             103           87  -14.66%
BenchmarkCreateGoroutinesParallel-4           169          143  -15.38%
BenchmarkCreateGoroutinesParallel-16          158          151   -4.43%

R=golang-codereviews, rsc
CC=ddetlefs, devon.odell, golang-codereviews
https://golang.org/cl/46170044
2014-01-16 12:17:00 +04:00
Dmitriy Vyukov
e0dcf73d61 runtime: fix comment
Void function can not return false.

R=golang-codereviews, bradfitz
CC=golang-codereviews
https://golang.org/cl/52000043
2014-01-14 12:58:13 +04:00
Keith Randall
020b39c3f3 runtime: use special records hung off the MSpan to
record finalizers and heap profile info.  Enables
removing the special bit from the heap bitmap.  Also
provides a generic mechanism for annotating occasional
heap objects.

finalizers
        overhead      per obj
old	680 B	      80 B avg
new	16 B/span     48 B

profile
        overhead      per obj
old	32KB	      24 B + hash tables
new	16 B/span     24 B

R=cshapiro, khr, dvyukov, gobot
CC=golang-codereviews
https://golang.org/cl/13314053
2014-01-07 13:45:50 -08:00
Russ Cox
bc135f6492 runtime: fix crash in runtime.GoroutineProfile
This is a possible Go 1.2.1 candidate.

Fixes #6946.

R=iant, r
CC=golang-dev
https://golang.org/cl/41640043
2013-12-13 15:44:57 -05:00
Dmitriy Vyukov
f6329700ae runtime: remove nomemprof
Nomemprof seems to be unneeded now, there is no recursion.
If the recursion will be re-introduced, it will break loudly by deadlocking.
Fixes #6566.

R=golang-dev, minux.ma, rsc
CC=golang-dev
https://golang.org/cl/14695044
2013-10-18 10:45:19 +04:00
Ian Lance Taylor
88c448ba40 runtime: correct test for when to poll network
Fixes #6610.

R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/14793043
2013-10-17 11:57:48 -07:00
Russ Cox
551ada4742 runtime: avoid allocation of internal panic values
If a fault happens in malloc, inevitably the next thing that happens
is a deadlock trying to allocate the panic value that says the fault
happened. Stop doing that, two ways.

First, reject panic in malloc just as we reject panic in garbage collection.

Second, runtime.panicstring was using an error implementation
backed by a Go string, so the interface held an allocated *string.
Since the actual errors are C strings, define a new error
implementation backed by a C char*, which needs no indirection
and therefore no allocation.

This second fix will avoid allocation for errors like nil panic derefs
or division by zero, so it is worth doing even though the first fix
should take care of faults during malloc.

Update #6419

R=golang-dev, dvyukov, dave
CC=golang-dev
https://golang.org/cl/13774043
2013-09-20 15:15:25 -04:00
Russ Cox
30ecb4cd05 build: disable precise collection of stack frames
The code for call site-specific pointer bitmaps was not ready in time,
but the zeroing required without it is too expensive to use by default.
We will have to wait for precise collection of stack frames until Go 1.3.

The precise collection can be re-enabled by

        GOEXPERIMENT=precisestack ./all.bash

but that will not be the default for a Go 1.2 build.

Fixes #6087.

R=golang-dev, jeremyjackins, dan.kortschak, r
CC=golang-dev
https://golang.org/cl/13677045
2013-09-16 20:26:10 -04:00
Russ Cox
6d68fc8eea runtime: fix CPU profiling on Windows
The test 'gp == m->curg' is not valid on Windows,
because the goroutine being profiled is not from the
current m.

TBR=golang-dev
CC=golang-dev
https://golang.org/cl/13718043
2013-09-15 12:05:24 -04:00
Russ Cox
439f9397fc runtime: avoid inconsistent goroutine state in profiler
Because profiling signals can arrive at any time, we must
handle the case where a profiling signal arrives halfway
through a goroutine switch. Luckily, although there is much
to think through, very little needs to change.

Fixes #6000.
Fixes #6015.

R=golang-dev, dvyukov
CC=golang-dev
https://golang.org/cl/13421048
2013-09-13 14:19:23 -04:00
Russ Cox
7276c02b41 runtime, cmd/gc, cmd/ld: ignore method wrappers in recover
Bug #1:

Issue 5406 identified an interesting case:
        defer iface.M()
may end up calling a wrapper that copies an indirect receiver
from the iface value and then calls the real M method. That's
two calls down, not just one, and so recover() == nil always
in the real M method, even during a panic.

[For the purposes of this entire discussion, a wrapper's
implementation is a function containing an ordinary call, not
the optimized tail call form that is somtimes possible. The
tail call does not create a second frame, so it is already
handled correctly.]

Fix this bug by introducing g->panicwrap, which counts the
number of bytes on current stack segment that are due to
wrapper calls that should not count against the recover
check. All wrapper functions must now adjust g->panicwrap up
on entry and back down on exit. This adds slightly to their
expense; on the x86 it is a single instruction at entry and
exit; on the ARM it is three. However, the alternative is to
make a call to recover depend on being able to walk the stack,
which I very much want to avoid. We have enough problems
walking the stack for garbage collection and profiling.
Also, if performance is critical in a specific case, it is already
faster to use a pointer receiver and avoid this kind of wrapper
entirely.

Bug #2:

The old code, which did not consider the possibility of two
calls, already contained a check to see if the call had split
its stack and so the panic-created segment was one behind the
current segment. In the wrapper case, both of the two calls
might split their stacks, so the panic-created segment can be
two behind the current segment.

Fix this by propagating the Stktop.panic flag forward during
stack splits instead of looking backward during recover.

Fixes #5406.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/13367052
2013-09-12 14:00:16 -04:00
Keith Randall
32b770b2c0 runtime: jump to badmcall instead of calling it.
This replaces the mcall frame with the badmcall frame instead of
leaving the mcall frame on the stack and adding the badmcall frame.
Because mcall is no longer on the stack, traceback will now report what
called mcall, which is what we would like to see in this situation.

R=golang-dev, cshapiro
CC=golang-dev
https://golang.org/cl/13012044
2013-08-29 15:53:34 -07:00
Russ Cox
665feeedcb runtime: impose thread count limit
Actually working to stay within the limit could cause subtle deadlocks.
Crashing avoids the subtlety.

Fixes #4056.

R=golang-dev, r, dvyukov
CC=golang-dev
https://golang.org/cl/13037043
2013-08-16 22:25:26 -04:00
Dmitriy Vyukov
187b9c695f runtime: fix goroutine stack accounting
Fixes #6166.
Fixes #6168.

R=golang-dev, bradfitz, remyoudompheng
CC=golang-dev
https://golang.org/cl/12927045
2013-08-16 21:04:05 +04:00
Russ Cox
757e0de89f runtime: impose stack size limit
The goal is to stop only those programs that would keep
going and run the machine out of memory, but before they do that.
1 GB on 64-bit, 250 MB on 32-bit.
That seems implausibly large, and it can be adjusted.

Fixes #2556.
Fixes #4494.
Fixes #5173.

R=khr, r, dvyukov
CC=golang-dev
https://golang.org/cl/12541052
2013-08-15 22:34:06 -04:00
Dmitriy Vyukov
f195ae94ca runtime: remove old preemption checks
runtime.gcwaiting checks are not needed anymore

R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/12934043
2013-08-15 14:32:10 +04:00
Russ Cox
1a09d70e23 runtime: fix build on arm
Do not use ? :
I cannot say this enough.

TBR=dvyukov
CC=golang-dev
https://golang.org/cl/12903043
2013-08-13 19:37:54 -04:00
Dmitriy Vyukov
4f2e382c9f runtime: dump scheduler state if GODEBUG=schedtrace is set
The schedtrace value sets dump period in milliseconds.
In default mode the trace looks as follows:
SCHED 0ms: gomaxprocs=4 idleprocs=0 threads=3 idlethreads=0 runqueue=0 [1 0 0 0]
SCHED 1001ms: gomaxprocs=4 idleprocs=3 threads=6 idlethreads=3 runqueue=0 [0 0 0 0]
SCHED 2008ms: gomaxprocs=4 idleprocs=1 threads=6 idlethreads=1 runqueue=0 [0 1 0 0]
If GODEBUG=scheddetail=1 is set as well, then the detailed trace is printed:
SCHED 0ms: gomaxprocs=4 idleprocs=0 threads=3 idlethreads=0 runqueue=0 singleproc=0 gcwaiting=1 mlocked=0 nmspinning=0 stopwait=0 sysmonwait=0
  P0: status=3 tick=1 m=0 runqsize=1/128 gfreecnt=0
  P1: status=3 tick=0 m=-1 runqsize=0/128 gfreecnt=0
  P2: status=3 tick=0 m=-1 runqsize=0/128 gfreecnt=0
  P3: status=3 tick=0 m=-1 runqsize=0/128 gfreecnt=0
  M2: p=-1 curg=-1 mallocing=0 throwing=0 gcing=0 locks=1 dying=0 helpgc=0 spinning=0 lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 gcing=0 locks=1 dying=0 helpgc=0 spinning=0 lockedg=-1
  M0: p=0 curg=1 mallocing=0 throwing=0 gcing=0 locks=1 dying=0 helpgc=0 spinning=0 lockedg=1
  G1: status=2() m=0 lockedm=0
  G2: status=1() m=-1 lockedm=-1

R=golang-dev, raggi, rsc
CC=golang-dev
https://golang.org/cl/11435044
2013-08-14 00:30:55 +04:00
Dmitriy Vyukov
4961483e7d runtime: fix LockOSThread
Fixes #6100.

R=golang-dev, dave, bradfitz, rsc
CC=golang-dev
https://golang.org/cl/12703045
2013-08-13 22:37:04 +04:00
Dmitriy Vyukov
f9066fe1c0 runtime: more reliable preemption
Currently it's possible that a goroutine
that periodically executes non-blocking
cgo/syscalls is never preempted.
This change splits scheduler and syscall
ticks to prevent such situation.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12658045
2013-08-13 22:14:04 +04:00
Dmitriy Vyukov
cc4e6aad8e runtime: do no lose CPU profiling signals
Currently we lose lots of profiling signals.
Most notably, GC is not accounted at all.
But stack splits, scheduler, syscalls, etc are lost as well.
This creates seriously misleading profile.
With this change all profiling signals are accounted.
Now I see these additional entries that were previously absent:
161  29.7%  29.7%      164  30.3% syscall.Syscall
 12   2.2%  50.9%       12   2.2% scanblock
 11   2.0%  55.0%       11   2.0% markonly
 10   1.8%  58.9%       10   1.8% sweepspan
  2   0.4%  85.8%        2   0.4% runtime.newstack
It is still impossible to understand what causes stack splits,
but at least it's clear how many time is spent on them.
Update #2197.
Update #5659.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12179043
2013-08-13 22:12:02 +04:00