1
0
mirror of https://github.com/golang/go synced 2024-10-04 16:21:22 -06:00
Commit Graph

324 Commits

Author SHA1 Message Date
Russ Cox
5a23a7e52c runtime: enable 'bad pointer' check during garbage collection of Go stack frames
This is the same check we use during stack copying.
The check cannot be applied to C stack frames, even
though we do emit pointer bitmaps for the arguments,
because (1) the pointer bitmaps assume all arguments
are always live, not true of outputs during the prologue,
and (2) the pointer bitmaps encode interface values as
pointer pairs, not true of interfaces holding integers.

For the rest of the frames, however, we should hold ourselves
to the rule that a pointer marked live really is initialized.
The interface scanning already implicitly checks this
because it interprets the type word  as a valid type pointer.

This may slow things down a little because of the extra loads.
Or it may speed things up because we don't bother enqueuing
nil pointers anymore. Enough of the rest of the system is slow
right now that we can't measure it meaningfully.
Enable for now, even if it is slow, to shake out bugs in the
liveness bitmaps, and then decide whether to turn it off
for the Go 1.3 release (issue 7650 reminds us to do this).

The new m->traceback field lets us force printing of fp=
values on all goroutine stack traces when we detect a
bad pointer. This makes it easier to understand exactly
where in the frame the bad pointer is, so that we can trace
it back to a specific variable and determine what is wrong.

Update #7650

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/80860044
2014-03-27 14:06:15 -04:00
Keith Randall
fff63c2448 runtime: WriteHeapDump dumps the heap to a file.
See http://golang.org/s/go13heapdump for the file format.

LGTM=rsc
R=rsc, bradfitz, dvyukov, khr
CC=golang-codereviews
https://golang.org/cl/37540043
2014-03-25 15:09:49 -07:00
Keith Randall
1b45cc45e3 runtime: redo stack map entries to avoid false retention
Change two-bit stack map entries to encode:
0 = dead
1 = scalar
2 = pointer
3 = multiword

If multiword, the two-bit entry for the following word encodes:
0 = string
1 = slice
2 = iface
3 = eface

That way, during stack scanning we can check if a string
is zero length or a slice has zero capacity.  We can avoid
following the contained pointer in those cases.  It is safe
to do so because it can never be dereferenced, and it is
desirable to do so because it may cause false retention
of the following block in memory.

Slice feature turned off until issue 7564 is fixed.

Update #7549

LGTM=rsc
R=golang-codereviews, bradfitz, rsc
CC=golang-codereviews
https://golang.org/cl/76380043
2014-03-25 14:11:34 -07:00
Russ Cox
3750904a7e runtime: use VEH, not SEH, for windows/386 exception handling
Structured Exception Handling (SEH) was the first way to handle
exceptions (memory faults, divides by zero) on Windows.
The S might as well stand for "stack-based": the implementation
interprets stack addresses in a few different ways, and it gets
subtly confused by Go's management of stacks. It's also something
that requires active maintenance during cgo switches, and we've
had bugs in that maintenance in the past.

We have recently come to believe that SEH cannot work with
Go's stack usage. See http://golang.org/issue/7325 for details.

Vectored Exception Handling (VEH) is more like a Unix signal
handler: you set it once for the whole process and forget about it.

This CL drops all the SEH code and replaces it with VEH code.
Many special cases and 7 #ifdefs disappear.

VEH was introduced in Windows XP, so Go on windows/386 will
now require Windows XP or later. The previous requirement was
Windows 2000 or later. Windows 2000 immediately preceded
Windows XP, so Windows 2000 is the only affected version.
Microsoft stopped supporting Windows 2000 in 2010.
See http://golang.org/s/win2000-golang-nuts for details.

Fixes #7325.

LGTM=alex.brainman, r
R=golang-codereviews, alex.brainman, stephen.gutekanst, dave
CC=golang-codereviews, iant, r
https://golang.org/cl/74790043
2014-03-24 21:22:16 -04:00
Aram Hăvărneanu
199e703083 runtime: fix use after close race in Solaris network poller
The Solaris network poller uses event ports, which are
level-triggered. As such, it has to re-arm itself after each
wakeup. The arming mechanism (which runs in its own thread) raced
with the closing of a file descriptor happening in a different
thread. When a network file descriptor is about to be closed,
the network poller is awaken to give it a chance to remove its
association with the file descriptor. Because the poller always
re-armed itself, it raced with code that closed the descriptor.

This change makes the network poller check before re-arming if
the file descriptor is about to be closed, in which case it will
ignore the re-arming request. It uses the per-PollDesc lock in
order to serialize access to the PollDesc.

This change also adds extensive documentation describing the
Solaris implementation of the network poller.

Fixes #7410.

LGTM=dvyukov, iant
R=golang-codereviews, bradfitz, iant, dvyukov, aram.h, gobot
CC=golang-codereviews
https://golang.org/cl/69190044
2014-03-14 17:53:05 +04:00
Anthony Martin
41aa887be5 runtime: fix signal handling on Plan 9
LGTM=rsc
R=rsc, 0intro, aram, jeremyjackins, iant
CC=golang-codereviews, lucio.dere, minux.ma, paurea, r
https://golang.org/cl/9796043
2014-03-13 09:00:12 -07:00
Dmitriy Vyukov
e678ab4e37 runtime: detect stack split after fork
This check would allowed to easily prevent issue 7511.
Update #7511

LGTM=rsc
R=rsc, aram
CC=golang-codereviews
https://golang.org/cl/75260043
2014-03-13 17:41:08 +04:00
Dmitriy Vyukov
1569628725 runtime: harden conditions when runtime panics on crash
This is especially important for SetPanicOnCrash,
but also useful for e.g. nil deref in mallocgc.
Panics on such crashes can't lead to anything useful,
only to deadlocks, hangs and obscure crashes.
This is a copy of broken but already LGTMed
https://golang.org/cl/68540043/

TBR=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/75320043
2014-03-13 13:25:59 +04:00
Dave Cheney
6431be3fe4 runtime: more Native Client fixes
Thanks to Ian for spotting these.

runtime.h: define uintreg correctly.
stack.c: address warning caused by the type of uintreg being 32 bits on amd64p32.

Commentary (mainly for my own use)

nacl/amd64p32 defines a machine with 64bit registers, but address space is limited to a 4gb window (the window is placed randomly inside the full 48 bit virtual address space of a process). To cope with this 6c defines _64BIT and _64BITREG.

_64BITREG is always defined by 6c, so both GOARCH=amd64 and GOARCH=amd64p32 use 64bit wide registers.

However _64BIT itself is only defined when 6c is compiling for amd64 targets. The definition is elided for amd64p32 environments causing int, uint and other arch specific types to revert to their 32bit definitions.

LGTM=iant
R=iant, rsc, remyoudompheng
CC=golang-codereviews
https://golang.org/cl/72860046
2014-03-11 14:43:10 +11:00
Shenghou Ma
84570aa9a1 runtime: round stack size to power of 2.
Fixes build on windows/386 and plan9/386.
Fixes #7487.

LGTM=mattn.jp, dvyukov, rsc
R=golang-codereviews, mattn.jp, dvyukov, 0intro, rsc
CC=golang-codereviews
https://golang.org/cl/72360043
2014-03-07 15:11:16 -05:00
Dmitriy Vyukov
1a89e6388c runtime: refactor and fix stack management code
There are at least 3 bugs:
1. g->stacksize accounting is broken during copystack/shrinkstack
2. stktop->free is not properly maintained during copystack/shrinkstack
3. stktop->free logic is broken:
        we can have stktop->free==FixedStack,
        and we will free it into stack cache,
        but it actually comes from heap as the result of non-copying segment shrink
This shows as at least spurious races on race builders (maybe something else as well I don't know).

The idea behind the refactoring is to consolidate stacksize and
segment origin logic in stackalloc/stackfree.

Fixes #7490.

LGTM=rsc, khr
R=golang-codereviews, rsc, khr
CC=golang-codereviews
https://golang.org/cl/72440043
2014-03-07 20:52:29 +04:00
Dmitriy Vyukov
f946a7ca09 runtime: fix memory corruption and leak in recursive panic handling
Recursive panics leave dangling Panic structs in g->panic stack.
At best it leads to a Defer leak and incorrect output on a subsequent panic.
At worst it arbitrary corrupts heap.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/72480043
2014-03-07 20:50:30 +04:00
Dmitriy Vyukov
a1695d2ea3 runtime: use custom thunks for race calls instead of cgo
Implement custom assembly thunks for hot race calls (memory accesses and function entry/exit).
The thunks extract caller pc, verify that the address is in heap or global and switch to g0 stack.

Before:
ok  	regexp	3.692s
ok  	compress/bzip2	9.461s
ok  	encoding/json	6.380s
After:
ok  	regexp	2.229s (-40%)
ok  	compress/bzip2	4.703s (-50%)
ok  	encoding/json	3.629s (-43%)

For comparison, normal non-race build:
ok  	regexp	0.348s
ok  	compress/bzip2	0.304s
ok  	encoding/json	0.661s
Race build:
ok  	regexp	2.229s (+540%)
ok  	compress/bzip2	4.703s (+1447%)
ok  	encoding/json	3.629s (+449%)

Also removes some race-related special cases from cgocall and scheduler.
In long-term it will allow to remove cyclic runtime/race dependency on cmd/cgo.

Fixes #4249.
Fixes #7460.
Update #6508
Update #6688

R=iant, rsc, bradfitz
CC=golang-codereviews
https://golang.org/cl/55100044
2014-03-06 23:48:30 +04:00
Russ Cox
1249d3a518 runtime: handle Go calls C calls Go panic correctly on windows/386
32-bit Windows uses "structured exception handling" (SEH) to
handle hardware faults: that there is a per-thread linked list
of fault handlers maintained in user space instead of
something like Unix's signal handlers. The structures in the
linked list are required to live on the OS stack, and the
usual discipline is that the function that pushes a record
(allocated from the current stack frame) onto the list pops
that record before returning. Not to pop the entry before
returning creates a dangling pointer error: the list head
points to a stack frame that no longer exists.

Go pushes an SEH record in the top frame of every OS thread,
and that record suffices for all Go execution on that thread,
at least until cgo gets involved.

If we call into C using cgo, that called C code may push its
own SEH records, but by the convention it must pop them before
returning back to the Go code. We assume it does, and that's
fine.

If the C code calls back into Go, we want the Go SEH handler
to become active again, not whatever C has set up. So
runtime.callbackasm1, which handles a call from C back into
Go, pushes a new SEH record before calling the Go code and
pops it when the Go code returns. That's also fine.

It can happen that when Go calls C calls Go like this, the
inner Go code panics. We allow a defer in the outer Go to
recover the panic, effectively wiping not only the inner Go
frames but also the C calls. This sequence was not popping the
SEH stack up to what it was before the cgo calls, so it was
creating the dangling pointer warned about above. When
eventually the m stack was used enough to overwrite the
dangling SEH records, the SEH chain was lost, and any future
panic would not end up in Go's handler.

The bug in TestCallbackPanic and friends was thus creating a
situation where TestSetPanicOnFault - which causes a hardware
fault - would not find the Go fault handler and instead crash
the binary.

Add checks to TestCallbackPanicLocked to diagnose the mistake
in that test instead of leaving a bad state for another test
case to stumble over.

Fix bug by restoring SEH chain during deferred "endcgo"
cleanup.

This bug is likely present in Go 1.2.1, but since it depends
on Go calling C calling Go, with the inner Go panicking and
the outer Go recovering the panic, it seems not important
enough to bother fixing before Go 1.3. Certainly no one has
complained.

Fixes #7470.

LGTM=alex.brainman
R=golang-codereviews, alex.brainman
CC=golang-codereviews, iant, khr
https://golang.org/cl/71440043
2014-03-05 11:10:40 -05:00
Keith Randall
1665b006a5 runtime: grow stack by copying
On stack overflow, if all frames on the stack are
copyable, we copy the frames to a new stack twice
as large as the old one.  During GC, if a G is using
less than 1/4 of its stack, copy the stack to a stack
half its size.

TODO
- Do something about C frames.  When a C frame is in the
  stack segment, it isn't copyable.  We allocate a new segment
  in this case.
  - For idempotent C code, we can abort it, copy the stack,
    then retry.  I'm working on a separate CL for this.
  - For other C code, we can raise the stackguard
    to the lowest Go frame so the next call that Go frame
    makes triggers a copy, which will then succeed.
- Pick a starting stack size?

The plan is that eventually we reach a point where the
stack contains only copyable frames.

LGTM=rsc
R=dvyukov, rsc
CC=golang-codereviews
https://golang.org/cl/54650044
2014-02-26 23:28:44 -08:00
Keith Randall
3b5278fca6 runtime: get rid of the settype buffer and lock.
MCaches	now hold a MSpan for each sizeclass which they have
exclusive access to allocate from, so no lock is needed.

Modifying the heap bitmaps also no longer requires a cas.

runtime.free gets more expensive.  But we don't use it
much any more.

It's not much faster on 1 processor, but it's a lot
faster on multiple processors.

benchmark                 old ns/op    new ns/op    delta
BenchmarkSetTypeNoPtr1           24           23   -0.42%
BenchmarkSetTypeNoPtr2           33           34   +0.89%
BenchmarkSetTypePtr1             51           49   -3.72%
BenchmarkSetTypePtr2             55           54   -1.98%

benchmark                old ns/op    new ns/op    delta
BenchmarkAllocation          52739        50770   -3.73%
BenchmarkAllocation-2        33957        34141   +0.54%
BenchmarkAllocation-3        33326        29015  -12.94%
BenchmarkAllocation-4        38105        25795  -32.31%
BenchmarkAllocation-5        68055        24409  -64.13%
BenchmarkAllocation-6        71544        23488  -67.17%
BenchmarkAllocation-7        68374        23041  -66.30%
BenchmarkAllocation-8        70117        20758  -70.40%

LGTM=rsc, dvyukov
R=dvyukov, bradfitz, khr, rsc
CC=golang-codereviews
https://golang.org/cl/46810043
2014-02-26 15:52:58 -08:00
Russ Cox
e8fe1cce66 runtime, net: fixes from CL 68490043 review
These are mistakes in the first big NaCl CL.

LGTM=minux.ma, iant
R=golang-codereviews, minux.ma, iant
CC=golang-codereviews
https://golang.org/cl/69200043
2014-02-26 12:21:31 -05:00
Dave Cheney
7c8280c9ef all: merge NaCl branch (part 1)
See golang.org/s/go13nacl for design overview.

This CL is the mostly mechanical changes from rsc's Go 1.2 based NaCl branch, specifically 39cb35750369 to 500771b477cf from https://code.google.com/r/rsc-go13nacl. This CL does not include working NaCl support, there are probably two or three more large merges to come.

CL 15750044 is not included as it involves more invasive changes to the linker which will need to be merged separately.

The exact change lists included are

15050047: syscall: support for Native Client
15360044: syscall: unzip implementation for Native Client
15370044: syscall: Native Client SRPC implementation
15400047: cmd/dist, cmd/go, go/build, test: support for Native Client
15410048: runtime: support for Native Client
15410049: syscall: file descriptor table for Native Client
15410050: syscall: in-memory file system for Native Client
15440048: all: update +build lines for Native Client port
15540045: cmd/6g, cmd/8g, cmd/gc: support for Native Client
15570045: os: support for Native Client
15680044: crypto/..., hash/crc32, reflect, sync/atomic: support for amd64p32
15690044: net: support for Native Client
15690048: runtime: support for fake time like on Go Playground
15690051: build: disable various tests on Native Client

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/68150047
2014-02-25 09:47:42 -05:00
Aram Hăvărneanu
50df136483 runtime, net: add support for GOOS=solaris
LGTM=dave, rsc
R=golang-codereviews, minux.ma, mikioh.mikioh, dave, iant, rsc
CC=golang-codereviews
https://golang.org/cl/36030043
2014-02-24 22:31:01 -05:00
Russ Cox
e56c6e7535 runtime/debug: add SetPanicOnFault
SetPanicOnFault allows recovery from unexpected memory faults.
This can be useful if you are using a memory-mapped file
or probing the address space of the current program.

LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/66590044
2014-02-20 16:18:05 -05:00
Russ Cox
67c83db60d runtime: use goc2c as much as possible
Package runtime's C functions written to be called from Go
started out written in C using carefully constructed argument
lists and the FLUSH macro to write a result back to memory.

For some functions, the appropriate parameter list ended up
being architecture-dependent due to differences in alignment,
so we added 'goc2c', which takes a .goc file containing Go func
declarations but C bodies, rewrites the Go func declaration to
equivalent C declarations for the target architecture, adds the
needed FLUSH statements, and writes out an equivalent C file.
That C file is compiled as part of package runtime.

Native Client's x86-64 support introduces the most complex
alignment rules yet, breaking many functions that could until
now be portably written in C. Using goc2c for those avoids the
breakage.

Separately, Keith's work on emitting stack information from
the C compiler would require the hand-written functions
to add #pragmas specifying how many arguments are result
parameters. Using goc2c for those avoids maintaining #pragmas.

For both reasons, use goc2c for as many Go-called C functions
as possible.

This CL is a replay of the bulk of CL 15400047 and CL 15790043,
both of which were reviewed as part of the NaCl port and are
checked in to the NaCl branch. This CL is part of bringing the
NaCl code into the main tree.

No new code here, just reformatting and occasional movement
into .h files.

LGTM=r
R=dave, alex.brainman, r
CC=golang-codereviews
https://golang.org/cl/65220044
2014-02-20 15:58:47 -05:00
Dmitriy Vyukov
5e72fae9b2 runtime: improve cpu profiles for GC/syscalls/cgo
Current "System->etext" is not very informative.
Add parent "GC" frame.
Replace un-unwindable syscall/cgo frames with Go stack that leads to the call.

LGTM=rsc
R=rsc, alex.brainman, ality
CC=golang-codereviews
https://golang.org/cl/61270043
2014-02-12 22:31:36 +04:00
Dmitriy Vyukov
2ea859a779 runtime: refactor level-triggered IO support
Remove GOOS_solaris ifdef from netpoll code,
instead introduce runtime edge/level triggered IO flag.
Replace armread/armwrite with a single arm(mode) function,
that's how all other interfaces look like and these functions
will need to do roughly the same thing anyway.

LGTM=rsc
R=golang-codereviews, dave, rsc
CC=golang-codereviews
https://golang.org/cl/55500044
2014-02-12 22:24:29 +04:00
Dmitriy Vyukov
e1ee04828d runtime: refactor chan code
1. Make internal chan functions static.
2. Move selgen local variable instead of a member of G struct.
3. Change "bool *pres/selected" parameter of chansend/chanrecv to "bool block",
   which is simpler, faster and less code.
-37 lines total.

LGTM=rsc
R=golang-codereviews, dave, gobot, rsc
CC=bradfitz, golang-codereviews, iant, khr
https://golang.org/cl/58610043
2014-02-12 22:21:38 +04:00
Dmitriy Vyukov
0229dc6dbe runtime: do not cpu profile idle threads on windows
Currently this leads to a significant skew towards 'etext' entry,
since all idle threads are profiled every tick.
Before:
Total: 66608 samples
   63188  94.9%  94.9%    63188  94.9% etext
     278   0.4%  95.3%      278   0.4% sweepspan
     216   0.3%  95.6%      448   0.7% runtime.mallocgc
     122   0.2%  95.8%      122   0.2% scanblock
     113   0.2%  96.0%      113   0.2% net/textproto.canonicalMIMEHeaderKey
After:
Total: 8008 samples
    3949  49.3%  49.3%     3949  49.3% etext
     231   2.9%  52.2%      231   2.9% scanblock
     211   2.6%  54.8%      211   2.6% runtime.cas64
     182   2.3%  57.1%      408   5.1% runtime.mallocgc
     178   2.2%  59.3%      178   2.2% runtime.atomicload64

LGTM=alex.brainman
R=golang-codereviews, alex.brainman
CC=golang-codereviews
https://golang.org/cl/61250043
2014-02-10 15:40:55 +04:00
Dmitriy Vyukov
86a3a54284 runtime: fix windows build
Currently windows crashes because early allocs in schedinit
try to allocate tiny memory blocks, but m->p is not yet setup.
I've considered calling procresize(1) earlier in schedinit,
but this refactoring is better and must fix the issue as well.
Fixes #7218.

R=golang-codereviews, r
CC=golang-codereviews
https://golang.org/cl/54570045
2014-01-28 00:26:56 +04:00
Dmitriy Vyukov
1fa7029425 runtime: combine small NoScan allocations
Combine NoScan allocations < 16 bytes into a single memory block.
Reduces number of allocations on json/garbage benchmarks by 10+%.

json-1
allocated                 8039872      7949194      -1.13%
allocs                     105774        93776     -11.34%
cputime                 156200000    100700000     -35.53%
gc-pause-one              4908873      3814853     -22.29%
gc-pause-total            2748969      2899288      +5.47%
rss                      52674560     43560960     -17.30%
sys-gc                    3796976      3256304     -14.24%
sys-heap                 43843584     35192832     -19.73%
sys-other                 5589312      5310784      -4.98%
sys-stack                  393216       393216      +0.00%
sys-total                53623088     44153136     -17.66%
time                    156193436    100886714     -35.41%
virtual-mem             256548864    256540672      -0.00%

garbage-1
allocated                 2996885      2932982      -2.13%
allocs                      62904        55200     -12.25%
cputime                  17470000     17400000      -0.40%
gc-pause-one            932757485    925806143      -0.75%
gc-pause-total            4663787      4629030      -0.75%
rss                    1151074304   1133670400      -1.51%
sys-gc                   66068352     65085312      -1.49%
sys-heap               1039728640   1024065536      -1.51%
sys-other                38038208     37485248      -1.45%
sys-stack                 8650752      8781824      +1.52%
sys-total              1152485952   1135417920      -1.48%
time                     17478088     17418005      -0.34%
virtual-mem            1343709184   1324204032      -1.45%

LGTM=iant, bradfitz
R=golang-codereviews, dave, iant, rsc, bradfitz
CC=golang-codereviews, khr
https://golang.org/cl/38750047
2014-01-24 22:35:11 +04:00
Keith Randall
be5d2d4432 runtime: Print elision message if we skipped frames on traceback.
Fixes bug 7180

R=golang-codereviews, dvyukov
CC=golang-codereviews, gri
https://golang.org/cl/55810044
2014-01-23 12:47:30 -08:00
Dmitriy Vyukov
9cbd2fb1aa runtime: remove locks from netpoll hotpaths
Introduces two-phase goroutine parking mechanism -- prepare to park, commit park.
This mechanism does not require backing mutex to protect wait predicate.
Use it in netpoll. See comment in netpoll.goc for details.
This slightly reduces contention between reader, writer and read/write io notifications;
and just eliminates a bunch of mutex operations from hotpaths, thus making then faster.

benchmark                             old ns/op    new ns/op    delta
BenchmarkTCP4ConcurrentReadWrite           2109         1945   -7.78%
BenchmarkTCP4ConcurrentReadWrite-2         1162         1113   -4.22%
BenchmarkTCP4ConcurrentReadWrite-4          798          755   -5.39%
BenchmarkTCP4ConcurrentReadWrite-8          803          748   -6.85%
BenchmarkTCP4Persistent                    9411         9240   -1.82%
BenchmarkTCP4Persistent-2                  5888         5813   -1.27%
BenchmarkTCP4Persistent-4                  4016         3968   -1.20%
BenchmarkTCP4Persistent-8                  3943         3857   -2.18%

R=golang-codereviews, mikioh.mikioh, gobot, iant, rsc
CC=golang-codereviews, khr
https://golang.org/cl/45700043
2014-01-22 11:27:16 +04:00
Dmitriy Vyukov
98b50b89a8 runtime: allocate goroutine ids in batches
Helps reduce contention on sched.goidgen.

benchmark                               old ns/op    new ns/op    delta
BenchmarkCreateGoroutines-16                  259          237   -8.49%
BenchmarkCreateGoroutinesParallel-16          127           43  -66.06%

R=golang-codereviews, dave, bradfitz, khr
CC=golang-codereviews, rsc
https://golang.org/cl/46970043
2014-01-22 10:34:36 +04:00
Dmitriy Vyukov
8a3c587dc1 runtime: fix and improve CPU profiling
- do not lose profiling signals when we have no mcache (possible for syscalls/cgo)
- do not lose any profiling signals on windows
- fix profiling of cgo programs on windows (they had no m->thread setup)
- properly setup tls in cgo programs on windows
- check _beginthread return value

Fixes #6417.
Fixes #6986.

R=alex.brainman, rsc
CC=golang-codereviews
https://golang.org/cl/44820047
2014-01-22 10:30:10 +04:00
Dmitriy Vyukov
cb133c6607 runtime: do not collect GC roots explicitly
Currently we collect (add) all roots into a global array in a single-threaded GC phase.
This hinders parallelism.
With this change we just kick off parallel for for number_of_goroutines+5 iterations.
Then parallel for callback decides whether it needs to scan stack of a goroutine
scan data segment, scan finalizers, etc. This eliminates the single-threaded phase entirely.
This requires to store all goroutines in an array instead of a linked list
(to allow direct indexing).
This CL also removes DebugScan functionality. It is broken because it uses
unbounded stack, so it can not run on g0. When it was working, I've found
it helpless for debugging issues because the two algorithms are too different now.
This change would require updating the DebugScan, so it's simpler to just delete it.

With 8 threads this change reduces GC pause by ~6%, while keeping cputime roughly the same.

garbage-8
allocated                 2987886      2989221      +0.04%
allocs                      62885        62887      +0.00%
cputime                  21286000     21272000      -0.07%
gc-pause-one             26633247     24885421      -6.56%
gc-pause-total             873570       811264      -7.13%
rss                     242089984    242515968      +0.18%
sys-gc                   13934336     13869056      -0.47%
sys-heap                205062144    205062144      +0.00%
sys-other                12628288     12628288      +0.00%
sys-stack                11534336     11927552      +3.41%
sys-total               243159104    243487040      +0.13%
time                      2809477      2740795      -2.44%

R=golang-codereviews, rsc
CC=cshapiro, golang-codereviews, khr
https://golang.org/cl/46860043
2014-01-21 13:06:57 +04:00
Dmitriy Vyukov
1ba04c171a runtime: per-P defer pool
Instead of a per-goroutine stack of defers for all sizes,
introduce per-P defer pool for argument sizes 8, 24, 40, 56, 72 bytes.

For a program that starts 1e6 goroutines and then joins then:
old: rss=6.6g virtmem=10.2g time=4.85s
new: rss=4.5g virtmem= 8.2g time=3.48s

R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/42750044
2014-01-21 11:20:23 +04:00
Aram Hăvărneanu
a46b434931 runtime: add support for GOOS=solaris
R=alex.brainman, dave, jsing, gobot, rsc
CC=golang-codereviews
https://golang.org/cl/35990043
2014-01-17 17:58:10 +13:00
Dmitriy Vyukov
c0b9e6218c runtime: output how long goroutines are blocked
Example of output:

goroutine 4 [sleep for 3 min]:
time.Sleep(0x34630b8a000)
        src/pkg/runtime/time.goc:31 +0x31
main.func·002()
        block.go:16 +0x2c
created by main.main
        block.go:17 +0x33

Full program and output are here:
http://play.golang.org/p/NEZdADI3Td

Fixes #6809.

R=golang-codereviews, khr, kamil.kisiel, bradfitz, rsc
CC=golang-codereviews
https://golang.org/cl/50420043
2014-01-16 12:54:46 +04:00
Dmitriy Vyukov
4722b1cbd3 runtime: use lock-free ring for work queues
Use lock-free fixed-size ring for work queues
instead of an unbounded mutex-protected array.
The ring has single producer and multiple consumers.
If the ring overflows, work is put onto global queue.

benchmark              old ns/op    new ns/op    delta
BenchmarkMatmult               7            5  -18.12%
BenchmarkMatmult-4             2            2  -18.98%
BenchmarkMatmult-16            1            0  -12.84%

BenchmarkCreateGoroutines                     105           88  -16.10%
BenchmarkCreateGoroutines-4                   376          219  -41.76%
BenchmarkCreateGoroutines-16                  241          174  -27.80%
BenchmarkCreateGoroutinesParallel             103           87  -14.66%
BenchmarkCreateGoroutinesParallel-4           169          143  -15.38%
BenchmarkCreateGoroutinesParallel-16          158          151   -4.43%

R=golang-codereviews, rsc
CC=ddetlefs, devon.odell, golang-codereviews
https://golang.org/cl/46170044
2014-01-16 12:17:00 +04:00
Keith Randall
020b39c3f3 runtime: use special records hung off the MSpan to
record finalizers and heap profile info.  Enables
removing the special bit from the heap bitmap.  Also
provides a generic mechanism for annotating occasional
heap objects.

finalizers
        overhead      per obj
old	680 B	      80 B avg
new	16 B/span     48 B

profile
        overhead      per obj
old	32KB	      24 B + hash tables
new	16 B/span     24 B

R=cshapiro, khr, dvyukov, gobot
CC=golang-codereviews
https://golang.org/cl/13314053
2014-01-07 13:45:50 -08:00
Alex Brainman
7f8a5057dd syscall: add NewCallbackCDecl again
Fixes #6338

R=golang-dev, kin.wilson.za, rsc
CC=golang-dev
https://golang.org/cl/36180044
2013-12-19 14:38:50 +11:00
Carl Shapiro
76c54c1193 runtime: add GODEBUG option for an electric fence like heap mode
When enabled this new debugging mode will allocate objects on
their own page and never recycle memory addresses.  This is an
essential tool to root cause a broad class of heap corruption.

R=golang-dev, dave, daniel.morsing, dvyukov, rsc, iant, cshapiro
CC=golang-dev
https://golang.org/cl/22060046
2013-12-06 14:40:45 -08:00
Carl Shapiro
48279bd567 runtime: add an allocation and free tracing for gc debugging
Output for an allocation and free (sweep) follows

MProf_Malloc(p=0xc2100210a0, size=0x50, type=0x0 <single object>)
        #0 0x46ee15 runtime.mallocgc /usr/local/google/home/cshapiro/go/src/pkg/runtime/malloc.goc:141
        #1 0x47004f runtime.settype_flush /usr/local/google/home/cshapiro/go/src/pkg/runtime/malloc.goc:612
        #2 0x45f92c gc /usr/local/google/home/cshapiro/go/src/pkg/runtime/mgc0.c:2071
        #3 0x45f89e mgc /usr/local/google/home/cshapiro/go/src/pkg/runtime/mgc0.c:2050
        #4 0x45258b runtime.mcall /usr/local/google/home/cshapiro/go/src/pkg/runtime/asm_amd64.s:179

MProf_Free(p=0xc2100210a0, size=0x50)
        #0 0x46ee15 runtime.mallocgc /usr/local/google/home/cshapiro/go/src/pkg/runtime/malloc.goc:141
        #1 0x47004f runtime.settype_flush /usr/local/google/home/cshapiro/go/src/pkg/runtime/malloc.goc:612
        #2 0x45f92c gc /usr/local/google/home/cshapiro/go/src/pkg/runtime/mgc0.c:2071
        #3 0x45f89e mgc /usr/local/google/home/cshapiro/go/src/pkg/runtime/mgc0.c:2050
        #4 0x45258b runtime.mcall /usr/local/google/home/cshapiro/go/src/pkg/runtime/asm_amd64.s:179

R=golang-dev, dvyukov, rsc, cshapiro
CC=golang-dev
https://golang.org/cl/21990045
2013-12-03 14:42:38 -08:00
Keith Randall
3278dc158e runtime: pass key/value to map accessors by reference, not by value.
This change is part of the plan to get rid of all vararg C calls
which are a pain for getting exact stack scanning.

We allocate a chunk of zero memory to return a pointer to when a
map access doesn't find the key.  This is simpler than returning nil
and fixing things up in the caller.  Linker magic allocates a single
zero memory area that is shared by all (non-reflect-generated) map
types.

Passing things by reference gets rid of some copies, so it speeds
up code with big keys/values.

benchmark             old ns/op    new ns/op    delta
BenchmarkBigKeyMap           34           31   -8.48%
BenchmarkBigValMap           37           30  -18.62%
BenchmarkSmallKeyMap         26           23  -11.28%

R=golang-dev, dvyukov, khr, rsc
CC=golang-dev
https://golang.org/cl/14794043
2013-12-02 13:05:04 -08:00
Russ Cox
b88148b9a0 undo CL 19810043 / 352f3b7c9664
The CL causes misc/cgo/test to fail randomly.
I suspect that the problem is the use of a division instruction
in usleep, which can be called while trying to acquire an m
and therefore cannot store the denominator in m.
The solution to that would be to rewrite the code to use a
magic multiply instead of a divide, but now we're getting
pretty far off the original code.

Go back to the original in preparation for a different,
less efficient but simpler fix.

««« original CL description
cmd/5l, runtime: make ARM integer division profiler-friendly

The implementation of division constructed non-standard
stack frames that could not be handled by the traceback
routines.

CL 13239052 left the frames non-standard but fixed them
for the specific case of a divide-by-zero panic.
A profiling signal can arrive at any time, so that fix
is not sufficient.

Change the division to store the extra argument in the M struct
instead of in a new stack slot. That keeps the frames bog standard
at all times.

Also fix a related bug in the traceback code: when starting
a traceback, the LR register should be ignored if the current
function has already allocated its stack frame and saved the
original LR on the stack. The stack copy should be used, as the
LR register may have been modified.

Combined, these make the torture test from issue 6681 pass.

Fixes #6681.

R=golang-dev, r, josharian
CC=golang-dev
https://golang.org/cl/19810043
»»»

TBR=r
CC=golang-dev
https://golang.org/cl/20350043
2013-10-31 17:18:57 +00:00
Russ Cox
b0db472ea2 cmd/5l, runtime: make ARM integer division profiler-friendly
The implementation of division constructed non-standard
stack frames that could not be handled by the traceback
routines.

CL 13239052 left the frames non-standard but fixed them
for the specific case of a divide-by-zero panic.
A profiling signal can arrive at any time, so that fix
is not sufficient.

Change the division to store the extra argument in the M struct
instead of in a new stack slot. That keeps the frames bog standard
at all times.

Also fix a related bug in the traceback code: when starting
a traceback, the LR register should be ignored if the current
function has already allocated its stack frame and saved the
original LR on the stack. The stack copy should be used, as the
LR register may have been modified.

Combined, these make the torture test from issue 6681 pass.

Fixes #6681.

R=golang-dev, r, josharian
CC=golang-dev
https://golang.org/cl/19810043
2013-10-30 18:50:34 +00:00
Dmitriy Vyukov
f6329700ae runtime: remove nomemprof
Nomemprof seems to be unneeded now, there is no recursion.
If the recursion will be re-introduced, it will break loudly by deadlocking.
Fixes #6566.

R=golang-dev, minux.ma, rsc
CC=golang-dev
https://golang.org/cl/14695044
2013-10-18 10:45:19 +04:00
Russ Cox
551ada4742 runtime: avoid allocation of internal panic values
If a fault happens in malloc, inevitably the next thing that happens
is a deadlock trying to allocate the panic value that says the fault
happened. Stop doing that, two ways.

First, reject panic in malloc just as we reject panic in garbage collection.

Second, runtime.panicstring was using an error implementation
backed by a Go string, so the interface held an allocated *string.
Since the actual errors are C strings, define a new error
implementation backed by a C char*, which needs no indirection
and therefore no allocation.

This second fix will avoid allocation for errors like nil panic derefs
or division by zero, so it is worth doing even though the first fix
should take care of faults during malloc.

Update #6419

R=golang-dev, dvyukov, dave
CC=golang-dev
https://golang.org/cl/13774043
2013-09-20 15:15:25 -04:00
Carl Shapiro
16d6b6c771 runtime: export PCDATA value reader
This interface is required to use the PCDATA interface
implemented in Go 1.2.  While initially entirely private, the
FUNCDATA side of the interface has been made public.  This
change completes the FUNCDATA/PCDATA interface.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/13735043
2013-09-16 19:03:19 -07:00
Russ Cox
30ecb4cd05 build: disable precise collection of stack frames
The code for call site-specific pointer bitmaps was not ready in time,
but the zeroing required without it is too expensive to use by default.
We will have to wait for precise collection of stack frames until Go 1.3.

The precise collection can be re-enabled by

        GOEXPERIMENT=precisestack ./all.bash

but that will not be the default for a Go 1.2 build.

Fixes #6087.

R=golang-dev, jeremyjackins, dan.kortschak, r
CC=golang-dev
https://golang.org/cl/13677045
2013-09-16 20:26:10 -04:00
Russ Cox
7276c02b41 runtime, cmd/gc, cmd/ld: ignore method wrappers in recover
Bug #1:

Issue 5406 identified an interesting case:
        defer iface.M()
may end up calling a wrapper that copies an indirect receiver
from the iface value and then calls the real M method. That's
two calls down, not just one, and so recover() == nil always
in the real M method, even during a panic.

[For the purposes of this entire discussion, a wrapper's
implementation is a function containing an ordinary call, not
the optimized tail call form that is somtimes possible. The
tail call does not create a second frame, so it is already
handled correctly.]

Fix this bug by introducing g->panicwrap, which counts the
number of bytes on current stack segment that are due to
wrapper calls that should not count against the recover
check. All wrapper functions must now adjust g->panicwrap up
on entry and back down on exit. This adds slightly to their
expense; on the x86 it is a single instruction at entry and
exit; on the ARM it is three. However, the alternative is to
make a call to recover depend on being able to walk the stack,
which I very much want to avoid. We have enough problems
walking the stack for garbage collection and profiling.
Also, if performance is critical in a specific case, it is already
faster to use a pointer receiver and avoid this kind of wrapper
entirely.

Bug #2:

The old code, which did not consider the possibility of two
calls, already contained a check to see if the call had split
its stack and so the panic-created segment was one behind the
current segment. In the wrapper case, both of the two calls
might split their stacks, so the panic-created segment can be
two behind the current segment.

Fix this by propagating the Stktop.panic flag forward during
stack splits instead of looking backward during recover.

Fixes #5406.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/13367052
2013-09-12 14:00:16 -04:00
Russ Cox
ce9ddd0eee runtime: keep args and frame in struct Func
args is useful for printing tracebacks.

frame is not necessary anymore, but we might some day
get back to functions where the frame size does not vary
by program counter, and if so we'll need it. Avoid needing
to introduce a new struct format later by keeping it now.

Fixes #5907.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/13632051
2013-09-11 14:18:52 -04:00
Keith Randall
4487054751 runtime: clean up / align comment tabbing
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/13336046
2013-09-10 09:02:22 -07:00