Until now, the goroutine state has been scattered during the
execution of newstack and oldstack. It's all there, and those routines
know how to get back to a working goroutine, but other pieces of
the system, like stack traces, do not. If something does interrupt
the newstack or oldstack execution, the rest of the system can't
understand the goroutine. For example, if newstack decides there
is an overflow and calls throw, the stack tracer wouldn't dump the
goroutine correctly.
For newstack to save a useful state snapshot, it needs to be able
to rewind the PC in the function that triggered the split back to
the beginning of the function. (The PC is a few instructions in, just
after the call to morestack.) To make that possible, we change the
prologues to insert a jmp back to the beginning of the function
after the call to morestack. That is, the prologue used to be roughly:
TEXT myfunc
check for split
jmpcond nosplit
call morestack
nosplit:
sub $xxx, sp
Now an extra instruction is inserted after the call:
TEXT myfunc
start:
check for split
jmpcond nosplit
call morestack
jmp start
nosplit:
sub $xxx, sp
The jmp is not executed directly. It is decoded and simulated by
runtime.rewindmorestack to discover the beginning of the function,
and then the call to morestack returns directly to the start label
instead of to the jump instruction. So logically the jmp is still
executed, just not by the cpu.
The prologue thus repeats in the case of a function that needs a
stack split, but against the cost of the split itself, the extra few
instructions are noise. The repeated prologue has the nice effect of
making a stack split double-check that the new stack is big enough:
if morestack happens to return on a too-small stack, we'll now notice
before corruption happens.
The ability for newstack to rewind to the beginning of the function
should help preemption too. If newstack decides that it was called
for preemption instead of a stack split, it now has the goroutine state
correctly paused if rescheduling is needed, and when the goroutine
can run again, it can return to the start label on its original stack
and re-execute the split check.
Here is an example of a split stack overflow showing the full
trace, without any special cases in the stack printer.
(This one was triggered by making the split check incorrect.)
runtime: newstack framesize=0x0 argsize=0x18 sp=0x6aebd0 stack=[0x6b0000, 0x6b0fa0]
morebuf={pc:0x69f5b sp:0x6aebd8 lr:0x0}
sched={pc:0x68880 sp:0x6aebd0 lr:0x0 ctxt:0x34e700}
runtime: split stack overflow: 0x6aebd0 < 0x6b0000
fatal error: runtime: split stack overflow
goroutine 1 [stack split]:
runtime.mallocgc(0x290, 0x100000000, 0x1)
/Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:21 fp=0x6aebd8
runtime.new()
/Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:682 +0x5b fp=0x6aec08
go/build.(*Context).Import(0x5ae340, 0xc210030c71, 0xa, 0xc2100b4380, 0x1b, ...)
/Users/rsc/g/go/src/pkg/go/build/build.go:424 +0x3a fp=0x6b00a0
main.loadImport(0xc210030c71, 0xa, 0xc2100b4380, 0x1b, 0xc2100b42c0, ...)
/Users/rsc/g/go/src/cmd/go/pkg.go:249 +0x371 fp=0x6b01a8
main.(*Package).load(0xc21017c800, 0xc2100b42c0, 0xc2101828c0, 0x0, 0x0, ...)
/Users/rsc/g/go/src/cmd/go/pkg.go:431 +0x2801 fp=0x6b0c98
main.loadPackage(0x369040, 0x7, 0xc2100b42c0, 0x0)
/Users/rsc/g/go/src/cmd/go/pkg.go:709 +0x857 fp=0x6b0f80
----- stack segment boundary -----
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc2100e6c00, 0xc2100e5750, ...)
/Users/rsc/g/go/src/cmd/go/build.go:539 +0x437 fp=0x6b14a0
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc21015b400, 0x2, ...)
/Users/rsc/g/go/src/cmd/go/build.go:528 +0x1d2 fp=0x6b1658
main.(*builder).test(0xc2100902a0, 0xc210092000, 0x0, 0x0, 0xc21008ff60, ...)
/Users/rsc/g/go/src/cmd/go/test.go:622 +0x1b53 fp=0x6b1f68
----- stack segment boundary -----
main.runTest(0x5a6b20, 0xc21000a020, 0x2, 0x2)
/Users/rsc/g/go/src/cmd/go/test.go:366 +0xd09 fp=0x6a5cf0
main.main()
/Users/rsc/g/go/src/cmd/go/main.go:161 +0x4f9 fp=0x6a5f78
runtime.main()
/Users/rsc/g/go/src/pkg/runtime/proc.c:183 +0x92 fp=0x6a5fa0
runtime.goexit()
/Users/rsc/g/go/src/pkg/runtime/proc.c:1266 fp=0x6a5fa8
And here is a seg fault during oldstack:
SIGSEGV: segmentation violation
PC=0x1b2a6
runtime.oldstack()
/Users/rsc/g/go/src/pkg/runtime/stack.c:159 +0x76
runtime.lessstack()
/Users/rsc/g/go/src/pkg/runtime/asm_amd64.s:270 +0x22
goroutine 1 [stack unsplit]:
fmt.(*pp).printArg(0x2102e64e0, 0xe5c80, 0x2102c9220, 0x73, 0x0, ...)
/Users/rsc/g/go/src/pkg/fmt/print.go:818 +0x3d3 fp=0x221031e6f8
fmt.(*pp).doPrintf(0x2102e64e0, 0x12fb20, 0x2, 0x221031eb98, 0x1, ...)
/Users/rsc/g/go/src/pkg/fmt/print.go:1183 +0x15cb fp=0x221031eaf0
fmt.Sprintf(0x12fb20, 0x2, 0x221031eb98, 0x1, 0x1, ...)
/Users/rsc/g/go/src/pkg/fmt/print.go:234 +0x67 fp=0x221031eb40
flag.(*stringValue).String(0x2102c9210, 0x1, 0x0)
/Users/rsc/g/go/src/pkg/flag/flag.go:180 +0xb3 fp=0x221031ebb0
flag.(*FlagSet).Var(0x2102f6000, 0x293d38, 0x2102c9210, 0x143490, 0xa, ...)
/Users/rsc/g/go/src/pkg/flag/flag.go:633 +0x40 fp=0x221031eca0
flag.(*FlagSet).StringVar(0x2102f6000, 0x2102c9210, 0x143490, 0xa, 0x12fa60, ...)
/Users/rsc/g/go/src/pkg/flag/flag.go:550 +0x91 fp=0x221031ece8
flag.(*FlagSet).String(0x2102f6000, 0x143490, 0xa, 0x12fa60, 0x0, ...)
/Users/rsc/g/go/src/pkg/flag/flag.go:563 +0x87 fp=0x221031ed38
flag.String(0x143490, 0xa, 0x12fa60, 0x0, 0x161950, ...)
/Users/rsc/g/go/src/pkg/flag/flag.go:570 +0x6b fp=0x221031ed80
testing.init()
/Users/rsc/g/go/src/pkg/testing/testing.go:-531 +0xbb fp=0x221031edc0
strings_test.init()
/Users/rsc/g/go/src/pkg/strings/strings_test.go:1115 +0x62 fp=0x221031ef70
main.init()
strings/_test/_testmain.go:90 +0x3d fp=0x221031ef78
runtime.main()
/Users/rsc/g/go/src/pkg/runtime/proc.c:180 +0x8a fp=0x221031efa0
runtime.goexit()
/Users/rsc/g/go/src/pkg/runtime/proc.c:1269 fp=0x221031efa8
goroutine 2 [runnable]:
runtime.MHeap_Scavenger()
/Users/rsc/g/go/src/pkg/runtime/mheap.c:438
runtime.goexit()
/Users/rsc/g/go/src/pkg/runtime/proc.c:1269
created by runtime.main
/Users/rsc/g/go/src/pkg/runtime/proc.c:166
rax 0x23ccc0
rbx 0x23ccc0
rcx 0x0
rdx 0x38
rdi 0x2102c0170
rsi 0x221032cfe0
rbp 0x221032cfa0
rsp 0x7fff5fbff5b0
r8 0x2102c0120
r9 0x221032cfa0
r10 0x221032c000
r11 0x104ce8
r12 0xe5c80
r13 0x1be82baac718
r14 0x13091135f7d69200
r15 0x0
rip 0x1b2a6
rflags 0x10246
cs 0x2b
fs 0x0
gs 0x0
Fixes#5723.
R=r, dvyukov, go.peter.90, dave, iant
CC=golang-dev
https://golang.org/cl/10360048
It was used to request large stack segment for GC
when it was running not on g0.
Now GC is running on g0 with large stack,
and it is not needed anymore.
R=golang-dev, dave
CC=golang-dev
https://golang.org/cl/10242045
Add gostartcall and gostartcallfn.
The old gogocall = gostartcall + gogo.
The old gogocallfn = gostartcallfn + gogo.
R=dvyukov, minux.ma
CC=golang-dev
https://golang.org/cl/10036044
The garbage collection routine addframeroots is duplicating
logic in the traceback routine that calls it, sometimes correctly,
sometimes incorrectly, sometimes incompletely.
Pass necessary information to addframeroots instead of
deriving it anew.
Should make addframeroots significantly more robust.
It's certainly smaller.
Also try to standardize on uintptr for saved pc, sp values.
Will make CL 10036044 trivial.
R=golang-dev, dave, dvyukov
CC=golang-dev
https://golang.org/cl/10169045
This is part of preemptive scheduler.
stackguard0 is checked in split stack checks and can be set to StackPreempt.
stackguard is not set to StackPreempt (holds the original value).
R=golang-dev, daniel.morsing, iant
CC=golang-dev
https://golang.org/cl/9875043
This is needed for preemptive scheduler, because during
stoptheworld we want to wait with timeout and re-preempt
M's on timeout.
R=golang-dev, remyoudompheng, iant
CC=golang-dev
https://golang.org/cl/9375043
With this change the compiler emits a bitmap for each function
covering its stack frame arguments area. If an argument word
is known to contain a pointer, a bit is set. The garbage
collector reads this information when scanning the stack by
frames and uses it to ignores locations known to not contain a
pointer.
R=golang-dev, bradfitz, daniel.morsing, dvyukov, khr, khr, iant, cshapiro
CC=golang-dev
https://golang.org/cl/9223046
When cgo is used, runtime creates an additional M to handle callbacks on threads not created by Go.
This effectively disabled deadlock detection, which is a right thing, because Go program can be blocked
and only serve callbacks on external threads.
This also disables deadlock detection under race detector, because it happens to use cgo.
With this change the additional M is created lazily on first cgo call. So deadlock detector
works for programs that import "C", "net" or "net/http/pprof" but do not use them in fact.
Also fixes deadlock detector under race detector.
It should be fine to create the M later, because C code can not call into Go before first cgo call,
because C code does not know when Go initialization has completed. So a Go program need to call into C
first either to create an external thread, or notify a thread created in global ctor that Go
initialization has completed.
Fixes#4973.
Fixes#5475.
R=golang-dev, minux.ma, iant
CC=golang-dev
https://golang.org/cl/9303046
This makes it an unsafe.Pointer in Go so the garbage collector
will treat it as a pointer to untyped data, not a pointer to
bytes.
R=golang-dev, dvyukov
CC=golang-dev
https://golang.org/cl/8286045
Fixes performance of the current windows network poller
with the new scheduler.
Gives runtime a hint when GetQueuedCompletionStatus() will block.
Fixes#5068.
benchmark old ns/op new ns/op delta
BenchmarkTCP4Persistent 4004000 33906 -99.15%
BenchmarkTCP4Persistent-2 21790 17513 -19.63%
BenchmarkTCP4Persistent-4 44760 34270 -23.44%
BenchmarkTCP4Persistent-6 45280 43000 -5.04%
R=golang-dev, alex.brainman, coocood, rsc
CC=golang-dev
https://golang.org/cl/7612045
Fixes#5061.
Current code relies on the fact that fd's are automatically removed from epoll set when closed. However, it is not true. Underlying file description is removed from epoll set only when *all* fd's referring to it are closed.
There are 2 bad consequences:
1. Kernel delivers notifications on already closed fd's.
2. The following sequence of events leads to error:
- add fd1 to epoll
- dup fd1 = fd2
- close fd1 (not removed from epoll since we've dup'ed the fd)
- dup fd2 = fd1 (get the same fd as fd1)
- add fd1 to epoll = EEXIST
So, if fd can be potentially dup'ed of fork'ed, it's necessary to explicitly remove the fd from epoll set.
R=golang-dev, bradfitz, dave
CC=golang-dev
https://golang.org/cl/7870043
Hashtable is arranged as an array of
8-entry buckets with chained overflow.
Each bucket has 8 extra hash bits
per key to provide quick lookup within
a bucket. Table is grown incrementally.
Update #3885
Go time drops from 0.51s to 0.34s.
R=r, rsc, m3b, dave, bradfitz, khr, ugorji, remyoudompheng
CC=golang-dev
https://golang.org/cl/7504044
This provides a way to generate core dumps when people need them.
The settings are:
GOTRACEBACK=0 no traceback on panic, just exit
GOTRACEBACK=1 default - traceback on panic, then exit
GOTRACEBACK=2 traceback including runtime frames on panic, then exit
GOTRACEBACK=crash traceback including runtime frames on panic, then crash
Fixes#3257.
R=golang-dev, devon.odell, r, daniel.morsing, ality
CC=golang-dev
https://golang.org/cl/7666044
Uses AES hardware instructions on 386/amd64 to implement
a fast hash function. Incorporates a random key to
thwart hash collision DOS attacks.
Depends on CL#7548043 for new assembly instructions.
Update #3885
Helps some by making hashing faster. Go time drops from
0.65s to 0.51s.
R=rsc, r, bradfitz, remyoudompheng, khr, dsymonds, minux.ma, elias.naur
CC=golang-dev
https://golang.org/cl/7543043
The issue was that scvg is assigned *after* the scavenger goroutine is started,
so when the scavenger calls entersyscall() the g==scvg check can fail.
Fixes#5025.
R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/7629045
Change 231af8ac63aa (CL 7314062) made runtime.enteryscall()
set m->mcache = nil, which means that we can no longer use
syscall.errstr in syscall.Syscall and syscall.Syscall6, since it
requires a new buffer to be allocated for holding the error string.
Instead, we use pre-allocated per-M storage to hold error strings
from syscalls made while in entersyscall mode, and call
runtime.findnull to calculate the lengths.
Fixes#4994.
R=rsc, rminnich, ality, dvyukov, rminnich, r
CC=golang-dev
https://golang.org/cl/7567043
broke arm garbage collector
traceback_arm fails with a missing pc. It needs CL 7494043.
But that only makes the build break later, this time with
"invalid freelist". Roll back until it can be fixed correctly.
««« original CL description
runtime: restrict stack root scan to locals and arguments
R=rsc
CC=golang-dev
https://golang.org/cl/7301062
»»»
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/7493044
Putting the M initialization in multiple places will not scale.
Various code assumes mstart is the start already. Make it so.
R=golang-dev, devon.odell
CC=golang-dev
https://golang.org/cl/7420048
add per-P cache of dead G's
add global runnable queue (not used for now)
add list of idle P's (not used for now)
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/7397061
runtime: add context argument to gogocall
Too many other things use AX, and at least one
(stack zeroing) cannot be moved onto a different
register. Use the less special DX instead.
Preparation for step 2 of http://golang.org/s/go11func.
Nothing interesting here, just split out so that we can
see it's correct before moving on.
R=ken2
CC=golang-dev
https://golang.org/cl/7395050
Previously, the func structure contained an inaccurate value for
the args member and a 0 value for the locals member.
This change populates the func structure with args and locals
values computed by the compiler. The number of args was
already available in the ATEXT instruction. The number of
locals is now passed through in the new ALOCALS instruction.
This change also switches the unit of args and locals to be
bytes, just like the frame member, instead of 32-bit words.
R=golang-dev, bradfitz, cshapiro, dave, rsc
CC=golang-dev
https://golang.org/cl/7399045
mpreinit() is called on the parent thread and with mcache (can allocate memory),
minit() is called on the child thread and can not allocate memory.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/7389043