Along with CLs 139610043 and 141490043,
this removes all conservative scanning during
garbage collection, except _cgo_allocate,
which is SWIG-only.
LGTM=rlh, khr
R=golang-codereviews, dvyukov, rlh, khr
CC=golang-codereviews, iant
https://golang.org/cl/144860043
Now it's two allocations. I don't see much downside to that,
since the two pieces were in different cache lines anyway.
Rename 'conservative' to 'cgo_conservative_type' and make
clear that _cgo_allocate is the only allowed user.
This depends on CL 141490043, which removes the other
use of conservative (in defer).
LGTM=dvyukov, iant
R=khr, dvyukov, iant
CC=golang-codereviews, rlh
https://golang.org/cl/139610043
This makes the GC and the stack copying agree about how
to interpret the defer structures. Previously, only the stack
copying treated them precisely.
This removes an untyped memory allocation and fixes
at least three copystack bugs.
To make sure the GC can find the deferred argument
frame until it has been copied, keep a Defer on the defer list
during its execution.
In addition to making it possible to remove the untyped
memory allocation, keeping the Defer on the list fixes
two races between copystack and execution of defers
(in both gopanic and Goexit). The problem is that once
the defer has been taken off the list, a stack copy that
happens before the deferred arguments have been copied
back to the stack will not update the arguments correctly.
The new tests TestDeferPtrsPanic and TestDeferPtrsGoexit
(variations on the existing TestDeferPtrs) pass now but
failed before this CL.
In addition to those fixes, keeping the Defer on the list
helps correct a dangling pointer error during copystack.
The traceback routines walk the Defer chain to provide
information about where a panic may resume execution.
When the executing Defer was not on the Defer chain
but instead linked from the Panic chain, the traceback
had to walk the Panic chain too. But Panic structs are
on the stack and being updated by copystack.
Traceback's use of the Panic chain while copystack is
updating those structs means that it can follow an
updated pointer and find itself reading from the new stack.
The new stack is usually all zeros, so it sees an incorrect
early end to the chain. The new TestPanicUseStack makes
this happen at tip and dies when adjustdefers finds an
unexpected argp. The new StackCopyPoison mode
causes an earlier bad dereference instead.
By keeping the Defer on the list, traceback can avoid
walking the Panic chain at all, making it okay for copystack
to update the Panics.
We'd have the same problem for any Defers on the stack.
There was only one: gopanic's dabort. Since we are not
taking the executing Defer off the chain, we can use it
to do what dabort was doing, and then there are no
Defers on the stack ever, so it is okay for traceback to use
the Defer chain even while copystack is executing:
copystack cannot modify the Defer chain.
LGTM=khr
R=khr
CC=dvyukov, golang-codereviews, iant, rlh
https://golang.org/cl/141490043
The C header files are the single point of truth:
every C enum constant Foo is available to Go as _Foo.
Remove or redirect duplicate Go declarations so they
cannot be out of sync.
Eventually we will need to put constants in Go, but for now having
them be out of sync with C is too risky. These predate the build
support for auto-generating Go constants from the C definitions.
LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/141510043
I have made mistake while converting it to Go (CL 132820043).
Added test as penance for my sin.
LGTM=rsc
R=golang-codereviews, bradfitz, rsc
CC=golang-codereviews
https://golang.org/cl/136560043
This file was already assigned to another CL
so it didn't make it into the build fix CL. Sigh.
TBR=iant
CC=golang-codereviews
https://golang.org/cl/144850043
This is necessary because syscall.Syscall blocks, and the
garbage collector needs to be able to scan that frame while
it is blocked, and C frames have no garbage collection
information.
Windows builders are broken now due to this problem:
http://build.golang.org/log/152ca9a4be6783d3a8bf6e2f5b9fc265089728b6
LGTM=alex.brainman
R=alex.brainman
CC=golang-codereviews
https://golang.org/cl/144830043
The merged traceback was wrong for LR machines,
because traceback didn't pass lr to gentraceback.
Now that we have a test looking at traceback output
for a trap (the test of runtime.Breakpoint),
we caught this.
While we're here, fix a 'set and not used' warning.
Fixes arm build.
TBR=r
R=r
CC=golang-codereviews
https://golang.org/cl/143040043
It doesn't.
Fixes 386 build.
While we're here, mark runtime.asmcgocall as GO_ARGS,
so that it will work with stack copying. I don't think anything
that uses it can lead to a stack copy, but better safe than sorry.
Certainly the runtime.asmcgocall_errno variant needs
(and already has) GO_ARGS.
TBR=iant
CC=golang-codereviews
https://golang.org/cl/138400043
The argsize PCDATA was specifying the number of
bytes passed to a function call, so that if the function
did not specify its argument count, the garbage collector
could use the call site information to scan those bytes
conservatively. We don't do that anymore, so stop
generating the information.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/139530043
The goal here is to commit fully to having precise information
about stack frames. If we need information we don't have,
crash instead of assuming we should scan conservatively.
Since the stack copying assumes fully precise information,
any crashes during garbage collection that are introduced by
this CL are crashes that could have happened during stack
copying instead. Those are harder to find because stacks are
copied much less often than the garbage collector is invoked.
In service of that goal, remove ARGSIZE macros from
asm_*.s, change switchtoM to have no arguments
(it doesn't have any live arguments), and add
args and locals information to some frames that
can call back into Go.
LGTM=khr
R=khr, rlh
CC=golang-codereviews
https://golang.org/cl/137540043
Dmitriy changed all the execution to interpret the BitVector
as an array of bytes. Update the declaration and generation
of the bitmaps to match, to avoid problems on big-endian
machines.
LGTM=khr
R=khr
CC=dvyukov, golang-codereviews
https://golang.org/cl/140570044
makeFuncStub and methodValueStub are used by reflect as
generic function implementations. Each call might have
different arguments. Extract those arguments from the
closure data instead of assuming it is the same each time.
Because the argument map is now being extracted from the
function itself, we don't need the special cases in reflect.Call
anymore, so delete those.
Fixes an occasional crash seen when stack copying does
not update makeFuncStub's arguments correctly.
Will also help make it safe to require stack maps in the
garbage collector.
Derived from CL 142000044 by khr.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/143890044
The pid field in the Tos structure is a 32-bit value.
Loading a 64-bit word also brings in the next field
which is used for the profiling clock.
LGTM=0intro, aram
R=rsc, 0intro, aram
CC=golang-codereviews, mischief
https://golang.org/cl/139560044
The goal here is to allow assembly functions to appear in the middle
of a Go stack (having called other code) and still record enough information
about their pointers so that stack copying and garbage collection can handle
them precisely. Today, these frames are handled only conservatively.
If you write
func myfunc(x *float64) (y *int)
(with no body, an 'extern' declaration), then the Go compiler now emits
a liveness bitmap for use from the assembly definition of myfunc.
The bitmap symbol is myfunc.args_stackmap and it contains two bitmaps.
The first bitmap, in effect at function entry, marks all inputs as live.
The second bitmap, not in effect at function entry, marks the outputs
live as well.
In funcdata.h, define new assembly macros:
GO_ARGS opts in to using the Go compiler-generated liveness bitmap
for the current function.
GO_RESULTS_INITIALIZED indicates that the results have been initialized
and need to be kept live for the remainder of the function; it causes a
switch to the second generated bitmap for the assembly code that follows.
NO_LOCAL_POINTERS indicates that there are no pointers in the
local variables being stored in the function's stack frame.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/137520043
Tests will come in a separate CL after the funcdata stuff is resolved.
Update #8696
LGTM=iant, rsc
R=rsc, iant
CC=golang-codereviews
https://golang.org/cl/138330045
Just go ahead and do it, if something is wrong we'll throw.
Also rip out cc-generated arg ptr maps, they are useless now.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/133690045
Replacing gosched with Gosched broke some builds because
some of the call sites are at times when the stack cannot be grown.
TBR=khr
CC=golang-codereviews
https://golang.org/cl/142000043
A write *p = x that needs a write barrier (not all do)
now turns into runtime.writebarrierptr(p, x)
or one of the other variants.
The write barrier implementations are trivial.
The goal here is to emit the calls in the correct places
and to incur the cost of those function calls in the Go 1.4 cycle.
Performance on the Go 1 benchmark suite below.
Remember, the goal is to slow things down (and be correct).
We will look into optimizations in separate CLs, as part of
the process of comparing Go 1.3 against tip in order to make
sure Go 1.4 runs at least as fast as Go 1.3.
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 3118336716 3452876110 +10.73%
BenchmarkFannkuch11 3184497677 3211552284 +0.85%
BenchmarkFmtFprintfEmpty 89.9 107 +19.02%
BenchmarkFmtFprintfString 236 287 +21.61%
BenchmarkFmtFprintfInt 246 278 +13.01%
BenchmarkFmtFprintfIntInt 395 458 +15.95%
BenchmarkFmtFprintfPrefixedInt 343 378 +10.20%
BenchmarkFmtFprintfFloat 477 525 +10.06%
BenchmarkFmtManyArgs 1446 1707 +18.05%
BenchmarkGobDecode 14398047 14685958 +2.00%
BenchmarkGobEncode 12557718 12947104 +3.10%
BenchmarkGzip 453462345 472413285 +4.18%
BenchmarkGunzip 114226016 115127398 +0.79%
BenchmarkHTTPClientServer 114689 112122 -2.24%
BenchmarkJSONEncode 24914536 26135942 +4.90%
BenchmarkJSONDecode 86832877 103620289 +19.33%
BenchmarkMandelbrot200 4833452 4898780 +1.35%
BenchmarkGoParse 4317976 4835474 +11.98%
BenchmarkRegexpMatchEasy0_32 150 166 +10.67%
BenchmarkRegexpMatchEasy0_1K 393 402 +2.29%
BenchmarkRegexpMatchEasy1_32 125 142 +13.60%
BenchmarkRegexpMatchEasy1_1K 1010 1236 +22.38%
BenchmarkRegexpMatchMedium_32 232 301 +29.74%
BenchmarkRegexpMatchMedium_1K 76963 102721 +33.47%
BenchmarkRegexpMatchHard_32 3833 5463 +42.53%
BenchmarkRegexpMatchHard_1K 119668 161614 +35.05%
BenchmarkRevcomp 763449047 706768534 -7.42%
BenchmarkTemplate 124954724 134834549 +7.91%
BenchmarkTimeParse 517 511 -1.16%
BenchmarkTimeFormat 501 514 +2.59%
benchmark old MB/s new MB/s speedup
BenchmarkGobDecode 53.31 52.26 0.98x
BenchmarkGobEncode 61.12 59.28 0.97x
BenchmarkGzip 42.79 41.08 0.96x
BenchmarkGunzip 169.88 168.55 0.99x
BenchmarkJSONEncode 77.89 74.25 0.95x
BenchmarkJSONDecode 22.35 18.73 0.84x
BenchmarkGoParse 13.41 11.98 0.89x
BenchmarkRegexpMatchEasy0_32 213.30 191.72 0.90x
BenchmarkRegexpMatchEasy0_1K 2603.92 2542.74 0.98x
BenchmarkRegexpMatchEasy1_32 254.00 224.93 0.89x
BenchmarkRegexpMatchEasy1_1K 1013.53 827.98 0.82x
BenchmarkRegexpMatchMedium_32 4.30 3.31 0.77x
BenchmarkRegexpMatchMedium_1K 13.30 9.97 0.75x
BenchmarkRegexpMatchHard_32 8.35 5.86 0.70x
BenchmarkRegexpMatchHard_1K 8.56 6.34 0.74x
BenchmarkRevcomp 332.92 359.62 1.08x
BenchmarkTemplate 15.53 14.39 0.93x
LGTM=rlh
R=rlh
CC=dvyukov, golang-codereviews, iant, khr, r
https://golang.org/cl/136380043
The uses of onM in dopanic/startpanic are okay even from the signal stack.
Fixes#8666.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/134710043
They will both need write barriers at some point.
But until then, no reason why we shouldn't share.
LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/141330043
The previous implementation had several subtle issues. It's not
clear if any of these could actually be causing the flakiness
problems on openbsd/386, but fixing them should only help.
1. thrsleep() is implemented internally as unlock, then test *abort
(if abort != nil), then tsleep(). Under the current code, that makes
it theoretically possible that semasleep()/thrsleep() could release
waitsemalock, then a racing semawakeup() could acquire the lock,
increment waitsemacount, and call thrwakeup()/wakeup() before
thrsleep() reaches tsleep(). (In practice, OpenBSD's big kernel lock
seems unlikely to let this actually happen.)
The proper way to avoid this is to pass &waitsemacount as the abort
pointer to thrsleep so thrsleep knows to re-check it before going to
sleep, and to wakeup if it's non-zero. Then we avoid any races.
(I actually suspect openbsd's sema{sleep,wakeup}() could be further
simplified using cas/xadd instead of locks, but I don't want to be
more intrusive than necessary so late in the 1.4 release cycle.)
2. semasleep() takes a relative sleep duration, but thrsleep() needs
an absolute sleep deadline. Instead of recomputing the deadline each
iteration, compute it once up front and use (*Timespec)(nil) to signify
no deadline. Ensures we retry properly if there's a spurious wakeup.
3. Instead of assuming if thrsleep() woke up and waitsemacount wasn't
available that we must have hit the deadline, check that the system
call returned EWOULDBLOCK.
4. Instead of assuming that 64-bit systems are little-endian, compute
timediv() using a temporary int32 nsec and then assign it to tv_nsec.
LGTM=iant
R=jsing, iant
CC=golang-codereviews
https://golang.org/cl/137960043
A race exists between the parent and child processes after a fork.
The child needs to access the new M pointer passed as an argument
but the parent may have already returned and clobbered it.
Previously, we avoided this by saving the necessary data into
registers before the rfork system call but this isn't guaranteed
to work because Plan 9 makes no promises about the register state
after a system call. Only the 386 kernel seems to save them.
For amd64 and arm, this method won't work.
We eliminate the race by allocating stack space for the scheduler
goroutines (g0) in the per-process copy-on-write stack segment and
by only calling rfork on the scheduler stack.
LGTM=aram, 0intro, rsc
R=aram, 0intro, mischief, rsc
CC=golang-codereviews
https://golang.org/cl/110680044
The only thing I can see that is really Plan 9-specific
is that the stack pointer used for signal handling used
to have more mapped memory above it.
Specifically it used to have at most 88 bytes (StackTop),
so change the allocation of a 40-byte frame to a 128-byte frame.
No idea if this will work, but worth a try.
Note that "fix" here means get it back to timing out
instead of crashing.
TBR=iant
CC=golang-codereviews
https://golang.org/cl/142840043
The difference between the old and the new (from earlier) code
is that we set stackguard = stack.lo + StackGuard, while the old
code set stackguard = stack.lo. That 512 bytes appears to be
the difference between the profileloop function running and not running.
We don't know how big the system stack is, but it is likely MUCH bigger than 4k.
Give Go/C 8k.
TBR=iant
CC=golang-codereviews
https://golang.org/cl/140440044
Start the stack a few words below the actual top, so that
if something tries to read goexit's caller PC from the stack,
it won't fault on a bad memory address.
Today, heapdump does that.
Maybe tomorrow, traceback or something else will do that.
Make it not a bug.
TBR=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/136450043
Commit to stack copying for stack growth.
We're carrying around a surprising amount of cruft from older schemes.
I am confident that precise stack scans and stack copying are here to stay.
Delete fallback code for when precise stack info is disabled.
Delete fallback code for when copying stacks is disabled.
Delete fallback code for when StackCopyAlways is disabled.
Delete Stktop chain - there is only one stack segment now.
Delete M.moreargp, M.moreargsize, M.moreframesize, M.cret.
Delete G.writenbuf (unrelated, just dead).
Delete runtime.lessstack, runtime.oldstack.
Delete many amd64 morestack variants.
Delete initialization of morestack frame/arg sizes (shortens split prologue!).
Replace G's stackguard/stackbase/stack0/stacksize/
syscallstack/syscallguard/forkstackguard with simple stack
bounds (lo, hi).
Update liblink, runtime/cgo for adjustments to G.
LGTM=khr
R=khr, bradfitz
CC=golang-codereviews, iant, r
https://golang.org/cl/137410043
I have found better approach, then longer wait.
See CL 134360043 for details.
««« original CL description
runtime/pprof: adjust cpuHogger so that tests pass on windows builders
LGTM=rsc
R=dvyukov, rsc
CC=golang-codereviews
https://golang.org/cl/140110043
»»»
LGTM=dave
R=golang-codereviews, dave, dvyukov
CC=golang-codereviews
https://golang.org/cl/133500043
I assumed they were the same when I wrote
cgocallback.go earlier today. Merge them
to eliminate confusion.
I can't tell what gomallocgc did before with
a nil type but without FlagNoScan.
I created a call like that in cgocallback.go
this morning, translating from a C file.
It was supposed to do what the C version did,
namely treat the block conservatively.
Now it will.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/141810043
It already is updating parts of them; we're just getting lucky
retraversing them and not finding much to do.
Change argp to a pointer so that it will be updated too.
Existing tests break if you apply the change to adjustpanics
without also updating the type of argp.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/139380043
It worked at CL 134660043 on the builders,
so I believe it will stick this time.
LGTM=bradfitz
R=khr, bradfitz
CC=golang-codereviews
https://golang.org/cl/141280043
This should make deferreturn nosplit all the way down,
which should fix the current windows/amd64 failure.
If not, I will change StackCopyAlways back to 0.
TBR=khr
CC=golang-codereviews
https://golang.org/cl/135600043
Let's see how close we are to this being ready.
Will roll back if it breaks any builds in non-trivial ways.
LGTM=r, khr
R=iant, khr, r
CC=golang-codereviews
https://golang.org/cl/138200043