The struct32 and struct40 structs are already declared; remove the
duplicate declarations to make the runtime tests build.
Change-Id: I3814f2b850dcb15c4002a3aa22e2a9326e5a5e53
Reviewed-on: https://go-review.googlesource.com/55614
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Instead of checking whether the number of elements
fits into memory, check whether the memory size of the
slice's backing memory is larger than the memory limit.
This avoids a division or maxElems lookup.
With et.size > 0:
uintptr(newcap) > maxSliceCap(et.size)
-> uintptr(int(capmem / et.size)) > _MaxMem / et.size
-> capmem / et.size > _MaxMem / et.size
-> capmem > _MaxMem
Note that, due to integer division, capmem > _MaxMem does not
imply uintptr(newcap) > maxSliceCap(et.size).
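A standalone sketch of the new check (the names maxMem and growCheck
are illustrative, not the runtime's; the real code in runtime/slice.go
also rounds capmem up to a size class):

    package main

    import "fmt"

    // Illustrative memory limit; the runtime's _MaxMem is platform-dependent.
    const maxMem = 1 << 30

    // growCheck compares the byte size of the grown backing array against
    // the memory limit directly, instead of dividing the limit by the
    // element size (or looking it up in a maxElems table) to compare
    // element counts.
    func growCheck(newcap, elemSize uintptr) error {
        capmem := newcap * elemSize
        if elemSize != 0 && capmem/elemSize != newcap {
            return fmt.Errorf("growslice: cap out of range") // multiplication overflowed
        }
        if capmem > maxMem {
            return fmt.Errorf("growslice: cap out of range")
        }
        return nil
    }

    func main() {
        fmt.Println(growCheck(8, 8))     // <nil>
        fmt.Println(growCheck(1<<28, 8)) // growslice: cap out of range
    }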
Consolidated runtime GrowSlice benchmarks by using sub-benchmarks and
added more struct sizes to show performance improvement when division
is avoided for element sizes larger than 32 bytes.
AMD64:
GrowSlice/Byte 38.9ns ± 2% 38.9ns ± 1% ~ (p=0.974 n=20+20)
GrowSlice/Int 58.3ns ± 3% 58.0ns ± 2% ~ (p=0.154 n=20+19)
GrowSlice/Ptr 95.7ns ± 2% 95.1ns ± 2% -0.60% (p=0.034 n=20+20)
GrowSlice/Struct/24 95.4ns ± 1% 93.9ns ± 1% -1.54% (p=0.000 n=19+19)
GrowSlice/Struct/32 110ns ± 1% 108ns ± 1% -1.76% (p=0.000 n=19+20)
GrowSlice/Struct/40 138ns ± 1% 128ns ± 1% -7.09% (p=0.000 n=20+20)
Change-Id: I1c37857c74ea809da373e668791caffb6a5cbbd3
Reviewed-on: https://go-review.googlesource.com/53471
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
We weren't initializing this field for dynamically-generated itabs.
Turns out it doesn't matter, as any time we use this field we also
generate a static itab for the interface type / concrete type pair.
But we should initialize it anyway, just to be safe.
Performance on the benchmarks in CL 44339:
benchmark old ns/op new ns/op delta
BenchmarkItabFew-12 1040585 26466 -97.46%
BenchmarkItabAll-12 228873499 4287696 -98.13%
Change-Id: I58ed2b31e6c98b584122bdaf844fee7268b58295
Reviewed-on: https://go-review.googlesource.com/44475
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
We don't use it any more, remove it.
Change-Id: I76ce1a4c2e7048fdd13a37d3718b5abf39ed9d26
Reviewed-on: https://go-review.googlesource.com/44474
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Just use fun[0]==0 to indicate a bad itab.
Change-Id: I28ecb2d2d857090c1ecc40b1d1866ac24a844848
Reviewed-on: https://go-review.googlesource.com/44473
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Keep itabs in a growable hash table.
Use a simple open-addressed hash table with quadratic probing, sized
as a power of two.
Synchronization gets a bit more tricky. The common read path now
has two atomic reads, one to get the table pointer and one to read
the entry out of the table.
I set the max load factor to 75%, kind of arbitrarily. There's a
space-speed tradeoff here, and I'm not sure where we should land.
Because we use open addressing the itab.link field is no longer needed.
I'll remove it in a separate CL.
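A toy model of the lookup (all names here are illustrative stand-ins,
with strings in place of (interface type, concrete type) pairs; the
runtime's read path additionally loads the table pointer and each entry
atomically):

    package main

    import "fmt"

    // Open addressing with quadratic probing over a power-of-two sized
    // array: offsets 1, 2, 3, ... give probe positions h, h+1, h+3, h+6,
    // ... (triangular numbers), which visit every slot when the size is
    // a power of two.
    type itabTable struct {
        mask    uint32
        entries []string
    }

    func newItabTable(size uint32) *itabTable { // size must be a power of two
        return &itabTable{mask: size - 1, entries: make([]string, size)}
    }

    func hash(s string) uint32 {
        h := uint32(2166136261) // FNV-1a
        for i := 0; i < len(s); i++ {
            h = (h ^ uint32(s[i])) * 16777619
        }
        return h
    }

    func (t *itabTable) find(key string) bool {
        h := hash(key) & t.mask
        for i := uint32(1); ; i++ {
            switch t.entries[h] {
            case key:
                return true
            case "":
                return false
            }
            h = (h + i) & t.mask
        }
    }

    func (t *itabTable) insert(key string) {
        // The real table grows at 75% load; this toy assumes it never fills.
        h := hash(key) & t.mask
        for i := uint32(1); ; i++ {
            if t.entries[h] == "" || t.entries[h] == key {
                t.entries[h] = key
                return
            }
            h = (h + i) & t.mask
        }
    }

    func main() {
        t := newItabTable(8)
        t.insert("io.Reader/os.File")
        fmt.Println(t.find("io.Reader/os.File"), t.find("io.Writer/bytes.Buffer"))
    }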
Fixes #20505
Change-Id: Ifb3d9a337512d6cf968c1fceb1eeaf89559afebf
Reviewed-on: https://go-review.googlesource.com/44472
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Last runtime use was removed in https://golang.org/cl/133700043,
September 2014.
Replace plan9 syscall uses with a plan9-specific variable.
Change-Id: Ifb910c021c1419a7c782959f90b054ed600d9e19
Reviewed-on: https://go-review.googlesource.com/55450
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The preceding cleanup made it clear that two cases
(have golden data, unreachable key) are handled identically.
Simplify the control flow to reflect that.
Simplifies the code and generates shorter machine code.
Change-Id: Id612e0da6679813e855506f47222c58ea6497d70
Reviewed-on: https://go-review.googlesource.com/55093
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This change unifies the x and y cases.
It shrinks evacuate's machine code by ~25% and its stack size by ~15%.
It also eliminates a critical branch.
Whether an entry should go to x or y is designed to be unpredictable.
As a result, half of the branch predictions for useX were wrong.
Mispredicting that branch can easily incur an expensive cache miss.
Switching to an xy array allows elimination of that branch,
which in turn reduces cache misses.
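A toy showing the pattern (evacDst and the hash bit are stand-ins for
the runtime's evacuation state, not its actual declarations):

    package main

    import "fmt"

    // evacDst stands in for the per-destination evacuation state.
    type evacDst struct{ count int }

    func main() {
        var xy [2]evacDst // xy[0] is the x destination, xy[1] is y
        hashBits := []uint32{0, 1, 1, 0, 1}
        for _, b := range hashBits {
            // Index by the (unpredictable) bit instead of branching on it,
            // so there is no branch for the CPU to mispredict.
            dst := &xy[b&1]
            dst.count++
        }
        fmt.Println(xy[0].count, xy[1].count) // 2 3
    }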
Change-Id: Ie9cef53744b96c724c377ac0985b487fc50b49b1
Reviewed-on: https://go-review.googlesource.com/54653
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Make the calculation of k and v a bit lazier.
None of the following code cares about indirect-vs-direct k,
and it happens on all code paths, so check t.indirectkey earlier.
Simplifies the code and reduces both machine code and stack size.
Change-Id: I5ea4c0772848d7a4b15383baedb9a1f7feb47201
Reviewed-on: https://go-review.googlesource.com/55092
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This avoids division and multiplication.
Instrumentation suggests that this is a very common case.
Change-Id: I2d5d5012d4f4df4c4af1f9f85ca9c323c9889c0e
Reviewed-on: https://go-review.googlesource.com/54657
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This avoids the never-triggered capacity checks in newarray.
Change-Id: Ib72b204adcb9e3fd3ab963defe0cd40e22d5d492
Reviewed-on: https://go-review.googlesource.com/54731
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This makes sure that its argument is marked live on entry.
We need its arg to be live so defers of KeepAlive get
scanned correctly by the GC.
Fixes #21402
Change-Id: I906813e433d0e9726ca46483723303338da5b4d7
Reviewed-on: https://go-review.googlesource.com/55150
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This change replaces the current runtime capabilities check for ppc64x with the
new internal/cpu package. It also adds support for the new POWER9 ISA and
capabilities.
Updates #15403
Change-Id: I5b64a79e782f8da3603e5529600434f602986292
Reviewed-on: https://go-review.googlesource.com/53830
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
This generates better code.
Masking B in the return statement should be unnecessary,
but the compiler is understandably not yet clever enough to see that.
Someday, it'd also be nice for the compiler to generate
a CMOV for the saturation if statement.
Change-Id: Ie1c157b21f5212610da1f3c7823a93816b3b61b9
Reviewed-on: https://go-review.googlesource.com/54656
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Combine conditions into a single if statement.
This is more readable.
It should generate identical machine code, but it doesn't.
The new code is shorter.
Change-Id: I9bf52f8f288b0df97a2b9b4e4183f6ca74175e8a
Reviewed-on: https://go-review.googlesource.com/54651
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
This reduces the wall time to run these benchmarks by about 30%.
Change-Id: I494a93c93e5acb1514510d85f65796f62e1629a5
Reviewed-on: https://go-review.googlesource.com/54650
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Currently we only support finding symbols in the VDSO using the old
DT_HASH. These days everything uses DT_GNU_HASH instead. To keep up
with the times and future-proof against DT_HASH disappearing from the
VDSO in the future, this commit adds support for DT_GNU_HASH and
prefers it over DT_HASH.
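For reference, a sketch of the hash function DT_GNU_HASH uses (the
well-known dl_new_hash; the full lookup also consults the GNU-style
bloom filter, bucket, and chain arrays, omitted here):

    package main

    import "fmt"

    // gnuHash implements the DT_GNU_HASH hash function:
    // h(0) = 5381, h(i) = h(i-1)*33 + c(i).
    func gnuHash(name string) uint32 {
        h := uint32(5381)
        for i := 0; i < len(name); i++ {
            h = h*33 + uint32(name[i])
        }
        return h
    }

    func main() {
        fmt.Printf("%#x\n", gnuHash("__vdso_clock_gettime"))
    }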
Tested by making sure it found a DT_GNU_HASH section and all of the
expected symbols in it, and then disabling the DT_GNU_HASH path and
making sure the old DT_HASH path still found all of the symbols.
Fixes #19649.
Change-Id: I508c8b35a019330d2c32f04f3833b69cb2686f13
Reviewed-on: https://go-review.googlesource.com/45511
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The dieFromSignal runtime function attempts to forward crashing
signals to a signal handler registered before the runtime was
initialized, if any. However, on Darwin, a special signal handler
trampoline is invoked, even for non-Go signal handlers.
Clear the crashing signal's handlingSig entry to ensure sigtramp
forwards the signal.
Fixes the darwin/386 builder.
Updates #20392
Updates #19389
Change-Id: I441a3d30c672cdb21ed6d8f1e1322d7c0e5b9669
Reviewed-on: https://go-review.googlesource.com/55032
Run-TryBot: Elias Naur <elias.naur@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
According to http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html,
pthread_key_create returns an error number, which is greater than or
equal to 0. I don't know of a scenario in which pthread_setspecific
would fail, but we also don't know the future. Add some error handling
just in case.
Change-Id: I0774b79ef658d67e300f4a9aab1f2e3879acc7ee
Reviewed-on: https://go-review.googlesource.com/54811
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
_main has an early check to determine whether a binary is statically or
dynamically linked; the check depends on R0 being zero. R0 is not
guaranteed to be zero at that point, and this was breaking Go on Alpine
for ppc64le.
Change-Id: I4a1059ff7fd3db6fc489e7dcfe631c1814dd965b
Reviewed-on: https://go-review.googlesource.com/54730
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Although mincore is declared in stubs.go, it isn't used by any
OS except Linux. Move it to os_linux.go and clean up unused code.
Change-Id: I6cfb0fed85c0317a4d091a2722ac55fa79fc7c9a
Reviewed-on: https://go-review.googlesource.com/54910
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This computes the maximum possible waste in a size class due to both
internal and external fragmentation as a percent of the span size.
This parallels the reasoning about overhead in the comment at the top
of mksizeclasses.go and confirms that comment's assertion that (except
for the few smallest size classes), none of the size classes have
worst-case internal and external fragmentation simultaneously.
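A self-contained version of the computation (the formula follows the
reasoning sketched in that comment; mksizeclasses.go's exact code may
differ in detail):

    package main

    import "fmt"

    // maxWaste returns the worst-case waste of a size class as a fraction
    // of the span: internal fragmentation assumes every object is one byte
    // too big for the previous (smaller) class, and external fragmentation
    // is the tail of the span that cannot fit another object.
    func maxWaste(size, prevSize, spanSize int) float64 {
        objects := spanSize / size
        internal := (size - prevSize - 1) * objects
        external := spanSize - objects*size
        return float64(internal+external) / float64(spanSize)
    }

    func main() {
        // e.g. the 48-byte class on an 8KB span, preceded by the 32-byte class
        fmt.Printf("%.2f%%\n", 100*maxWaste(48, 32, 8192)) // 31.52%
    }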
Change-Id: Idb66fe6c241d56f33d391831d4cd5a626955562b
Reviewed-on: https://go-review.googlesource.com/49370
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Before this CL, whenever the Go runtime wanted to kill its own
process with a signal dieFromSignal would reset the signal handler
to _SIG_DFL.
Unfortunately, if any signal handler were installed before the Go
runtime initialized, it wouldn't be invoked either.
Instead, use whatever signal handler was installed before
initialization.
The motivating use case is Crashlytics on Android. Before this CL,
Crashlytics would not consider a crash from a panic() since the
corresponding SIGABRT never reached its signal handler.
Updates #11382
Updates #20392 (perhaps even fixes it)
Fixes #19389
Change-Id: I0c8633329433b45cbb3b16571bea227e38e8be2e
Reviewed-on: https://go-review.googlesource.com/49590
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
To avoid gigantic core dumps, the runtime avoids raising SIGABRT
on crashes on 64-bit Darwin systems. Mobile OSes (probably) don't
generate huge core dumps, so to aid crash reporters, allow SIGABRT
on crashes on darwin/arm64.
Change-Id: I4a29608f400967d76f9bd0643fea22244c2da9df
Reviewed-on: https://go-review.googlesource.com/49770
Run-TryBot: Elias Naur <elias.naur@gmail.com>
Reviewed-by: Avelino <t@avelino.xxx>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This change enables buildmode c-shared on ppc64le.
A bug was fixed in runtime/rt0_linux_ppc64le.s that was necessary to
make this work. In _rt0_ppc64le_linux_lib, there is code to store
the value of r2 onto the caller's stack. However, if this file
is compiled using a build mode that maintains the TOC address in
r2, then instructions will be inserted at the beginning of this
function to generate the r2 value for the callee, not the caller.
That means the r2 value for the callee is stored onto the caller's
stack. If caller and callee don't have the same r2 values, then
the caller will restore the wrong r2 value after it returns. This
situation can happen when using dlopen since the caller of this
function will be in ld64.so and will definitely have a different
TOC.
Updates #20756
Change-Id: I6e165e0d0716e73721bbbcc520e8302e4856e3ba
Reviewed-on: https://go-review.googlesource.com/53890
Reviewed-by: Ian Lance Taylor <iant@golang.org>
We use lock-free reads from mheap.spans, but the safety of these is
somewhat subtle. Document this.
Change-Id: I928c893232176135308e38bed788d5f84ff11533
Reviewed-on: https://go-review.googlesource.com/54310
Reviewed-by: Rick Hudson <rlh@golang.org>
The compiler is now smart enough not to insert a bounds check.
Not only is this simpler, it eliminates a LEAQ from the
generated code.
Change-Id: Ie90cbd11584542edd99edd5456d9b02c406e8063
Reviewed-on: https://go-review.googlesource.com/53892
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
It appears that this was just missed
by accident in the original implementation.
Change-Id: Id87147bcb7a685d624eac7034342a305ad644e7a
Reviewed-on: https://go-review.googlesource.com/53891
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Avelino <t@avelino.xxx>
According to http://infocenter.arm.com:
* ARM Cortex-A53 (Raspberry Pi 3, Pine A64)
* ARM Cortex-A57 (Opteron A1100, Tegra X1)
* ARM Cortex-A72
all have a cache line size of 64 bytes.
Change-Id: I4b333e930792fb1a221b3ca6f395bfa1b7762afa
Reviewed-on: https://go-review.googlesource.com/43250
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The only non-test user of the assembler prefetch functions is the
heapBits.prefetch function which is itself unused.
The runtime prefetch functions have no functionality on most platforms
and are not inlineable since they are written in assembler. The function
call overhead eliminates the performance gains that could be achieved with
prefetching and would degrade performance for platforms where the functions
are no-ops.
If prefetch functions are needed back again later they can be improved
by avoiding the function call overhead and implementing them as intrinsics.
Change-Id: I52c553cf3607ffe09f0441c6e7a0a818cb21117d
Reviewed-on: https://go-review.googlesource.com/44370
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
We need to make sure that when the key contains a pointer, we use
a write barrier to update the key.
Also mapdelete_* should use typedmemclr.
Fixes #21297
Change-Id: I63dc90bec1cb909c2c6e08676c9ec853d736cdf8
Reviewed-on: https://go-review.googlesource.com/53414
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The activeModules function is called by the cgo pointer checking code,
which is called by the write barrier (when GODEBUG=cgocheck=2), and as
such must be nosplit/nowritebarrier.
Fixes #21306
Change-Id: I57f2124f14de7f3872b2de9532abab15df95d45a
Reviewed-on: https://go-review.googlesource.com/53352
Reviewed-by: Austin Clements <austin@google.com>
We lazily map the bitmap and spans areas as the heap grows. However,
right now we're very slightly too lazy. Specifically, the following
can happen on 32-bit:
1. mallocinit fails to allocate any heap arena, so
arena_used == arena_alloc == arena_end == bitmap.
2. There's less than 256MB between the end of the bitmap mapping and
the next mapping.
3. On the first allocation, mheap.sysAlloc sees that there's not
enough room in [arena_alloc, arena_end) because there's no room at
all. It gets a 256MB mapping from somewhere *lower* in the address
space than arena_used and sets arena_alloc and arena_end to this
hole.
4. Since the new arena_alloc is lower than arena_used, mheap.sysAlloc
doesn't bother to call mheap.setArenaUsed, so we still don't have a
bitmap mapping or a spans array mapping.
5. mheap.grow, which called mheap.sysAlloc, attempts to fill in the
spans array and crashes.
Fix this by mapping the metadata regions for the initial arena_used
when the heap is initialized, rather than trying to wait for an
allocation. This maintains the intended invariant that the structures
are always mapped for [arena_start, arena_used).
Fixes #21044.
Change-Id: I4422375a6e234b9f979d22135fc63ae3395946b0
Reviewed-on: https://go-review.googlesource.com/51714
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Right now, if it's possible to grow the arena reservation but
mheap.sysAlloc fails to get 256MB more of memory, it simply fails.
However, on 32-bit we have a fallback path that uses much smaller
mmaps and could handle this situation, but it fails to kick in.
This commit fixes mheap.sysAlloc to use a common failure path in case
it can't grow the reservation. On 32-bit, this path includes the
fallback.
Ideally, mheap.sysAlloc would attempt smaller reservation growths
first, but taking the fallback path is a simple change for Go 1.9.
Updates #21044 (fixes one of two issues).
Change-Id: I1e0035ffba986c3551479d5742809e43da5e7c73
Reviewed-on: https://go-review.googlesource.com/51713
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
64bit atomics on mips/mipsle are implemented using spinlocks. If SIGPROF
is received while the program is in the critical section, it will try to
write the sample using the same spinlock, creating a deadloop.
Prevent it by keeping a counter of SIGPROFs received during atomic64
operations and postponing writing the sample(s) until called from
elsewhere, with pc set to _LostSIGPROFDuringAtomic64.
Added a test case, per Cherry's suggestion. Works around #20146.
Change-Id: Icff504180bae4ee83d78b19c0d9d6a80097087f9
Reviewed-on: https://go-review.googlesource.com/42652
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
The runtime tests may be invoked from a parent that has SIGQUIT
blocked. For example, Java invokes subprocesses this way. In this
situation, TestCrashDumpsAllThreads and TestPanicSystemstack will fail
because they depend on SIGQUIT to get tracebacks, and any subprocess
test that times out will fail to kill the subprocess.
Fix this by detecting if SIGQUIT is blocked and, if so, skipping tests
that depend on it and using SIGKILL to kill timed-out subprocesses.
Based on a fix by Carl Henrik Lunde in
https://golang.org/issue/19196#issuecomment-316145733
Fixes #19196.
Change-Id: Ia20bf15b96086487d0ef6b75239dcc260c21714c
Reviewed-on: https://go-review.googlesource.com/50330
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
If we are using vfork, and if something (such as TSAN) is intercepting
the sigaction function, then we must call the system call, not the
libc function. Otherwise the intercepted sigaction call in the child
may trash the data structures in the parent.
Change-Id: Id9588bfeaa934f32c920bf829c5839be5cacf243
Reviewed-on: https://go-review.googlesource.com/50251
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Austin Clements <austin@google.com>
Currently we trace mark assists even if they're satisfied entirely by
stealing. This means even if background marking is keeping up with
allocation, we'll still emit a trace event every N bytes of
allocation. The event will be a few microseconds, if that, but they're
frequent enough that, when zoomed out in the trace view, it looks like
all of the time is spent in mark assists even if almost none is.
Change this so we only emit a trace event if the assist actually has
to do assisting. This makes the traces of these events far more
useful.
Change-Id: If4aed1c413b814341ef2fba61d2f10751d00451b
Reviewed-on: https://go-review.googlesource.com/50030
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
tSweepTerm and pauseStart are supposed to be when STW was triggered,
but right now they're captured a bit before STW. Move these down to
immediately before we trigger STW.
Fixes #19590.
Change-Id: Icd48a5c4d45c9b36187ff986e4f178b5064556c1
Reviewed-on: https://go-review.googlesource.com/49612
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Currently, Windows stacks are either 128kB or 2MB depending on whether
the binary uses cgo. This is because we assume that Go system stacks
and the small amount of C code invoked by the standard library can
operate within smaller stacks, but general Windows C code assumes
larger stacks.
However, it's easy to call into arbitrary C code using the syscall
package on Windows without ever importing cgo into a binary. Such
binaries need larger system stacks even though they don't use cgo.
Fix this on 64-bit by increasing the system stack size to 2MB always.
This only costs address space, which is free enough on 64-bit to not
worry about. We keep (for now) the existing heuristic on 32-bit, where
address space comes at more of a premium.
Updates #20975.
Change-Id: Iaaaa9a2fcbadc825cddc797aaaea8d34ef8debf2
Reviewed-on: https://go-review.googlesource.com/49331
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
kicking off contributing again with a classic
Change-Id: Ifb0aed8f1dc854f85751ce0495967a3c4315128d
Reviewed-on: https://go-review.googlesource.com/49016
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
It seems that when too much other code is running on the system,
the testprogcgo code can overrun its timeouts.
Updates #18598.
Not marking the issue as fixed until it doesn't recur for some time.
Change-Id: Ieaf106b41986fdda76b1d027bb9d5e3fb805cc3b
Reviewed-on: https://go-review.googlesource.com/48233
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
SysV semaphore undo lists should be shared by threads, just like
several other resources listed in cloneFlags. Currently we don't do
this, but it probably doesn't affect anything because 1) probably
nobody uses SysV semaphores from Go and 2) Go-created threads never
exit until the process does. Beyond being the right thing to do,
user-level QEMU requires this flag because it depends on glibc to
create new threads and glibc uses this flag.
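The resulting flag set looks like this (a sketch mirroring the
cloneFlags constant in runtime/os_linux.go; numeric values are from
<linux/sched.h>):

    const (
        _CLONE_VM      = 0x100
        _CLONE_FS      = 0x200
        _CLONE_FILES   = 0x400
        _CLONE_SIGHAND = 0x800
        _CLONE_THREAD  = 0x10000
        _CLONE_SYSVSEM = 0x40000

        cloneFlags = _CLONE_VM | // share memory
            _CLONE_FS |
            _CLONE_FILES |
            _CLONE_SIGHAND |
            _CLONE_SYSVSEM | // share SysV semaphore undo lists (the fix)
            _CLONE_THREAD // revisit - okay for now
    )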
Fixes #20763.
Change-Id: I1d1dafec53ed87e0f4d4d432b945e8e68bb72dcd
Reviewed-on: https://go-review.googlesource.com/48170
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TestStackGrowth is currently a parallel test. However, it depends on a
20 second timeout, which is already dubious in a parallel test, and
became really problematic on slow builders when runtime.GC switched to
triggering concurrent GC instead of STW GC. Before that change, the
test spent much of its time in STW GC, so it wasn't *really* parallel.
After that change, it was competing with all of the other parallel
tests and GC likely started taking ~4 times longer. On most builders
the whole test runs in well under a second, but on the slow builders
that was enough to push it over the 20 second timeout.
Fix this by making the test serial.
Updates #19381 (probably fixes it, but we'll have to wait and see).
Change-Id: I21af7cf543ab07f1ec1c930bfcb355b0df75672d
Reviewed-on: https://go-review.googlesource.com/48110
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Elias Naur <elias.naur@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The current description refers to the outermost "frame" which can be
misleading. A user reading it can think it means a stack frame.
Change-Id: Ie2c7cb4b4db8f41572df206478ce3b46a0245a5d
Reviewed-on: https://go-review.googlesource.com/47850
Reviewed-by: Austin Clements <austin@google.com>
Currently, sysmon waits 60 ms during idle before relaxing. This is
primarily to avoid reducing the precision of short-duration timers. Of
course, if there are no short-duration timers, this wastes 60 ms
running the timer at high resolution.
Improve this by instead inspecting the time until the next timer fires
and relaxing the timer resolution immediately if the next timer won't
fire for a while.
Updates #20937.
Change-Id: If4ad0a565b65a9b3e8c4cdc2eff1486968c79f24
Reviewed-on: https://go-review.googlesource.com/47833
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Currently, sysmon relaxes the Windows timer resolution as soon as the
Go process becomes idle. However, if it's going idle because of a
short sleep (< 15.6 ms), this can turn that short sleep into a long
sleep (15.6 ms).
To address this, wait for 60 ms of idleness before relaxing the timer
resolution. It would be better to check the time until the next wakeup
and relax immediately if it makes sense, but there's currently no
interaction between sysmon and the timer subsystem, so adding this
simple delay is a much simpler and safer change for late in the
release cycle.
Fixes #20937.
Change-Id: I817db24c3bdfa06dba04b7bc197cfd554363c379
Reviewed-on: https://go-review.googlesource.com/47832
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
R11 is callee-save in the C ABI, but the temporary register in the Go
ABI. Currently it's being clobbered by runtime.addmoduledata, which
has to follow the C ABI. The observed effect of this was that
dl_open_worker was returning to a bad PC after it failed to restore
its SP, because it was using R11 as a frame pointer.
Fix this by saving R11 around addmoduledata.
Fixes #19674.
Change-Id: Iaacbcc76809a3aa536e9897770831dcbcb6c8245
Reviewed-on: https://go-review.googlesource.com/47831
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Currently only the rwmutex write lock prevents descheduling. The read
lock does not. This leads to the following situation:
1. A reader acquires the lock and gets descheduled.
2. GOMAXPROCS writers attempt to acquire the lock (or at least one
writer does, followed by readers). This blocks all of the Ps.
3. There is no 3. The descheduled reader never gets to run again
because there are no Ps, so it never releases the lock and the system
deadlocks.
Fix this by preventing descheduling while holding the read lock. This
requires also rewriting TestParallelRWMutexReaders to always create
enough GOMAXPROCS and to use non-blocking operations for
synchronization.
Fixes #20903.
Change-Id: Ibd460663a7e5a555be5490e13b2eaaa295fac39f
Reviewed-on: https://go-review.googlesource.com/47632
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The name LabelList was changed to LabelSet during the development of the
proposal [1], except in one function comment. This commit fixes that.
Fixes #20905.
[1] https://github.com/golang/go/issues/17280
Change-Id: Id4f48d59d7d513fa24b2e42795c2baa5ceb78f36
Reviewed-on: https://go-review.googlesource.com/47470
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
mheap.allocLarge just calls bestFitTreap and is the only caller of
bestFitTreap. Flatten these into a single function. Also fix their
comments: allocLarge claims to return exactly npages but can in fact
return a larger span, and h.freelarge is not in fact indexed by span
start address.
Change-Id: Ia20112bdc46643a501ea82ea77c58596bc96f125
Reviewed-on: https://go-review.googlesource.com/47315
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The Func type has allowed calling the Func.Name method on a nil pointer
since Go 1.2, where it returned an empty string. A regression introduced
by CL/37331 changed this behavior. This breaks code that lazily does
runtime.FuncForPC(pc).Name() without first checking that the returned
*Func is actually non-nil.
Fixes #20872
Change-Id: Iae9a2ebabca5e9d1f5a2cdaf2f30e9c6198fec4f
Reviewed-on: https://go-review.googlesource.com/47354
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Currently the execLock is a mutex, which has the unfortunate
side-effect of serializing all thread creation. This replaces it with
an rwmutex so threads can be created in parallel, but exec still
blocks thread creation.
Fixes #20738.
Change-Id: Ia8f30a92053c3d28af460b0da71176abe5fd074b
Reviewed-on: https://go-review.googlesource.com/47072
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently runtime.rwmutex is written to block the calling goroutine
rather than the calling thread. However, rwmutex was intended to be
used in the scheduler, which means it needs to be a thread-level
synchronization primitive.
Hence, this modifies rwmutex to synchronize threads instead of
goroutines. This has the consequence of making it write-barrier-free,
which is also important for using it in the scheduler.
The implementation makes three changes: it replaces the "w" semaphore
with a mutex, since this was all it was being used for anyway; it
replaces "writerSem" with a single pending M that parks on its note;
and it replaces "readerSem" with a list of Ms that park on their notes
plus a pass count that together emulate a counting semaphore. I
model-checked the safety and liveness of this implementation through
>1 billion schedules.
For #20738.
Change-Id: I3cf5a18c266a96a3f38165083812803510217787
Reviewed-on: https://go-review.googlesource.com/47071
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
When the dedicated mark worker runs, the scheduler won't run on that P
again until GC runs out of mark work. As a result, any goroutines in
that P's local run queue are stranded until another P steals them. In
a normally operating system this may take a long time, and in a 100%
busy system, the scheduler never attempts to steal from another P.
Fix this by draining the local run queue into the global run queue if
the dedicated mark worker has run for long enough. We don't do this
immediately upon scheduling the dedicated mark worker in order to
avoid destroying locality if the mark worker runs for a short time.
Instead, the scheduler delays draining the run queue until the mark
worker gets its first preemption request (and otherwise ignores the
preemption request).
Fixes #20011.
Change-Id: I13067194b2f062b8bdef25cb75e4143b7fb6bb73
Reviewed-on: https://go-review.googlesource.com/46610
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
When Stop is called on a channel, wait until all signals have been
delivered to the channel before returning.
Use atomic operations in sigqueue to communicate more reliably between
the os/signal goroutine and the signal handler.
Fixes #14571
Change-Id: I6c5a9eea1cff85e37a34dffe96f4bb2699e12c6e
Reviewed-on: https://go-review.googlesource.com/46003
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Linux's execve has (at the time of writing, and since v2.6.30) a bug
when run concurrently with clone: it can fail to set up some data
structures if the thread count before and after certain steps differs.
This is described better and in more detail by Colin King in the
Launchpad¹ and kernel² bugs. When a program written in Go runtime.Exec's
a setuid binary, this issue may cause the resulting process to not have
the expected uid. This patch works around the issue by using a mutex to
serialize exec and clone.
1. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672819
2. https://bugzilla.kernel.org/show_bug.cgi?id=195453
Fixes #19546
Change-Id: I126e87d1d9ce3be5ea4ec9c7ffe13f92e087903d
Reviewed-on: https://go-review.googlesource.com/43713
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This is a runtime version of sync.RWMutex that can be used by code in
the runtime package. The type is not quite the same, in that the zero
value is not valid.
For future use by CL 43713.
Updates #19546
Change-Id: I431eb3688add16ce1274dab97285f555b72735bf
Reviewed-on: https://go-review.googlesource.com/45991
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
They were failing when run on 32bit RFS, with 32bit gdb.
(mips64 builder now has 64bit RFS, with gdb 7.9.)
Leaving TestGdbPythonCgo disabled, it behaves as described in #18784.
Fixes #18173
Change-Id: I3c438cd5850b7bfd118ac6396f40c1208bac8c2d
Reviewed-on: https://go-review.googlesource.com/45874
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
These are used by DIV[U] and MOD[U] assembly instructions.
Add a test in the stdlib so we actually exercise linking
to these routines.
Update #19507
Change-Id: I0d8e19a53e3744abc0c661ea95486f94ec67585e
Reviewed-on: https://go-review.googlesource.com/45703
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Also add runtime· prefixes to the code that is still used.
Fixes #19507
Change-Id: Ib6da6b2a9e398061d3f93958ee1258295b6cc33b
Reviewed-on: https://go-review.googlesource.com/45699
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Currently, semrelease1 readies the next waiter before recording a
mutex event. However, if the next waiter is expecting to look at the
mutex profile, as is the case in TestMutexProfile, this may delay
recording the event too much.
Swap the order of these operations so semrelease1 records the mutex
event before readying the next waiter. This also means readying the
next waiter is the very last thing semrelease1 does, which seems
appropriate.
Fixes #19139.
Change-Id: I1a62063599fdb5d49bd86061a180c0a2d659474b
Reviewed-on: https://go-review.googlesource.com/45751
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Peter Weinberger <pjw@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Block all signals during a fork. In the parent process, after the
fork, restore the signal mask. In the child process, reset all
currently handled signals to the default handler, and then restore the
signal mask.
The effect of this is that the child will be operating using the same
signal regime as the program it is about to exec, as exec resets all
non-ignored signals to the default, and preserves the signal mask.
We do this so that in the case of a signal sent to the process group,
the child process will not try to run a signal handler while in the
precarious state after a fork.
Fixes #18600.
Change-Id: I9f39aaa3884035908d687ee323c975f349d5faaa
Reviewed-on: https://go-review.googlesource.com/45471
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
I was surprised to see readvarint show up in a cpu profile.
Use a few simple optimizations to speed up stack copying:
* Avoid making a copy of the cache.entries array or any of its elements.
* Use a shift instead of a signed division in stackmapdata.
* Change readvarint to return the number of bytes consumed
rather than an updated slice.
* Make some minor optimizations to readvarint to help the compiler.
* Avoid calling readvarint when the value fits in a single byte.
The first and last optimizations are the most significant,
although they all contribute a little.
Add a benchmark for stack copying that includes lots of different
functions in a recursive loop, to bust the cache.
This might speed up other runtime operations as well;
I only benchmarked stack copying.
name old time/op new time/op delta
StackCopy-8 96.4ms ± 2% 82.7ms ± 1% -14.24% (p=0.000 n=20+19)
StackCopyNoCache-8 167ms ± 1% 131ms ± 1% -21.58% (p=0.000 n=20+20)
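A sketch of the revised readvarint shape (assumed from the description
above; the real code lives in runtime/stack.go):

    package main

    import "fmt"

    // readvarint decodes a little-endian base-128 varint from p, returning
    // the number of bytes consumed and the value, rather than an updated
    // slice, so callers don't force a new slice header into memory.
    func readvarint(p []byte) (read uint32, val uint32) {
        var v, shift, n uint32
        for {
            b := p[n]
            n++
            v |= uint32(b&0x7F) << (shift & 31)
            if b&0x80 == 0 {
                break
            }
            shift += 7
        }
        return n, v
    }

    func main() {
        fmt.Println(readvarint([]byte{0x96, 0x01})) // 2 150
    }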
Change-Id: I13d5c455c65073c73b656acad86cf8e8e3c9807b
Reviewed-on: https://go-review.googlesource.com/43150
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
There are currently two arrays indexed by P ID: allp and pdesc.
Consolidate these by moving the pdesc fields into type p so they can
be indexed off allp along with all other per-P state.
For #15131.
Change-Id: Ib6c4e6e7612281a1171ba4a0d62e52fd59e960b4
Reviewed-on: https://go-review.googlesource.com/45572
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently MaxGomaxprocs is 256. The previous CL saved enough per-P
static space that we can quadruple MaxGomaxprocs (and hence the static
size of allp) and still come out ahead.
This is safe for Go 1.9. In Go 1.10 we'll eliminate the hard-coded
limit entirely.
Updates #15131.
Change-Id: I919ea821c1ce64c27812541dccd7cd7db4122d16
Reviewed-on: https://go-review.googlesource.com/45673
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Back in the day, allp was just a pointer to an array. As a result, the
runtime has a few loops of the form:
    for i := 0; ; i++ {
        p := allp[i]
        if p == nil {
            break
        }
        ...
    }
This is silly now because it requires that allp be one longer than the
maximum possible number of Ps, but now that allp is in Go it has a
length.
Replace these with range loops.
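That is, each of these loops becomes simply:

    for _, p := range allp {
        ...
    }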
Change-Id: I91ef4bc7bd3c9d4fda2264f4aa1b1d0271d7f578
Reviewed-on: https://go-review.googlesource.com/45571
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
ARM currently does not use a hardware yield instruction in the spin
loop in procyield because the YIELD instruction was only added in
ARMv6K. However, it appears earlier ARM chips will interpret the YIELD
encoding as an effective NOP (specifically an MSR instruction that
ultimately has no effect on the CPSR register).
Hence, use YIELD in procyield on ARM since it should be, at worst,
harmless.
Fixes #16663.
Change-Id: Id1787ac48862b785b92c28f1ac84cb4908d2173d
Reviewed-on: https://go-review.googlesource.com/45250
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
If we're in a situation where printing the fp and sp in the traceback
is useful, it's almost certainly also useful to print the PC.
Change-Id: Ie48a0d5de8a54b5b90ab1d18638a897958e48f70
Reviewed-on: https://go-review.googlesource.com/45210
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Both runtime.exit and syscall.Exit call Windows ExitProcess.
But recently (CL 34616) runtime.exit was changed to ignore
Windows CreateThread errors if ExitProcess is called.
This CL adjusts syscall.Exit to do the same.
Fixes #18253 (maybe)
Change-Id: I6496c31b01e7c7d73b69c0b2ae33ed7fbe06736b
Reviewed-on: https://go-review.googlesource.com/45115
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This adds diagnostics so we can tell if the finalizer has started, in
addition to whether or not it has finished.
Updates #19381.
Change-Id: Icb7b1b0380c9ad1128b17074828945511a6cca5d
Reviewed-on: https://go-review.googlesource.com/45138
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
runtime.GC no longer triggers a STW GC. This fixes the description of
GODEBUG=gctrace=1 so it doesn't claim otherwise.
Change-Id: Ibd34a55c5ae7b5eda5c2393b9a6674bdf1d51eb3
Reviewed-on: https://go-review.googlesource.com/45131
Reviewed-by: Rick Hudson <rlh@golang.org>
The current implementation of "goroutine N cmd" assumes it can get
goroutine N's state from the goroutine's sched buffer. But this only
works if the goroutine is blocked. Extend find_goroutine so that, if
there is no saved scheduler state for a goroutine, it tries to find
the thread the goroutine is running on and use the thread's current
register state. We also extend find_goroutine to understand saved
syscall register state.
Fixes #13887.
Change-Id: I739008a8987471deaa4a9da918655e4042cf969b
Reviewed-on: https://go-review.googlesource.com/45031
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently the extra Ms created for cgo callbacks have a corresponding
G that's kept in syscall state with only a call to goexit on its
stack. This leads to confusing output from runtime.NumGoroutines and
in tracebacks:
goroutine 17 [syscall, locked to thread]:
runtime.goexit()
        .../src/runtime/asm_amd64.s:2197 +0x1
Fix this by putting this goroutine into state _Gdead when it's not in
use instead of _Gsyscall. To keep the goroutine counts correct, we
also add one to sched.ngsys while the goroutine is in _Gdead. The
effect of this is as if the goroutine simply doesn't exist when it's
not in use.
Fixes #16631.
Fixes #16714.
Change-Id: Ieae08a2febd4b3d00bef5c23fd6ca88fb2bb0087
Reviewed-on: https://go-review.googlesource.com/45030
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The test is inherently racy, and for me fails about 0.05% of the time.
So only fail the test if it fails ten times in a row.
Fixes #20594
Change-Id: I3b3f7598f2196f7406f1a3937f38f21ff0c0e4b5
Reviewed-on: https://go-review.googlesource.com/45020
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
For cgo programs on linux-amd64 we call the C function mmap.
This supports programs such as the C memory sanitizer that need to
intercept all calls to mmap. It turns out that there are programs that
intercept both mmap and munmap, or that at least expect that if they
intercept mmap, they also intercept munmap. So, if we permit mmap
to be intercepted, also permit munmap to be intercepted.
No test, as it requires two odd things: a C program that intercepts
mmap and munmap, and a Go program that calls munmap.
Change-Id: Iec33f47d59f70dbb7463fd12d30728c24cd4face
Reviewed-on: https://go-review.googlesource.com/45016
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Try to avoid a race between the main goroutine exiting and a panic
occurring. Don't try too hard, to avoid hanging.
Updates #3934
Fixes #20018
Change-Id: I57a02b6d795d2a61f1cadd137ce097145280ece7
Reviewed-on: https://go-review.googlesource.com/41052
Reviewed-by: Austin Clements <austin@google.com>
C code expects CR2, CR3, and CR4 to be preserved across function calls.
Preserve the entire CR register across function calls in
_rt0_ppc64le_linux_lib and crosscall2. The standard ppc64le call frame
uses 8(R1) as the place to save CR; emulate that.
It's hard to write a reliable test for this as it requires writing C
code that sets CR2, CR3, or CR4 across a call to a Go function.
Change-Id: If39e771a5b574602b848227312e83598fe74eab7
Reviewed-on: https://go-review.googlesource.com/44733
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Since TestPingPongHog tests the scheduler, it's ultimately
probabilistic. Currently, it requires the result be at most of factor
of 2 off of the ideal. It turns out this isn't quite enough in
practice, with factors on 1000 iterations on linux/amd64 ranging from
0.48 to 2.5. If the test were failing, we would expect a factor closer
to 1000X, so it's pretty safe to expand the accepted factor from 2 to
5.
Fixes #20494.
Change-Id: If8f2e96194fe66f1fb981a965d1167fe74ff38d7
Reviewed-on: https://go-review.googlesource.com/44859
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
key32 is called between entersyscallblock and exitsyscall. A stack
split may occur if inlining is disabled and the G is preempted.
Fix the problem by marking key32 as a nosplit function.
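A minimal sketch of the fix (the real key32 is in runtime/lock_futex.go
and reinterprets a uintptr lock key as the uint32 a futex operates on):

    package main

    import "unsafe"

    // The nosplit pragma keeps key32 from growing the stack, which is not
    // safe between entersyscallblock and exitsyscall.
    //go:nosplit
    func key32(p *uintptr) *uint32 {
        return (*uint32)(unsafe.Pointer(p))
    }

    func main() {
        var k uintptr = 42
        println(*key32(&k)) // 42 on little-endian machines
    }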
Fixes #20510
Change-Id: I1f0787995936f34ef0052cf79fde036f1b338865
Reviewed-on: https://go-review.googlesource.com/44390
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
cmd/compile/internal/ld/decodesym.go is now
cmd/link/internal/ld/decodesym.go
Change-Id: I16ec5c89aa3507e70676c2b50d70f1fde533a085
Reviewed-on: https://go-review.googlesource.com/44373
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This avoids false-positive TSAN reports when using the C sigaction
function to read handlers registered by the Go runtime.
(Unfortunately, I can't seem to coax the runtime into reproducing the
failure in a small unit-test.)
Change-Id: I744279a163708e24b1fbe296ca691935c394b5f3
Reviewed-on: https://go-review.googlesource.com/44270
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Currently, the heap arena allocator allocates monotonically increasing
addresses. This is fine on 64-bit where we stake out a giant block of
the address space for ourselves and start at the beginning of it, but
on 32-bit the arena starts at address 0 but we start allocating from
wherever the OS feels like giving us memory. We can generally hint the
OS to start us at a low address, but this doesn't always work.
As a result, on 32-bit, if the OS gives us an arena block that's lower
than the current block we're allocating from, we simply say "thanks
but no thanks", return the whole (256MB!) block of memory, and then
take a fallback path that mmaps just the amount of memory we need
(which may be as little as 8K).
We have to do this because mheap_.arena_used is *both* the highest
used address in the arena and the next address we allocate from.
Fix all of this by separating the second role of arena_used out into a
new field called arena_alloc. This lets us accept any arena block the
OS gives us. This also slightly changes the invariants around
arena_end. Previously, we ensured arena_used <= arena_end, but this
was related to arena_used's second role, so the new invariant is
arena_alloc <= arena_end. As a result, we no longer necessarily update
arena_end when we're updating arena_used.
Fixes #20259 properly. (Unlike the original fix, this one should not
be cherry-picked to Go 1.8.)
This is reasonably low risk. I verified several key properties of the
32-bit code path with both 4K and 64K physical pages using a symbolic
model and the change does not materially affect 64-bit (arena_used ==
arena_alloc on 64-bit). The only oddity is that we no longer call
setArenaUsed with racemap == false to indicate that we're creating a
hole in the address space, but this only happened in a 32-bit-only
code path, and the race detector requires 64-bit, so this never
mattered anyway.
Change-Id: Ib1334007933e615166bac4159bf357ae06ec6a25
Reviewed-on: https://go-review.googlesource.com/44010
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
We weren't setting r0 to 0, as required by our generated code.
Before this patch, the misc/cgo/testcarchive tests failed on ppc64le.
After this patch, they work, so enable them.
Change-Id: I53b16746961da9f7c34f59030a1e40953c9c1e05
Reviewed-on: https://go-review.googlesource.com/44093
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Commit 4dcba023c6 replaced select with pselect6 on linux/amd64 and
linux/arm, but it turns out the Android emulator uses linux/386. This
makes the equivalent change there, too.
Fixes #20409 more.
Change-Id: If542d6ade06309aab8758d5f5f6edec201ca7670
Reviewed-on: https://go-review.googlesource.com/44011
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
There are two copies each of the stackPreempt/_StackPreempt and
stackFork/_StackFork constants. Remove the ones left over from C that
are no longer used.
Change-Id: I849604c72c11e4a0cb08e45e9817eb3f5a6ce8ba
Reviewed-on: https://go-review.googlesource.com/43638
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Setting stackCache to 0 to disable stack caches for debugging hasn't
worked for a long time. It causes stackalloc to fall back to full span
allocation, round sub-page stacks down to 0 pages, and blow up.
Fix this debug mode so it disables the per-P caches, but continues to
use the global stack pools for small stacks, which correctly handle
sub-page stacks. While we're here, rename stackCache to stackNoCache
so it acts like the rest of the stack allocator debug modes where "0"
is the right default value.
Fixes #17291.
Change-Id: If401c41cee3448513cbd7bb2e9334a8efab257a7
Reviewed-on: https://go-review.googlesource.com/43637
Reviewed-by: Keith Randall <khr@golang.org>
The stackFromSystem debug mode has two problems:
1) It rounds the stack allocation to _PageSize. If the physical page
size is >8K, this can cause unmapping the memory later to either
under-unmap or over-unmap.
2) It doesn't return the rounded-up allocation size to its caller, so
when we later unmap the memory, we may pass the wrong length.
Fix these problems by rounding the size up to the physical page size
and putting that rounded-up size in the returned stack bounds.
Fixes #17289.
Change-Id: I6b854af3b06bb16e3750798397bb5e2a722ec1cb
Reviewed-on: https://go-review.googlesource.com/43636
Reviewed-by: Keith Randall <khr@golang.org>
If mheap.sysAlloc doesn't have room in the heap arena for an
allocation, it will attempt to map more address space with sysReserve.
sysReserve is given a hint, but can return any unused address range.
Currently, mheap.sysAlloc incorrectly assumes the returned region will
never fall between arena_start and arena_used. If it does,
mheap.sysAlloc will blindly accept the new region as the new
arena_used and arena_end, causing these to decrease and make it so any
Go heap above the new arena_used is no longer considered part of the
Go heap. This assumption *used to be* safe because we had all memory
between arena_start and arena_used mapped, but when we switched to an
arena_start of 0 on 32-bit, it became no longer safe.
Most likely, we've only recently seen this bug occur because we
usually start arena_used just above the binary, which is low in the
address space. Hence, the kernel is very unlikely to give us a region
before arena_used.
Since mheap.sysAlloc is a linear allocator, there's not much we can do
to handle this well. Hence, we fix this problem by simply rejecting
the new region if it isn't after arena_end. In this case, we'll take
the fall-back path and mmap a small region at any address just for the
requested memory.
Fixes #20259.
Change-Id: Ib72e8cd621545002d595c7cade1e817cfe3e5b1e
Reviewed-on: https://go-review.googlesource.com/43870
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Android O black-lists the select system call because its libc, Bionic,
does not use this system call. Replace our use of select with pselect6
(which is allowed) on the platforms that support targeting Android.
linux/arm64 already uses pselect6 because there is no select on arm64,
so only linux/amd64 and linux/arm need changing. pselect6 has been
available since Linux 2.6.16, which is before Go's minimum
requirement.
Fixes #20409.
Change-Id: Ic526b5b259a9e01d2f145a1f4d2e76e8c49ce809
Reviewed-on: https://go-review.googlesource.com/43641
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
profileBuilder.locForPC returns 0 to mean "no location" because 0 is
an invalid location index. However, the code to build count profiles
doesn't check the result of locForPC, so this 0 location index ends up
in the profile's location list. This, in turn, causes problems later
when we decode the profile because it puts a nil *Location in the
sample's location slice, which can later lead to a nil pointer panic.
Fix this by making printCountProfile correctly discard the result of
locForPC if it returns 0. This makes this call match the other two
calls of locForPC.
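An illustrative shape of the fix (names assumed; a location index of 0
means "no location" and must never reach the profile's location list):

    package main

    import "fmt"

    func appendLocs(locs []uint64, stk []uintptr, locForPC func(uintptr) uint64) []uint64 {
        for _, addr := range stk {
            l := locForPC(addr)
            if l == 0 {
                continue // unsymbolizable PC, e.g. in runtime.goexit; drop it
            }
            locs = append(locs, l)
        }
        return locs
    }

    func main() {
        lookup := func(pc uintptr) uint64 { return uint64(pc % 3) } // toy symbolizer
        fmt.Println(appendLocs(nil, []uintptr{1, 3, 5}, lookup))    // [1 2]
    }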
Updates #15156.
Change-Id: I4492b3652b513448bc56f4cfece4e37da5e42f94
Reviewed-on: https://go-review.googlesource.com/43630
Reviewed-by: Michael Matloob <matloob@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
TestGoroutineCounts was flaky when running on a system under load.
This happened on three builds in the last couple of days.
Fix this by running the test with a single operating system thread, so
we do not depend on the operating system scheduler. 50,000 runs passed
without failure with the new version; the old version failed 0.5% of
the time.
Fixes #15156.
Change-Id: I1e5a18d0fef4f72cc9a56e376822b2849cdb0f8b
Reviewed-on: https://go-review.googlesource.com/43590
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
In low memory situations mmap(2) on Illumos[1] can return EAGAIN when it
is unable to reserve the necessary space for the requested mapping. Go
was not previously handling this correctly for Illumos and would fail to
recognize it was in a low-memory situation, the result being the program
would terminate with a panic instead of running the GC.
Fixes #14930
[1]: https://www.illumos.org/man/2/mmap
Change-Id: I889cc0547e23f9d6c56e4fdd7bcbd0e15403873a
Reviewed-on: https://go-review.googlesource.com/43461
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
On other systems we use "SWI $n". Change Plan 9 files to be
consistent. Generated binary is unchanged.
Fixes #20378.
Change-Id: Ia2a722061da2450c7b30cb707ed4f172fafecf74
Reviewed-on: https://go-review.googlesource.com/43533
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
LLV and SCV are 64-bit load-linked and store-conditional. They
were used in runtime as #define WORD. Change them to normal
instruction form.
NOOP is hardware no-op. It was written as WORD $0. Make a name
for it for better disassembly output.
Fixes#12561.
Fixes#18238.
Change-Id: I82c667ce756fa83ef37b034b641e8c4366335e83
Reviewed-on: https://go-review.googlesource.com/40297
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently proto symbolization uses runtime.FuncForPC and assumes each
PC maps to a single frame. This isn't true in the presence of inlining
(even with leaf-only inlining this can get incorrect results).
Change PC symbolization to use runtime.CallersFrames to expand each PC
to all of the frames at that PC.
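For reference, the public form of that expansion (standard
runtime.CallersFrames usage, not the CL's exact code):
    frames := runtime.CallersFrames(pcs) // pcs: []uintptr of captured PCs
    for {
        frame, more := frames.Next()
        // With inlining, one PC can yield several frames here.
        fmt.Printf("%s\n\t%s:%d\n", frame.Function, frame.File, frame.Line)
        if !more {
            break
        }
    }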
Change-Id: I8d20dff7495a5de495ae07f569122c225d433ced
Reviewed-on: https://go-review.googlesource.com/41256
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
Proto profile conversion is inconsistent about call vs return PCs in
profile locations. The proto defines locations to be call PCs. This is
what we do when proto-izing CPU profiles, but we fail to convert the
return PCs in memory and count profile stacks to call PCs when
converting them to proto locations.
Fix this in the heap and count profile conversion functions.
TestConvertMemProfile also hard-codes this failure to convert from
return PCs to call PCs, so fix up the addresses in the synthesized
profile to be return PCs while checking that we get call PCs out of
the conversion.
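The conversion itself is the standard one (illustrative, not quoted
from the CL):
    // A return PC points to the instruction after the call, so
    // subtracting one lands back inside the call instruction.
    callPC := returnPC - 1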
Change-Id: If1fc028b86fceac6d71a2d9fa6c41ff442c89296
Reviewed-on: https://go-review.googlesource.com/42951
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
runtime.gchelper depends on the non-atomic load of work.ndone
happening strictly before the atomic add of work.nwait. Until very
recently (commit 978af9c2db, fixing #20334), the compiler reordered
these operations. This created a race since work.ndone can change as
soon as work.nwait is equal to work.ndone. If that happened, more than
one gchelper could attempt to wake up the work.alldone note, causing a
"double wakeup" panic.
This was fixed in the compiler, but to make this code less subtle,
make the load of work.ndone atomic. This clearly forces the order of
these operations, ensuring the race doesn't happen.
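A sketch of the ordering, based purely on the description above
(simplified; not the verbatim runtime code):
    ndone := atomic.Load(&work.ndone) // atomic: cannot be reordered past the add
    if atomic.Xadd(&work.nwait, +1) == ndone {
        notewakeup(&work.alldone) // only one helper should reach this
    }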
Fixes#19305 (though really 978af9c2db fixed it).
Change-Id: Ieb1a84e1e5044c33ac612c8a5ab6297e7db4c57d
Reviewed-on: https://go-review.googlesource.com/43311
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This adds debugging information when we panic with "heapBitsForSpan:
base out of range".
Updates #20259.
Change-Id: I0dc1a106aa9e9531051c7d08867ace5ef230eb3f
Reviewed-on: https://go-review.googlesource.com/43310
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
They are not exported and not used in the compiler or standard library.
Change-Id: Ie1d210464f826742d282f12258ed1792cbd2d188
Reviewed-on: https://go-review.googlesource.com/43135
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Implements detection of x86 CPU features that
are used in the Go standard library.
Changes all standard library packages to use the new cpu package
instead of using runtime internal variables to check x86 cpu features.
Updates: #15403
Change-Id: I2999a10cb4d9ec4863ffbed72f4e021a1dbc4bb9
Reviewed-on: https://go-review.googlesource.com/41476
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
TestGoroutineCounts currently depends on timing to get 100 goroutines
to a known blocking point before taking a profile. This fails
frequently, with different goroutines captured at different stacks.
The test is disabled on openbsd because it was too flaky, but in fact
it flakes on all platforms.
Fix this by using Gosched instead of timing. This is both much more
reliable and makes the test run faster.
Fixes#15156.
Change-Id: Ia6e894196d717655b8fb4ee96df53f6cc8bc5f1f
Reviewed-on: https://go-review.googlesource.com/42953
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Because the hint parameter is supposed to be treated
purely as a hint, if it doesn't meet the requirements
we disregard it and continue as if there were no hint
at all.
Fixes#19926
Change-Id: I86e7f99472fad6b99ba4e2fd33e4a9e55d55115e
Reviewed-on: https://go-review.googlesource.com/40854
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Changes all CPU features to be detected and stored in bools in rt0_go.
Updates: #15403
Change-Id: I5a9961cdec789b331d09c44d86beb53833d5dc3e
Reviewed-on: https://go-review.googlesource.com/41950
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
overLoadFactor used a uintptr for its calculations.
When the number of potential buckets was large,
perhaps due to a coding error or corrupt/malicious user input
leading to a very large map size hint,
this led to overflow on 32-bit systems.
This overflow resulted in an infinite loop.
Prevent it by always using a 64-bit calculation.
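A sketch of the failure mode, with illustrative constants (8 and 6.5
stand in for the runtime's bucket count and load factor):
    // On 32-bit systems uintptr(1)<<B wraps to 0 once B reaches 32, so
    // "for overLoadFactor(hint, B) { B++ }" never terminates.
    func overLoadFactorBuggy(count int64, B uint8) bool {
        return count >= 8 && float32(count) >= 6.5*float32(uintptr(1)<<B)
    }
    // Fixed: do the shift in 64 bits, which cannot wrap for any valid B.
    func overLoadFactorFixed(count int64, B uint8) bool {
        return count >= 8 && float32(count) >= 6.5*float32(uint64(1)<<B)
    }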
Updates #20195
Change-Id: Iaabc710773cd5da6754f43b913478cc5562d89a2
Reviewed-on: https://go-review.googlesource.com/42185
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Currently Go sets the system-wide timer resolution to 1ms the whole
time it's running. This has negative effects on system performance and
power consumption. Unfortunately, simply reducing the timer resolution
to the default 15ms interferes with several sleeps in the runtime
itself, including sysmon's ability to interrupt goroutines.
This commit takes a hybrid approach: it only reduces the timer
resolution when the Go process is entirely idle. When the process is
idle, nothing needs a high resolution timer. When the process is
non-idle, it's already consuming CPU so it doesn't really matter if
the OS also takes timer interrupts more frequently.
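A sketch of the switch (simplified; timeBeginPeriod/timeEndPeriod are
the Windows multimedia-timer calls involved, and the idle detection
around this is elided):
    func osRelax(relax bool) {
        if relax {
            timeEndPeriod(1) // idle: drop back to the default ~15.6ms tick
        } else {
            timeBeginPeriod(1) // busy: request 1ms timer resolution
        }
    }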
Updates #8687.
Change-Id: I0652564b4a36d61a80e045040094a39c19da3b06
Reviewed-on: https://go-review.googlesource.com/38403
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Currently the pprof tests re-symbolize PCs in profiles, and do so in a
way that can't handle inlining. Proto profiles already contain full
symbol information, so this modifies the tests to use the symbol
information already present in the profile.
Change-Id: I63cd491de7197080fd158b1e4f782630f1bbbb56
Reviewed-on: https://go-review.googlesource.com/41255
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
Currently _TinySizeClass is untyped, which means it can accidentally
be used as a spanClass (not that I would know this from experience or
anything). Make it an int8 to avoid this mix up.
This is a cherry-pick of dev.garbage commit 81b74bf9c5.
Change-Id: I1e69eccee436ea5aa45e9a9828a013e369e03f1a
Reviewed-on: https://go-review.googlesource.com/41254
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This is no longer necessary now that we can more efficiently consult
the span's noscan bit.
This is a cherry-pick of dev.garbage commit 312aa09996.
Change-Id: Id0b00b278533660973f45eb6efa5b00f373d58af
Reviewed-on: https://go-review.googlesource.com/41252
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, we mix objects with pointers and objects without pointers
("noscan" objects) together in memory. As a result, for every object
we grey, we have to check that object's heap bits to find out if it's
noscan, which adds to the per-object cost of GC. This also hurts the
TLB footprint of the garbage collector because it decreases the
density of scannable objects at the page level.
This commit improves the situation by using separate spans for noscan
objects. This will allow a much simpler noscan check (in a follow-up
CL), eliminate the need to clear the bitmap of noscan objects (in a
follow-up CL), and improve TLB footprint by increasing the density of
scannable objects.
This is also a step toward eliminating dead bits, since the current
noscan check depends on checking the dead bit of the first word.
This has no effect on the heap size of the garbage benchmark.
We'll measure the performance change of this after the follow-up
optimizations.
This is a cherry-pick from dev.garbage commit d491e550c3. The only
non-trivial merge conflict was in updatememstats in mstats.go, where
we now have to separate the per-spanclass stats from the per-sizeclass
stats.
Change-Id: I13bdc4869538ece5649a8d2a41c6605371618e40
Reviewed-on: https://go-review.googlesource.com/41251
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
In particular, this says that Frames.Function uniquely identifies a
function within a program. We depend on this in various places that
use runtime.Frames in std, but it wasn't actually written down.
Change-Id: Ie7ede348c17673e11ae513a094862b60c506abc5
Reviewed-on: https://go-review.googlesource.com/41610
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Profile labels added by the user using pprof.Do, if present, will
be in a *labelMap stored in the unsafe.Pointer 'tag' field of
the profile map entry. This change extracts the labels from the tag
field and writes them to the profile proto.
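For reference, labels enter the map entry's tag via the standard API
(label values here are illustrative):
    // Samples taken while f runs carry these labels; this CL writes
    // them from the *labelMap tag into the profile proto.
    pprof.Do(ctx, pprof.Labels("worker", "indexer"), func(ctx context.Context) {
        // workload
    })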
Change-Id: Ic40fdc58b66e993ca91d5d5effe0e04ffbb5bc46
Reviewed-on: https://go-review.googlesource.com/39613
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
If g1 sets its labels and then they are copied into a profile buffer
and then g2 reads the profile buffer and inspects the labels,
the race detector must understand that g1's recording of the labels
happens before g2's use of the labels. Make that so.
Fixes race test failure in CL 39613.
Change-Id: Id7cda1c2aac6f8eef49213b5ca414f7154b4acfa
Reviewed-on: https://go-review.googlesource.com/42111
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
Delete old TestRuntimeFunctionTrimming, which is testing a dead API
and is now handled in end-to-end tests.
Change-Id: I64fc2991ed4a7690456356b5f6b546f36935bb67
Reviewed-on: https://go-review.googlesource.com/41815
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
This may improve performance during concurrent access
to the mheap.central array from multiple CPU cores.
Change-Id: I8f48dd2e72aa62e9c32de07ae60fe552d8642782
Reviewed-on: https://go-review.googlesource.com/41550
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Mostly unnecessary *testing.T arguments.
Found with github.com/mvdan/unparam.
Change-Id: Ifb955cb88f2ce8784ee4172f4f94d860fa36ae9a
Reviewed-on: https://go-review.googlesource.com/41691
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reduces cmd/go by 4464 bytes on amd64.
Removes the duplicate detection of AVX support and
presence of Intel processors.
Change-Id: I4670189951a63760fae217708f68d65e94a30dc5
Reviewed-on: https://go-review.googlesource.com/41570
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Implemented a low-level time system for Windows on hardware (or
software) which does not support memory-mapped _KSYSTEM_TIME page
updates. In particular this problem exists on Wine, where
_KSYSTEM_TIME only contains the time at start and is never modified.
On start we try to detect Wine, and if so we fall back to
GetSystemTimeAsFileTime() for the current time and a monotonic
timer based on the QueryPerformanceCounter family of syscalls:
https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408(v=vs.85).aspx
Fixes#18537
Change-Id: I269d22467ed9b0afb62056974d23e731b80c83ed
Reviewed-on: https://go-review.googlesource.com/35710
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The experiment "clobberdead" clobbers all pointer fields that the
compiler thinks are dead, just before and after every safepoint.
Useful for debugging the generation of live pointer bitmaps.
Helped find the following issues:
Update #15936
Update #16026
Update #16095
Update #18860
Change-Id: Id1d12f86845e3d93bae903d968b1eac61fc461f9
Reviewed-on: https://go-review.googlesource.com/23924
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Currently TestSetGCPercent checks that NextGC is within 10 MB of the
expected value. For some reason it's much noisier on some of the
builders. To get these passing again, raise the threshold to 20 MB.
Change-Id: I14e64025660d782d81ff0421c1eb898f416e11fe
Reviewed-on: https://go-review.googlesource.com/41374
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently SetGCPercent forces a GC in order to recompute GC pacing.
Since we can now recompute pacing on the fly using gcSetTriggerRatio,
change SetGCPercent (really runtime.setGCPercent) to go through
gcSetTriggerRatio and not trigger a GC.
Fixes#19076.
Change-Id: Ib30d7ab1bb3b55219535b9f238108f3d45a1b522
Reviewed-on: https://go-review.googlesource.com/39835
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
The current SetGCPercent test is, shall we say, minimal.
Expand it to check that the GC target is actually computed and updated
correctly.
For #19076.
Change-Id: I6e9b2ee0ef369f22f72e43b58d89e9f1e1b73b1b
Reviewed-on: https://go-review.googlesource.com/39834
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This changes gcSetTriggerRatio so it can be called even during
concurrent mark or sweep. In this case, it will adjust the pacing of
the current phase, accounting for progress that has already been made.
To make this work for concurrent sweep, this introduces a "basis" for
the pagesSwept count, much like the basis we just introduced for
heap_live. This lets gcSetTriggerRatio shift the basis to the current
heap_live and pagesSwept and compute a slope from there to completion.
This avoids creating a discontinuity where, if the ratio has
increased, there has to be a flurry of sweep activity to catch up.
Instead, this creates a continuous, piece-wise linear function as
adjustments are made.
For #19076.
Change-Id: Ibcd76aeeb81ff4814b00be7cbd3530b73bbdbba9
Reviewed-on: https://go-review.googlesource.com/39833
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, proportional sweep maintains its own count of how many
bytes have been allocated since the beginning of the sweep cycle so it
can compute how many pages need to be swept for a given allocation.
However, this requires a somewhat complex reimbursement scheme since
proportional sweep must be done before a span is allocated, but we
don't know how many bytes to charge until we've allocated a span. This
means that the allocated byte count used by proportional sweep can go
up and down, which has led to underflow bugs in the past (#18043) and
is going to interfere with adjusting sweep pacing on-the-fly (for #19076).
This approach also means we're maintaining a statistic that is very
closely related to heap_live, but has a different 0 value. This is
particularly confusing because the sweep ratio is computed based on
heap_live, so you have to understand that these two statistics are
very closely related.
Replace all of this and compute the sweep debt directly from the
current value of heap_live. To make this work, we simply save the
value of heap_live when the sweep ratio is computed to use as a
"basis" for later computing the sweep debt.
This eliminates the need for reimbursement as well as the code for
maintaining the sweeper's version of the live heap size.
For #19076.
Coincidentally fixes#18043, since this eliminates sweep reimbursement
entirely.
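A simplified sketch of the resulting debt computation (assuming field
names like sweepHeapLiveBasis and sweepPagesPerByte; the exact code
differs):
    // Bytes allocated since the basis was recorded, and the page
    // count that should have been swept by now.
    live := atomic.Load64(&memstats.heap_live)
    pagesTarget := int64(mheap_.sweepPagesPerByte * float64(live-mheap_.sweepHeapLiveBasis))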
Change-Id: I1f931ddd6e90c901a3972c7506874c899251dc2a
Reviewed-on: https://go-review.googlesource.com/39832
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, the computations that derive controls from the GC trigger
are spread across several parts of the mark termination code.
Consolidate computing the absolute trigger, the heap goal, and sweep
pacing into a single function called at the end of mark termination.
Unlike the code being consolidated, this has to be more careful about
negative gcpercent. Many of the consolidated code paths simply didn't
execute if GC was off.
This is a step toward being able to change the GC trigger ratio in the
middle of concurrent sweeping and marking. For this commit, we try to
stick close to the original structure of the code that's being
consolidated, so it doesn't yet support mid-cycle adjustments.
For #19076.
Change-Id: Ic5335be04b96ad20e70d53d67913a86bd6b31456
Reviewed-on: https://go-review.googlesource.com/39831
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
gcController.triggerRatio is the only field in gcController that
persists across cycles. As global mutable state, the places where it
is written and read are spread out, making it difficult to see that
updates and downstream calculations are done correctly.
Improve this situation by doing two things:
1) Move triggerRatio to memstats so it lives with the other
trigger-related fields and makes gcController entirely transient
state.
2) Commit the new trigger ratio during mark termination when we
compute other next-cycle controls, including the absolute trigger.
This forces us to explicitly thread the new trigger ratio from
gcController.endCycle to mark termination, so we're not just pulling
it out of global state.
Change-Id: I6669932f8039a8c0ef46a3f2a8c537db72e578aa
Reviewed-on: https://go-review.googlesource.com/39830
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
heap_live is updated atomically without locking, so we should also use
atomic loads to read it. Fix the reads of heap_live that happen
outside of STW to be atomic.
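The pattern applied at each such read, as a sketch:
    live := atomic.Load64(&memstats.heap_live) // was a plain read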
Change-Id: Idca9451c348168c2a792a9499af349833a3c333f
Reviewed-on: https://go-review.googlesource.com/41371
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
On 32-bit architectures (or if we fail to map a 64-bit-style arena),
we try to map the heap arena just above the end of the process image.
While we can accept any address, using lower addresses is preferable
because lower addresses cause us to map less of the heap bitmap.
However, if a program is linked against C code that has global
constructors, those constructors may call brk/sbrk to allocate memory
(e.g., many C malloc implementations do this for small allocations).
The brk also starts just above the process image, so this may adjust
the brk past the beginning of where we want to put the heap arena. In
this case, the kernel will pick a different address for the arena and
it will usually be very high (at least, as these things go in a 32-bit
address space).
Fix this by consulting the current value of the brk and using this in
addition to the end of the process image to compute the initial arena
placement.
This is implemented only on Linux currently, since we have no evidence
that it's an issue on any other OS.
Fixes#19831.
Change-Id: Id64b45d08d8c91e4f50d92d0339146250b04f2f8
Reviewed-on: https://go-review.googlesource.com/39810
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TestBreakpoint expects to see "runtime.Breakpoint()" in the stack trace.
If runtime.Breakpoint() is inlined, then the stack trace prints
"runtime.Breakpoint(...)" since the runtime does not have information
about arguments (or lack thereof) to inlined functions. This change
makes the test independent of inlining by looking for the string
"runtime.Breakpoint(". Now TestBreakpoint passes with -l=4.
Change-Id: Ia044a8e8a4de2337cb2b393d6fa78c73a2f25926
Reviewed-on: https://go-review.googlesource.com/40997
Run-TryBot: David Lazar <lazard@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TestBlockProfile matches samples against a regexp that accepts "," in
profile PCs. I suspect this was just a syntax mistake. Remove "," from
the character class.
Change-Id: Idcfc20ed6900075abae08597ba71db559e89b37b
Reviewed-on: https://go-review.googlesource.com/41111
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Peter Weinberger <pjw@google.com>
TestBlockProfile currently requires exactly five PCs in each sample.
With more aggressive inlining there may be fewer, so change this test
to use the same pattern as TestMutexProfile, which accepts one or more
PCs. With this change, this test passes when compiled with -l=4.
Change-Id: I1421a6d56c96b77111bdc671d88723a222672fd6
Reviewed-on: https://go-review.googlesource.com/41110
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Lazar <lazard@golang.org>
CL 40876 changed ExampleFrames so that the output
was stable with and without mid-stack inlining.
However, that change lost some of the
pedagogical and copy/paste value of the example.
It was unclear why both more and i were being tracked,
and whether the 5 in i < 5 is related to len(pc),
and if so, why and how.
This CL rewrites the example with lots more comments,
and such that the core structure more closely matches
normal usage, and such that it is obvious
which lines of code should be deleted when copying.
As a bonus, it also now illustrates Frame.File.
Change-Id: Iab73541dd096657ddf79c5795337e8b596d89740
Reviewed-on: https://go-review.googlesource.com/41136
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The period recorded in CPU profiles is in nanoseconds, but was being
computed incorrectly as hz * 1000. As a result, many absolute times
displayed by pprof were incorrect.
Fix this by computing the period correctly.
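For reference, the arithmetic: at hz samples per second the period is
1e9/hz nanoseconds, not hz*1000.
    period := int64(1e9) / int64(hz) // ns between samples
    // old, incorrect: period := int64(hz) * 1000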
Change-Id: I6fadd6d8ad3e57f31e8cc7a25a24fcaec510d8d4
Reviewed-on: https://go-review.googlesource.com/40995
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Instead of populating the aux symbol
of CALLudiv during rewrite rules,
populate it during genssa.
This simplifies the rewrite rules.
It also removes all remaining calls
to ctxt.Lookup from any rewrite rules.
This is a first step towards removing
ctxt from ssa.Cache entirely,
and also a first step towards converting
the obj.LSym.Version field into a boolean.
It should also speed up compilation.
Also, move func udiv into package runtime.
That's where it is anyway,
and it lets udiv look and act like the rest of
the runtime support functions.
Change-Id: I41462a632c14fdc41f61b08049ec13cd80a87bfe
Reviewed-on: https://go-review.googlesource.com/41191
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Returns at the end of func bodies where the funcs have no return values
are pointless.
Change-Id: I0da5ea78671503e41a9f56dd770df8c919310ce5
Reviewed-on: https://go-review.googlesource.com/41093
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This extends the GCSweepDone event with counts of swept and reclaimed
bytes. These are useful for understanding the duration and
effectiveness of sweep events.
Change-Id: I3c97a4f0f3aad3adbd188adb264859775f54e2df
Reviewed-on: https://go-review.googlesource.com/40811
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Currently, each individual span sweep emits a span to the trace. But
sweeps are generally done in loops until some condition is satisfied,
so this tracing is lower-level than anyone really wants and hides the
fact that no other work is being accomplished between adjacent sweep
events. This is also high overhead: enabling tracing significantly
impacts sweep latency.
Replace this by instead tracing around the sweep loops used for
allocation. This is slightly tricky because sweep loops don't
generally know if any sweeping will happen in them. Hence, we make the
tracing lazy by recording in the P that we would like to start tracing
the sweep *if* one happens, and then only closing the sweep event if
we started it.
This does mean we don't get tracing on every sweep path, which are
legion. However, we get much more informative tracing on the paths
that block allocation, which are the paths that matter.
Change-Id: I73e14fbb250acb0c9d92e3648bddaa5e7d7e271c
Reviewed-on: https://go-review.googlesource.com/40810
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Changes the text to match GOOS, which appends 'and so on' at the
end to avoid restricting the set of possible values.
Change-Id: I54bcde71334202cf701662cdc2582c974ba8bf53
Reviewed-on: https://go-review.googlesource.com/41074
Reviewed-by: Ian Lance Taylor <iant@golang.org>
When allocating a non-small array of buckets for a map,
also preallocate some overflow buckets.
The estimate of the number of overflow buckets
is based on a simulation of putting mid=(low+high)/2 elements
into a map, where low is the minimum number of elements
needed to reach this value of b (according to overLoadFactor),
and high is the maximum number of elements possible
to put in this value of b (according to overLoadFactor).
This estimate is surprisingly reliable and accurate.
The number of overflow buckets needed is quadratic,
for a fixed value of b.
Using this mid estimate means that we will overallocate a few
too many overflow buckets when the actual number of elements is near low,
and significantly underallocate overflow buckets
when the actual number of elements is near high.
The mechanism introduced in this CL can be re-used for
other overflow bucket optimizations.
For example, given an initial size hint,
we could estimate quite precisely the number of overflow buckets.
This is #19931.
We could also change from "non-nil means end-of-list"
to "pointer-to-hmap.buckets means end-of-list",
and then create a linked list of reusable overflow buckets
when they are freed by map growth.
That is #19992.
We could also use a similar mechanism to do bulk allocation
of overflow buckets.
All these uses can co-exist with only the one additional pointer
in mapextra, given a little care.
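A sketch of the allocation shape this enables (the 1/16 factor is
illustrative of the simulation-derived estimate; the runtime's exact
factor and rounding differ):
    func bucketsToAllocate(b uint8) (total uintptr) {
        total = uintptr(1) << b // base bucket array
        if b >= 4 {
            // Small maps rarely overflow; larger ones get extra
            // overflow buckets preallocated contiguously.
            total += uintptr(1) << (b - 4)
        }
        return total
    }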
name old time/op new time/op delta
MapPopulate/1-8 60.1ns ± 2% 60.3ns ± 2% ~ (p=0.278 n=19+20)
MapPopulate/10-8 577ns ± 1% 578ns ± 1% ~ (p=0.140 n=20+20)
MapPopulate/100-8 8.06µs ± 1% 8.19µs ± 1% +1.67% (p=0.000 n=20+20)
MapPopulate/1000-8 104µs ± 1% 104µs ± 1% ~ (p=0.317 n=20+20)
MapPopulate/10000-8 891µs ± 1% 888µs ± 1% ~ (p=0.101 n=19+20)
MapPopulate/100000-8 8.61ms ± 1% 8.58ms ± 0% -0.34% (p=0.009 n=20+17)
name old alloc/op new alloc/op delta
MapPopulate/1-8 0.00B 0.00B ~ (all equal)
MapPopulate/10-8 179B ± 0% 179B ± 0% ~ (all equal)
MapPopulate/100-8 3.33kB ± 0% 3.38kB ± 0% +1.48% (p=0.000 n=20+16)
MapPopulate/1000-8 55.5kB ± 0% 53.4kB ± 0% -3.84% (p=0.000 n=19+20)
MapPopulate/10000-8 432kB ± 0% 428kB ± 0% -1.06% (p=0.000 n=19+20)
MapPopulate/100000-8 3.65MB ± 0% 3.62MB ± 0% -0.70% (p=0.000 n=20+20)
name old allocs/op new allocs/op delta
MapPopulate/1-8 0.00 0.00 ~ (all equal)
MapPopulate/10-8 1.00 ± 0% 1.00 ± 0% ~ (all equal)
MapPopulate/100-8 18.0 ± 0% 17.0 ± 0% -5.56% (p=0.000 n=20+20)
MapPopulate/1000-8 96.0 ± 0% 72.6 ± 1% -24.38% (p=0.000 n=20+20)
MapPopulate/10000-8 625 ± 0% 319 ± 0% -48.86% (p=0.000 n=20+20)
MapPopulate/100000-8 6.23k ± 0% 4.00k ± 0% -35.79% (p=0.000 n=20+20)
Change-Id: I01f41cb1374bdb99ccedbc00d04fb9ae43daa204
Reviewed-on: https://go-review.googlesource.com/40979
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Any change to how we allocate overflow buckets
will require some extra hmap storage,
but we don't want hmap to grow,
particular as small maps usually don't need overflow buckets.
This CL converts the existing hmap overflow field,
which is usually used for pointer-free maps,
into a generic extra field.
This extra field can be used to hold data that is optional.
If it is valuable enough to have special
handling of overflow buckets, which are medium-sized,
it is valuable enough to pay an extra alloc and two extra words for it.
Adding fields to extra would entail adding overhead to pointer-free maps;
any mapextra fields added would need to be weighed against that.
This CL is just rearrangement, though.
Updates #19931
Updates #19992
Change-Id: If8537a206905b9d4dc6cd9d886184ece671b3f80
Reviewed-on: https://go-review.googlesource.com/40976
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This simplifies the code, as well as providing
a single place to modify to change the
allocation of new overflow buckets.
Updates #19931
Updates #19992
Change-Id: I77070619f5c8fe449bbc35278278bca5eda780f2
Reviewed-on: https://go-review.googlesource.com/40975
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Only the noinline pragma on testCallerFoo is needed to pass the test,
but the second pragma makes the test robust to future changes to the
inliner.
Change-Id: I80b384380c598f52e0382f53b59bb47ff196363d
Reviewed-on: https://go-review.googlesource.com/40877
Run-TryBot: David Lazar <lazard@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Otherwise, with -l=4, runtime.Callers gets inlined and the example
prints too many frames. Now the example passes with -l=4.
Change-Id: I9e420af9371724ac3ec89efafd76a658cf82bb4a
Reviewed-on: https://go-review.googlesource.com/40876
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This rewrites runtime.Caller in terms of stackExpander, which already
handles inlined frames and partially skipped frames. This also has the
effect of making runtime.Caller understand cgo frames if there is a cgo
symbolizer.
Updates #19348.
Change-Id: Icdf4df921aab5aa394d4d92e3becc4dd169c9a6e
Reviewed-on: https://go-review.googlesource.com/40270
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The Frames API forces the PC slice to escape to the heap because it
stores it in the Frames object. However, we'd like to use this API for
call stack expansion internally in the runtime in places where it
would be very good to avoid heap allocation.
This commit makes this possible by pulling the bulk of the Frames
implementation into an internal frameExpander API. The key difference
between these APIs is that the frameExpander does not hold the PC
slice; instead, the caller is responsible for threading the PC slice
through the frameExpander API calls. This makes it possible to keep
the PC slice on the stack. The Frames API then becomes a thin shim
around the frameExpander that keeps the PC slice in the Frames object.
Change-Id: If6b2d0b9132a2a905a0cf5deced9feddce76fc0e
Reviewed-on: https://go-review.googlesource.com/40610
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Lazar <lazard@golang.org>
While debugging a recent regression it was discovered that
the assembler for ppc64x was not always generating the correct
instruction for DS form loads and stores. When an instruction
is DS form then the offset must be a multiple of 4, and if it
isn't then bits outside the offset field were being incorrectly
set resulting in unexpected and incorrect instructions.
This change adds a check to determine when the opcode is DS form
and then verifies that the offset is a multiple of 4 before
generating the instruction, otherwise logs an error.
This also changes a few asm files that were using unaligned offsets
for DS form loads and stores. In the runtime package these were
instructions intended to cause a crash so using aligned or unaligned
offsets doesn't change that behavior.
Change-Id: Ie3a7e1e65dcc9933b54de7a46a054da8459cb56f
Reviewed-on: https://go-review.googlesource.com/40476
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Now the runtime/trace tests pass with -l=4.
This also gets rid of the frames cache for multiple reasons:
1) The frames cache was used to avoid repeated calls to funcname and
funcline. Now these calls happen inside the CallersFrames iterator.
2) Maintaining a frames cache is harder: map[uintptr]traceFrame
doesn't work since each PC can map to multiple traceFrames.
3) It's not clear that the cache is important.
Change-Id: I2914ac0b3ba08e39b60149d99a98f9f532b35bbb
Reviewed-on: https://go-review.googlesource.com/40591
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This extends the sweeper to free workbufs back to the heap between GC
cycles, allowing this memory to be reused for GC'd allocations or
eventually returned to the OS.
This helps for applications that have high peak heap usage relative to
their regular heap usage (for example, a high-memory initialization
phase). Workbuf memory is roughly proportional to heap size and since
we currently never free workbufs, it's proportional to *peak* heap
size. By freeing workbufs, we can release and reuse this memory for
other purposes when the heap shrinks.
This is somewhat complicated because this costs ~1–2 µs per workbuf
span, so for large heaps it's too expensive to just do synchronously
after mark termination between starting the world and dropping the
worldsema. Hence, we do it asynchronously in the sweeper. This adds a
list of "free" workbuf spans that can be returned to the heap. GC
moves all workbuf spans to this list after mark termination and the
background sweeper drains this list back to the heap. If the sweeper
doesn't finish, that's fine, since getempty can directly reuse any
remaining spans to allocate more workbufs.
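A sketch of the sweeper's drain step, using the names from this
description (locking and batching elided):
    // Return free workbuf spans to the heap; getempty can still pull
    // spans from this list directly if the sweeper hasn't finished.
    for !work.wbufSpans.free.isEmpty() {
        s := work.wbufSpans.free.first()
        work.wbufSpans.free.remove(s)
        mheap_.freeManual(s, &memstats.gc_sys)
    }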
Performance impact is negligible. On the x/benchmarks, this reduces
GC-bytes-from-system by 6–11%.
Fixes#19325.
Change-Id: Icb92da2196f0c39ee984faf92d52f29fd9ded7a8
Reviewed-on: https://go-review.googlesource.com/38582
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently the runtime allocates workbufs from persistent memory, which
means they can never be freed.
Switch to allocating them from manually-managed heap spans. This
doesn't free them yet, but it puts us in a position to do so.
For #19325.
Change-Id: I94b2512a2f2bbbb456cd9347761b9412e80d2da9
Reviewed-on: https://go-review.googlesource.com/38581
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This introduces a new type, *gcBits, to use for alloc/mark bitmap
allocations instead of *uint8. This type is marked go:notinheap, so
uses of it correctly eliminate write barriers. Since we now have a
type, this also extracts some common operations to methods both for
convenience and to avoid (*uint8) casts at most use sites.
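The shape of the new type, as a sketch (methods simplified, with
addb-style pointer arithmetic written out via unsafe):
    //go:notinheap
    type gcBits uint8

    // bytep returns a pointer to the n'th byte of b.
    func (b *gcBits) bytep(n uintptr) *uint8 {
        return (*uint8)(unsafe.Pointer(uintptr(unsafe.Pointer(b)) + n))
    }

    // bitp returns a pointer to the byte holding bit n and a mask
    // selecting that bit within the byte.
    func (b *gcBits) bitp(n uintptr) (bytep *uint8, mask uint8) {
        return b.bytep(n / 8), 1 << (n % 8)
    }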
For #19325.
Change-Id: Id51f734fb2e96b8b7715caa348c8dcd4aef0696a
Reviewed-on: https://go-review.googlesource.com/38580
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This clarifies that the gcBits type is actually an arena of gcBits and
will let us introduce a new gcBits type representing a single
mark/alloc bitmap allocated from the arena.
For #19325.
Change-Id: Idedf76d202d9174a17c61bcca9d5539e042e2445
Reviewed-on: https://go-review.googlesource.com/38579
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, manually-managed spans are included in memstats.heap_inuse
and memstats.heap_sys, but when we export these stats to the user, we
subtract out how much has been allocated for stack spans from both.
This works for now because stacks are the only manually-managed spans
we have.
However, we're about to use manually-managed spans for more things
that don't necessarily have obvious stats we can use to adjust the
user-presented numbers. Prepare for this by changing the accounting so
manually-managed spans don't count toward heap_inuse or heap_sys. This
makes these fields align with the fields presented to the user and
means we don't have to track more statistics just so we can adjust
these statistics.
For #19325.
Change-Id: I5cb35527fd65587ff23339276ba2c3969e2ad98f
Reviewed-on: https://go-review.googlesource.com/38577
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
We're going to start using manually-managed spans for GC workbufs, so
rename the allocate/free methods and pass in a pointer to the stats to
use instead of using the stack stats directly.
For #19325.
Change-Id: I37df0147ae5a8e1f3cb37d59c8e57a1fcc6f2980
Reviewed-on: https://go-review.googlesource.com/38576
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
We're going to use this free list for other types of manually-managed
memory in the heap.
For #19325.
Change-Id: Ib7e682295133eabfddf3a84f44db43d937bfdd9c
Reviewed-on: https://go-review.googlesource.com/38575
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
We're about to generalize _MSpanStack to be used for other forms of
in-heap manual memory management in the runtime. This is an automated
rename of _MSpanStack to _MSpanManual plus some comment fix-ups.
For #19325.
Change-Id: I1e20a57bb3b87a0d324382f92a3e294ffc767395
Reviewed-on: https://go-review.googlesource.com/38574
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently CallersFrames expands each PC to a slice of Frames and then
iteratively returns those Frames. However, this makes it very
difficult to avoid heap allocation: either the Frames slice will be
heap allocated, or, if it uses internal scratch space for small slices
(as it currently does), the Frames object itself has to be heap
allocated.
Fix this, at least in the common case, by expanding each PC
iteratively. We introduce a new pcExpander type that's responsible for
expanding a single PC. This maintains state from one Frame to the next
in the same PC. Frames then becomes a wrapper around this responsible
for feeding it the next PC when the pcExpander runs out of frames for
the current PC.
This makes it possible to stack-allocate a Frames object, which will
make it possible to use this API for PC expansion from within the
runtime itself.
Change-Id: I993463945ab574557cf1d6bedbe79ce7e9cbbdcd
Reviewed-on: https://go-review.googlesource.com/40434
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Lazar <lazard@golang.org>
Prevent a crash if the same type in two plugins had a recursive
definition, either by referring to a pointer to itself or a map existing
with the type as a value type (which creates a recursive definition
through the overflow bucket type).
Fixes#19258
Change-Id: Iac1cbda4c5b6e8edd5e6859a4d5da3bad539a9c6
Reviewed-on: https://go-review.googlesource.com/40292
Run-TryBot: Todd Neal <todd@tneal.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
This was unintentionally emptied rather than removed in 9417c022.
Change-Id: Ie6fdcf7ef55e58f12e2a2750ab448aa2d9f94d15
Reviewed-on: https://go-review.googlesource.com/40413
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
OpenBSD 6.0 and later have support for PT_TLS in ld.so(1). Now that OpenBSD
6.1 has been released, OpenBSD 5.9 is no longer officially supported and Go
can start generating PT_TLS for OpenBSD cgo binaries. This also allows us
to remove the workarounds in the OpenBSD cgo runtime.
This change also removes the environ and progname exports - these are now
provided directly by ld.so(1) itself.
Fixes#19932
Change-Id: I42e75ef9feb5dcd4696add5233497e3cbc48ad52
Reviewed-on: https://go-review.googlesource.com/40331
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The hardware divider is an optional component of ARMv7. This patch
detects whether it is available in runtime and use it or not.
1. The hardware divider is detected at startup and a flag is set/cleared
according to a particular bit of runtime.hwcap.
2. Each call of runtime.udiv will check this flag and decide whether to
use the hardware division instruction.
A rough test shows the performance improves 40-50% for ARMv7. And
the compatibility of ARMv5/v6 is not broken.
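A sketch of the detection (bit value per the Linux ARM hwcap
definitions; the runtime's actual plumbing differs):
    const _HWCAP_IDIVA = 1 << 17 // ARM integer divide in ARM state
    hwdiv := hwcap&_HWCAP_IDIVA != 0 // checked by runtime.udiv before dividing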
Fixes#19118
Change-Id: Ic586bc9659ebc169553ca2004d2bdb721df823ac
Reviewed-on: https://go-review.googlesource.com/37496
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Changing mheap_.arena_used requires several steps that are currently
repeated multiple times in mheap_.sysAlloc. Consolidate these into a
single function.
In the future, this will also make it easier to add other auxiliary VM
structures.
Change-Id: Ie68837d2612e1f4ba4904acb1b6b832b15431d56
Reviewed-on: https://go-review.googlesource.com/40151
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The runtime.writeBarrier variable tries to be helpful by telling you
that the compiler also knows about this variable, which you could
probably guess, but doesn't say how the compiler knows about it. In
fact, the compiler has a complete copy in builtin/runtime.go that
needs to be kept in sync. Say so.
Change-Id: Ia7fb0c591cb6f9b8230decce01008b417dfcec89
Reviewed-on: https://go-review.googlesource.com/40150
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
They are dead code already, but the verifier is still not happy.
Don't assemble them at all.
Looks like it has been like that for a long time. I don't know why it
was OK. Maybe the verifier is now more picky?
Fixes#19884.
Change-Id: Ib806fb73ca469789dec56f52d484cf8baf7a245c
Reviewed-on: https://go-review.googlesource.com/40111
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Dave Cheney <dave@cheney.net>