One of my earlier versions of finer-grained select locking
failed on this test. If you just naively lock and check channels
one-by-one, it is possible that you skip over ready channels.
Consider that initially c1 is ready and c2 is not. Select checks c2.
Then another goroutine makes c1 not ready and c2 ready (in that order).
Then select checks c1, concludes that no channels are ready, and
executes the default case. But there was no point in time at which
no channel was ready, so the default case must not have been executed.
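A runnable sketch of the scenario (channel setup assumed; the actual test in this CL is more elaborate). A helper goroutine makes c2 ready before making c1 not ready, so at least one channel is ready at every instant, and a correct select must never take the default case:

    c1 := make(chan int, 1)
    c2 := make(chan int, 1)
    c1 <- 0 // initially c1 is ready and c2 is not
    go func() {
        c2 <- 0 // c2 becomes ready...
        <-c1    // ...then c1 becomes not ready
    }()
    select {
    case <-c1:
    case <-c2:
    default:
        panic("impossible: some channel was ready at every instant")
    }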
Change-Id: I3594bf1f36cfb120be65e2474794f0562aebcbbd
Reviewed-on: https://go-review.googlesource.com/7550
Reviewed-by: Russ Cox <rsc@golang.org>
The value in question is really a bit pattern
(a pointer with extra bits thrown in),
so treat it as a uintptr instead, avoiding the
generation of a write barrier when there
might not be a p.
Also add the obligatory //go:nowritebarrier.
Change-Id: I4ea097945dd7093a140f4740bcadca3ce7191971
Reviewed-on: https://go-review.googlesource.com/7667
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The GC assumes that there will be no asynchronous write barriers when
the world is stopped. This keeps the synchronization between write
barriers and the GC simple. However, currently, there are a few places
in runtime code where this assumption does not hold.
The GC stops the world by collecting all Ps, which stops all user Go
code, but small parts of the runtime can run without a P. For example,
the code that releases a P must still deschedule its G onto a runnable
queue before stopping. Similarly, when a G returns from a long-running
syscall, it must run code to reacquire a P.
Currently, this code can contain write barriers. This can lead to the
GC collecting reachable objects if something like the following
sequence of events happens:
1. GC stops the world by collecting all Ps.
2. G #1 returns from a syscall (for example), tries to install a
pointer to object X, and calls greyobject on X.
3. greyobject on G #1 marks X, but does not yet add it to a work
buffer. At this point, X is effectively black, not grey, even though
it may point to white objects.
4. GC reaches X through some other path and calls greyobject on X, but
greyobject does nothing because X is already marked.
5. GC completes.
6. greyobject on G #1 adds X to a work buffer, but it's too late.
7. Objects that were reachable only through X are incorrectly collected.
To fix this, we check the invariant that no asynchronous write
barriers happen when the world is stopped by checking that write
barriers always have a P, and modify all currently known sources of
these writes to disable the write barrier. In all modified cases this
is safe because the object in question will always be reachable via
some other path.
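A hedged sketch of the invariant check (helper and field names assumed; the real check lives on the runtime's write barrier path):

    // Sketch: a write barrier executing without a P would be an
    // asynchronous write barrier, which the GC cannot tolerate
    // while the world is stopped.
    //go:nowritebarrier
    func checkWriteBarrier() {
        if getg().m.p == nil { // hypothetical field layout
            throw("write barrier found no P; world may be stopped")
        }
    }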
Some of the trace code was turned off, in particular the
code that traces returning from a syscall. The GC assumes
that as far as the heap is concerned the thread is stopped
when it is in a syscall. Upon returning the trace code
must not do any heap writes for the same reasons discussed
above.
Fixes #10098
Fixes #9953
Fixes #9951
Fixes #9884
May relate to #9610 and #9771.
Change-Id: Ic2e70b7caffa053e56156838eb8d89503e3c0c8a
Reviewed-on: https://go-review.googlesource.com/7504
Reviewed-by: Austin Clements <austin@google.com>
Some versions of libc, in this case Android's bionic, point environ
directly at the envp memory.
https://android.googlesource.com/platform/bionic/+/master/libc/bionic/libc_init_common.cpp#104
The Go runtime does something surprisingly similar, building the
runtime's envs []string using gostringnocopy. Both libc and the Go
runtime reusing the same memory interacts badly. When syscall.Setenv uses cgo
to call setenv(3), C modifies the underlying memory of a Go string.
This manifests on android/arm. With GOROOT=/data/local/tmp, a
runtime test calls syscall.Setenv("GOROOT", "/os"), resulting in
runtime.GOROOT()=="/os\x00a/local/tmp/goroot".
Avoid this by copying environment string memory into Go.
Covered by runtime.TestFixedGOROOT on android/arm.
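A minimal sketch of the fix (helper shape assumed; gostringnocopy aliases the C memory, so the copy must be forced):

    func copyenvs(envp []*byte) []string {
        envs := make([]string, len(envp))
        for i, p := range envp {
            s := gostringnocopy(p)      // aliases libc-owned bytes
            envs[i] = string([]byte(s)) // copy into Go-managed memory
        }
        return envs
    }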
Change-Id: Id0cf9553969f587addd462f2239dafca1cf371fa
Reviewed-on: https://go-review.googlesource.com/7663
Reviewed-by: Keith Randall <khr@golang.org>
Channels and sync.Mutexes allow another goroutine to acquire a resource
ahead of an unblocked goroutine. This is good for performance, but
leads to futile wakeups (the unblocked goroutine needs to block again).
Futile wakeups caused user confusion during the very first evaluation
of tracing functionality on a real server (a goroutine appears to acquire a mutex
in a loop, while there is no loop in the user code).
This change detects futile wakeups on channels and emits a special event
to denote the fact. Later, the parser finds entire wakeup sequences
(unblock->start->block) and removes them.
sync.Mutex will be supported in a separate change.
Change-Id: Iaaaee9d5c0921afc62b449a97447445030ac19d3
Reviewed-on: https://go-review.googlesource.com/7380
Reviewed-by: Keith Randall <khr@golang.org>
The Go builders (and standard development cycle) for programs on iOS
require running the programs under lldb. Unfortunately lldb intercepts
SIGSEGV and will not give it back.
https://llvm.org/bugs/show_bug.cgi?id=22868
We get around this by never letting lldb see the SIGSEGV. On darwin,
Unix signals are emulated on top of mach exceptions. The debugger
registers a task-level mach exception handler. We register a
thread-level exception handler which acts as a faux signal handler.
The thread-level handler gets precedence over the task-level handler,
so we can turn the exception EXC_BAD_ACCESS into a panic before lldb
can see it.
Fixes #10043
Change-Id: I64d7c310dfa7ecf60eb1e59f094966520d473335
Reviewed-on: https://go-review.googlesource.com/7072
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
When checkmark fails, greyobject dumps both the object that pointed to
the unmarked object and the unmarked object. This code cluttered up
greyobject, was copy-pasted for the two objects, and the copy for
dumping the unmarked object was not entirely correct.
Extract object dumping out to a new function. This declutters
greyobject and fixes the bugs in dumping the unmarked object. The new
function is slightly cleaned up from the original code to have more
natural control flow and shows a marker on the field in the base
object that points to the unmarked object to make it easy to find.
Change-Id: Ib51318a943f50b0b99995f0941d03ee8876b9fcf
Reviewed-on: https://go-review.googlesource.com/7506
Reviewed-by: Rick Hudson <rlh@golang.org>
scanobject no longer returns the new wbuf.
Change-Id: I0da335ae5cd7ef7ea0e0fa965cf0e9f3a650d0e6
Reviewed-on: https://go-review.googlesource.com/7505
Reviewed-by: Rick Hudson <rlh@golang.org>
DragonFlyBSD dropped support for i386 in 4.0 and there is no longer a
dragonfly/386 - as such, remove the Go port.
Fixes #8951
Fixes #7580
Fixes #7421
Change-Id: I69022ab2262132e8f97153f14dc8c37c98527008
Reviewed-on: https://go-review.googlesource.com/7543
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Joel Sing <jsing@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The kern.rthreads sysctl has not existed for a long time - there is no way to
disable rthreads and __tfork no longer returns ENOTSUP.
Change-Id: Ia50ff01ac86ea83358e72b8f45f7818aaec1e4b1
Reviewed-on: https://go-review.googlesource.com/7490
Reviewed-by: Minux Ma <minux@golang.org>
Fixes #10135.
Change-Id: Ic4c5ab15bcb7b9c3fcc685a788d3b59c60c26e1e
Signed-off-by: Shenghou Ma <minux@golang.org>
Reviewed-on: https://go-review.googlesource.com/7400
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Everything has moved to Go, but comments still refer to .c/.h files.
Fix all of those up, at least for these three directories.
Fixes #10138
Change-Id: Ie5efe89b247841e0b3f82aac5256b2c606ef67dc
Reviewed-on: https://go-review.googlesource.com/7431
Reviewed-by: Russ Cox <rsc@golang.org>
This allows testing the goroutine analysis code in runtime/pprof tests.
Also fix a nil-deref crash in the goroutine analysis code that surfaced in the runtime/pprof tests.
Change-Id: Id7884aa29f7fe4a8d7042482a86fe434e030461e
Reviewed-on: https://go-review.googlesource.com/7301
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
Augment ProcStart events with OS thread id.
This helps in scheduler locality analysis.
Change-Id: I93fea75d3072cf68de66110d0b59d07101badcb5
Reviewed-on: https://go-review.googlesource.com/7302
Reviewed-by: Keith Randall <khr@golang.org>
Some of the trace stacks are OS-dependent due to OS-specific code
in the net package. Check these stacks only on a subset of OSes.
Change-Id: If95e4485839f4120fd6395725374c3a2f8706dfc
Reviewed-on: https://go-review.googlesource.com/7300
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Even though the world is stopped, the GC may do pointer
writes that need to be protected by write barriers.
This means that the write barrier must be on
continuously from the time the mark phase starts until
the mark termination phase ends. Checks were added to
ensure that no allocation happens during a GC.
Hoist the logic that clears pools to the start of the GC
so that the memory can be reclaimed during this GC cycle.
Change-Id: I9d1551ac5db9bac7bac0cb5370d5b2b19a9e6a52
Reviewed-on: https://go-review.googlesource.com/6990
Reviewed-by: Austin Clements <austin@google.com>
Strip uninteresting bottom and top frames from trace stacks.
This makes both binary and json trace files smaller,
and also makes stacks shorter and more readable in the viewer.
Change-Id: Ib9c80ccc280504f0e235f867f53f1d2652c41583
Reviewed-on: https://go-review.googlesource.com/5523
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
Removes a potential data race between os.Setenv and runtime.GOROOT,
along with a bug where os.Setenv would only sometimes change the
value of runtime.GOROOT.
Change-Id: I7d2a905115c667ea6e73f349f3784a1d3e8f810d
Reviewed-on: https://go-review.googlesource.com/6611
Reviewed-by: Keith Randall <khr@golang.org>
Also fixed a stack corruption bug for nacl/amd64p32.
Change-Id: I64b821b16999c296a159137d971af3870053c621
Signed-off-by: Shenghou Ma <minux@golang.org>
Reviewed-on: https://go-review.googlesource.com/7073
Reviewed-by: Dave Cheney <dave@cheney.net>
Starting it lazily causes a memory allocation (for the goroutine) during GC.
This is the first use of channels in the runtime implementation.
Change-Id: I9cd24dcadbbf0ee5070ee6d0ed7ea415504f316c
Reviewed-on: https://go-review.googlesource.com/6960
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
I asked for this in CL 3742 and it was ignored.
Change-Id: I30ad05f87c7d9eccb11df7e19288e3ed2c7e2e3f
Reviewed-on: https://go-review.googlesource.com/6930
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
For OSes that use ELF on Intel, 2*Ptrsize bytes are reserved for TLS.
But only one pointer (g) has been stored in the TLS for a while now.
So we can set it to just Ptrsize, which happily matches what happens
when externally linking.
Fixes #9913
Change-Id: Ic816369d3a55a8cdcc23be349b1a1791d53f5f81
Reviewed-on: https://go-review.googlesource.com/6584
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This is an experiment to see if removing the boundary bit logic will
lead to fewer cache misses and improved performance. Instead of using
boundary bits we use the span information to get element size and use
some bit whacking to get the boundary without having to touch the
random heap bits which cause cache misses.
Furthermore once the boundary bit is removed we can either use that
bit for a simpler checkmark routine or we can reduce the number of
bits in the GC bitmap to 2 bits per pointer-sized word. For example
the 2 bits at the boundary can be used for marking and pointer/scalar
differentiation. Since we don't need the mark bit except at the
boundary nibble of the object, other nibbles can use this bit
as a noscan bit to indicate that there are no more pointers in
the object.
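A sketch of the bit whacking described above (names assumed): derive an object's boundary from the span's element size rather than from per-word boundary bits:

    // Round an interior pointer p down to the start of its object
    // using only span metadata; no heap bitmap access required.
    func objectBase(p, spanStart, elemSize uintptr) uintptr {
        return spanStart + (p-spanStart)/elemSize*elemSize
    }

When elemSize is a power of two, the division reduces to a mask, so the common case stays cheap.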
Currently the change included in this CL slows down the garbage
benchmark. With the boundary bits, garbage gives 5.78; without them
(this CL) it gives 5.88, which is a 2% slowdown.
Change-Id: Id68f831ad668176f7dc9f7b57b339e4ebb6dc4c2
Reviewed-on: https://go-review.googlesource.com/6665
Reviewed-by: Austin Clements <austin@google.com>
Gc already calculates n as an int, so converting to int64 to call
growslice doesn't serve any purpose except to emit slightly larger
code on 32-bit platforms. Passing n as an int shrinks godoc's text
segment by 8kB (9472633 => 9464133) when building for ARM.
Change-Id: Ief9492c21d01afcb624d3f2a484df741450b788d
Reviewed-on: https://go-review.googlesource.com/6231
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The unbounded list-based defer pool can grow infinitely.
This can happen if a goroutine routinely allocates a defer;
then blocks on one P; and is then unblocked, scheduled, and
frees the defer on another P.
The scenario was reported on the golang-nuts list.
We've been here several times. Any unbounded local caches
are bad and grow to infinite size. This change introduces
a central defer pool; local pools become fixed-size
with the only purpose of amortizing accesses to the
central pool.
Freedefer now executes on system stack to not consume
nosplit stack space.
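A sketch of the resulting shape (sizes and names assumed): a fixed-size per-P pool in front of a mutex-protected central pool, so no cache grows without bound:

    package pool

    import "sync"

    type deferRecord struct{ next *deferRecord }

    // central is shared by all Ps and protected by a mutex.
    var central struct {
        sync.Mutex
        free *deferRecord
    }

    // localPool is per-P and fixed-size, so it cannot grow without bound;
    // it exists only to amortize accesses to the central pool.
    type localPool struct {
        buf [32]*deferRecord
        n   int
    }

    func (p *localPool) put(d *deferRecord) {
        if p.n < len(p.buf) {
            p.buf[p.n] = d
            p.n++
            return
        }
        central.Lock() // overflow: hand off to the central pool
        d.next = central.free
        central.free = d
        central.Unlock()
    }

    func (p *localPool) get() *deferRecord {
        if p.n > 0 {
            p.n--
            return p.buf[p.n]
        }
        central.Lock() // empty: refill from the central pool
        d := central.free
        if d != nil {
            central.free = d.next
        }
        central.Unlock()
        return d
    }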
Change-Id: I1a27695838409259d1586a0adfa9f92bccf7ceba
Reviewed-on: https://go-review.googlesource.com/3967
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
The unbounded list-based sudog cache can grow infinitely.
This can happen if a goroutine is routinely blocked on one P
and then unblocked and scheduled on another P.
The scenario was reported on the golang-nuts list.
We've been here several times. Any unbounded local caches
are bad and grow to infinite size. This change introduces
a central sudog cache; local caches become fixed-size
with the only purpose of amortizing accesses to the
central cache.
The change required moving the sudog cache from mcache to P,
because mcache is not scanned by the GC.
Change-Id: I3bb7b14710354c026dcba28b3d3c8936a8db4e90
Reviewed-on: https://go-review.googlesource.com/3742
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
Error detection code copied from syscall, where presumably
we actually do it right.
Note that we throw the errno away. The runtime doesn't use it.
Fixes #10052
Change-Id: I8de77dda6bf287276b137646c26b84fa61554ec8
Reviewed-on: https://go-review.googlesource.com/6571
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
OpenBSD's sigprocmask system call passes the signal mask by value
rather than reference, so vars are unnecessary. Additionally,
declaring "var sigset_all = ^sigset_none" means sigset_all won't be
initialized until runtime_init is called, but the first call to
newosproc happens before then.
I've witnessed Go processes on OpenBSD crash from receiving SIGWINCH
on the newly created OS thread before it finished initializing.
Change-Id: I16995e7e466d5e7e50bcaa7d9490173789a0b4cc
Reviewed-on: https://go-review.googlesource.com/6440
Reviewed-by: Mikio Hara <mikioh.mikioh@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Move type definitions from chan1.go to chan.go and select.go.
Remove underscores from names.
Make c.buf unsafe.Pointer instead of *uint8.
Change-Id: I75cf8385bdb9f79eb5a7f7ad319495abbacbe942
Reviewed-on: https://go-review.googlesource.com/4900
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
This fixes runtime's TestBreakpoint on ppc64:
the Breakpoint frame was not showing up in the trace.
It seems like f.frame should be either the frame size
including the saved LR (if any) or the frame size
not including the saved LR.
On ppc64, f.frame is the frame size not including the saved LR.
On arm, f.frame is the frame size not including the saved LR,
except when that would be -4; then f.frame is 0 instead.
The code here in the runtime expects that f.frame is the frame
size including the saved LR.
Since all three disagree and nothing else uses f.frame anymore,
stop using it here too. Use funcspdelta, which tells us the exact
difference between the FP and SP. If it's zero, LR has not been
saved yet, so the one saved for sigpanic should be recorded.
This fixes TestBreakpoint on both ppc64 and ppc64le.
I don't really understand how it ever worked there.
Change-Id: I2d2c580d5c0252cc8471e828980aeedcab76858d
Reviewed-on: https://go-review.googlesource.com/6430
Reviewed-by: Minux Ma <minux@golang.org>
Plan 9 provides a /dev/random device to return a
stream of random numbers. However, the method used
to generate random numbers on Plan 9 is slow and
reading from /dev/random may block.
We don't want our Go programs to be significantly
slowed down just to slightly improve the distribution
of hash values.
So, we do the same thing as NaCl and rely exclusively
on extendRandom to generate pseudo-random numbers.
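A sketch of the idea behind extendRandom (user-level stand-in with an assumed mixing function; the runtime version mixes with its internal hash, and "time" is assumed imported):

    // Extend the n seed bytes in r[:n] to fill all of r with
    // pseudo-random bytes derived from the seed and the clock.
    func extendRandom(r []byte, n int) {
        h := uint64(time.Now().UnixNano())
        for i := 0; i < n; i++ {
            h = h*1099511628211 + uint64(r[i]) // FNV-style mixing (stand-in)
        }
        for ; n < len(r); n++ {
            h = h*1099511628211 + uint64(n)
            r[n] = byte(h >> 32)
        }
    }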
Fixes #10028.
Change-Id: I7e11a9b109c22f23608eb09c406b7c3dba31f26a
Reviewed-on: https://go-review.googlesource.com/6386
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
issue #10017: TestGdbPython 'print mapvar' is reported to fail on ppc64.
issue #10002: TestGdbPython 'print mapvar' is reported to fail on arm hardfloat.
The testcase now uses a plain line number in main. Unwinding issues are
unrelated to the GDB map prettyprinter feature.
Remove arch-specific t.Skip()s from those two issues.
Fixes #10017
Fixes #10002
Change-Id: I9d50ffe2f3eb7bf65dd17c8c76a2677571de68ba
Reviewed-on: https://go-review.googlesource.com/6267
Reviewed-by: Minux Ma <minux@golang.org>
mv cmd/new5l cmd/5l and so on.
Minimal changes to cmd/dist and cmd/go to keep things building.
More can be deleted in followup CLs.
Change-Id: I1449eca7654ce2580d1f413a56dc4a75f3d4618b
Reviewed-on: https://go-review.googlesource.com/6361
Reviewed-by: Rob Pike <r@golang.org>
We used to not call traceback from goexit1.
But now the tracer does it and crashes on amd64p32:
runtime: unexpected return pc for runtime.getg called from 0x108a4240
goroutine 18 [runnable, locked to thread]:
runtime.traceGoEnd()
src/runtime/trace.go:758 fp=0x10818fe0 sp=0x10818fdc
runtime.goexit1()
src/runtime/proc1.go:1540 +0x20 fp=0x10818fe8 sp=0x10818fe0
runtime.getg(0x0)
src/runtime/asm_386.s:2414 fp=0x10818fec sp=0x10818fe8
created by runtime/pprof_test.TestTraceStress
src/runtime/pprof/trace_test.go:123 +0x500
Return PC from goexit1 points right after goexit (+0x6).
It happens to work most of the time somehow.
This change fixes traceback from goexit1 by adding an additional NOP to goexit.
Fixes #9931
Change-Id: Ied25240a181b0a2d7bc98127b3ed9068e9a1a13e
Reviewed-on: https://go-review.googlesource.com/5460
Reviewed-by: Russ Cox <rsc@golang.org>
This is to be used by an lldb script inside go_darwin_arm_exec to pause
the execution of tests on iOS so the working directory can be adjusted
into something resembling a GOROOT.
Change-Id: I69ea2d4d871800ae56634b23ffa48583559ddbc6
Reviewed-on: https://go-review.googlesource.com/6363
Reviewed-by: Minux Ma <minux@golang.org>
Change-Id: I9b08b74214e5a41a7e98866a993b038030a4c073
Reviewed-on: https://go-review.googlesource.com/6251
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Previously, the typeDead check in greyobject was under a separate
!useCheckmark conditional. Put it with the rest of the !useCheckmark
code. Also move a comment about atomic update of the marked bit to
where we actually do that update now.
Change-Id: Ief5f16401a25739ad57d959607b8d81ffe0bc211
Reviewed-on: https://go-review.googlesource.com/6271
Reviewed-by: Rick Hudson <rlh@golang.org>
Change-Id: I1bb0b8b11e8c7686b85657050fd7cf926afe4d29
Reviewed-on: https://go-review.googlesource.com/6200
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Previously, the memory allocator on Plan 9 did
not free memory properly. It was only able to
free the last allocated block.
This change implements a variant of the
Kernighan & Ritchie memory allocator with
coalescing and splitting.
The most notable differences are:
- no header prefixes the allocated blocks, since
the size is always specified when calling sysFree,
- the free list is nil-terminated instead of circular.
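A sketch of the free-list shape (field names assumed; backward coalescing elided):

    type block struct {
        addr, size uintptr
        next       *block // nil-terminated, unlike K&R's circular list
    }

    // free inserts b into the address-ordered list and coalesces it
    // with the following block when the two are adjacent. No header is
    // needed on the block itself: sysFree always supplies the size.
    func free(list **block, b *block) {
        p := list
        for *p != nil && (*p).addr < b.addr {
            p = &(*p).next
        }
        b.next = *p
        *p = b
        if b.next != nil && b.addr+b.size == b.next.addr {
            b.size += b.next.size // merge with the next block
            b.next = b.next.next
        }
    }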
Fixes #9736.
Fixes #9803.
Fixes #9952.
Change-Id: I00d533714e4144a0012f69820d31cbb0253031a3
Reviewed-on: https://go-review.googlesource.com/5524
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Disable the test properly on nacl systems, tested on nacl/amd64p32.
Change-Id: Iffe210be4f9c426bfc47f2dd3a8f0c6b5a398cc3
Reviewed-on: https://go-review.googlesource.com/6093
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Update #9993
If the physical page size of the machine is larger than the logical
page size of the runtime, for example 8k logical, 64k physical, then madvise(2) will
round up the requested amount to a 64k boundary and may discard pages
close to the page being madvised.
This patch disables the scavenger in these situations, which at the moment
is only ppc64 and ppc64le systems. NaCl also uses a 64k page size, but
it's not clear if it is affected by this problem.
Change-Id: Ib897f8d3df5bd915ddc0b510f2fd90a30ef329ca
Reviewed-on: https://go-review.googlesource.com/6091
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Needs the Go tool, which we do not have on iOS. (No Fork.)
Change-Id: Iedf69f5ca81d66515647746546c9b304c8ec10c4
Reviewed-on: https://go-review.googlesource.com/6102
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
There is no sense in trying to netpoll while there is
already a thread blocked in netpoll. And in most cases
there must be a thread blocked in netpoll, because
the first otherwise idle thread does blocking netpoll.
On some programs, I see that netpoll called from findrunnable
consumes 3% of time.
Change-Id: I0af1a73d637bffd9770ea50cb9278839716e8816
Reviewed-on: https://go-review.googlesource.com/4553
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
This makes Go's CPU profiling code somewhat more idiomatic; e.g.,
using := instead of forward declaring variables, using "int" for
element counts instead of "uintptr", and slices instead of C-style
pointer+length. This makes the code easier to read and eliminates a
lot of type conversion clutter.
Additionally, in sigprof we can collect just maxCPUProfStack stack
frames, as cpuprof won't use more than that anyway.
Change-Id: I0235b5ae552191bcbb453b14add6d8c01381bd06
Reviewed-on: https://go-review.googlesource.com/6072
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
The first call is pointless. It appears to simply be a mistake.
benchmark old ns/op new ns/op delta
BenchmarkComplexAlgMap 90.7 76.1 -16.10%
Change-Id: Id0194c9f09cea8b68f17b2ac751a8e3240e47f19
Reviewed-on: https://go-review.googlesource.com/5284
Reviewed-by: Keith Randall <khr@golang.org>
Gives tests a way to find the bundle that contains their testdata, and
is generally useful for finding resources.
Change-Id: Idfa03e8543af927c17bc8ec8aadc5014ec82df28
Reviewed-on: https://go-review.googlesource.com/6000
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Updates #10002
The gdb test added in 1c82e236f5 is failing on most arm systems.
Temporarily disable this test so that we can return to a working arm build.
Change-Id: Iff96ea8d5a99e1ceacf4979e864ff196e5503535
Reviewed-on: https://go-review.googlesource.com/5902
Reviewed-by: Ian Lance Taylor <iant@golang.org>
We return memory to the kernel with madvise(..., DONTNEED).
Also mark returned memory with NOHUGEPAGE to keep the kernel from
merging this memory into a huge page, effectively reallocating it.
Only known to be a problem on linux/{386,amd64,amd64p32} at the moment.
It may come up on other os/arch combinations in the future.
Fixes #8832
Change-Id: Ifffc6627a0296926e3f189a8a9b6e4bdb54c79eb
Reviewed-on: https://go-review.googlesource.com/5660
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
We need to distinguish pointers to free spans, which indicate bugs in
our pointer analysis, from pointers to never-in-the-heap spans, which
can legitimately arise from sysAlloc/mmap/etc. This normally isn't a
problem because the heap is contiguous, but in some situations (32
bit, particularly) the heap must grow around an already allocated
region.
The bad pointer test is disabled so this fix doesn't actually do
anything, but it removes one barrier from reenabling it.
Fixes #9872.
Change-Id: I0a92db4d43b642c58d2b40af69c906a8d9777f88
Reviewed-on: https://go-review.googlesource.com/5780
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Available darwin/arm devices sporadically have trouble mapping 256M.
I would really appreciate it if anyone could check my working on
this, and make sure there aren't obviously bad consequences I
haven't considered.
Change-Id: Id1a8edae104d974fcf5f9333274f958625467f79
Reviewed-on: https://go-review.googlesource.com/5752
Reviewed-by: Keith Randall <khr@golang.org>
Since allglock is held in this function, there's no point to
tip-toeing around allgs. Just use a for-range loop.
Change-Id: I1ee61c7e8cac8b8ebc8107c0c22f739db5db9840
Reviewed-on: https://go-review.googlesource.com/5882
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Previously, we had three loops in the garbage collector that all
cleared the per-G GC flags. Consolidate these into one function.
This one function is designed to work in a concurrent setting. As a
result, it's slightly more expensive than the loops it replaces during
STW phases, but these happen at most twice per GC.
Change-Id: Id1ec0074fd58865eb0112b8a0547b267802d0df1
Reviewed-on: https://go-review.googlesource.com/5881
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The loop in gcMark is redundant with the gcworkdone resetting
performed by markroot, which is called a few lines later in gcMark.
Change-Id: Ie0a826a614ecfa79e6e6b866e8d1de40ba515856
Reviewed-on: https://go-review.googlesource.com/5880
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Package runtime's Go code was converted to directly call getcallerpc
and getcallersp in https://golang.org/cl/138740043, but the assembly
implementations were not removed.
Change-Id: Ib2eaee674d594cbbe799925aae648af782a01c83
Reviewed-on: https://go-review.googlesource.com/5901
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
NetBSD's semaphore implementation is derived from OpenBSD's, but has
subsequently diverged due to cleanups that were only applied to the
latter (https://golang.org/cl/137960043, https://golang.org/cl/5563).
This CL applies analogous cleanups for NetBSD.
Notably, we can also remove the scary NetBSD deadlock warning.
NetBSD's manual pages document that lwp_unpark on a not-yet-parked LWP
will cause that LWP's next lwp_park system call to return immediately,
so there's no race hazard.
Change-Id: Ib06844c420d2496ac289748eba13eb4700bbbbb2
Reviewed-on: https://go-review.googlesource.com/5564
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Joel Sing <jsing@google.com>
(gdb) p x
Python Exception <class 'gdb.error'> There is no member named b.:
$2 = map[string]string
->
(gdb) p x
$1 = map[string]string = {["shane"] = "hansen"}
Change-Id: I874d02a029f2ac9afc5ab666afb65760ec2c3177
Reviewed-on: https://go-review.googlesource.com/5522
Reviewed-by: Ian Lance Taylor <iant@golang.org>
OpenBSD's thrsleep system call includes an "abort" parameter, which
specifies a memory address to be tested after being registered on the
sleep channel (i.e., capable of being woken up by thrwakeup). By
passing a pointer to waitsemacount for this parameter, we avoid race
conditions without needing a lock. Instead we just need to use
atomicload, cas, and xadd to mutate the semaphore count.
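A sketch of the resulting lock-free semaphore (syscall wrappers thrsleep/thrwakeup assumed; atomicload, cas, and xadd are the runtime atomics named above):

    func semacquire(count *uint32) {
        for {
            v := atomicload(count)
            if v > 0 {
                if cas(count, v, v-1) {
                    return // took one unit of the semaphore
                }
                continue // lost a race with another waiter; retry
            }
            // Pass count as thrsleep's abort address: the kernel
            // rechecks *count after registering on the sleep channel,
            // so a concurrent wakeup cannot be lost.
            thrsleep(count)
        }
    }

    func semrelease(count *uint32) {
        xadd(count, 1)
        thrwakeup(count)
    }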
Change-Id: If9f2ab7cfd682da217f9912783cadea7e72283a8
Reviewed-on: https://go-review.googlesource.com/5563
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Joel Sing <jsing@google.com>
When GODEBUG=gctrace=2, two GCs are performed. During the first GC,
the stack scan sets the g's gcscanvalid and gcworkdone flags to true,
indicating that the stacks have been scanned and do not need to
be rescanned. These need to be reset to false for the second GC so the
stacks are rescanned; otherwise, if the only pointer to an object is
on the stack it will not be discovered and the object will be freed.
Typically this will include the object that was just allocated in
the mallocgc call that initiated the GC.
Change-Id: Ic25163f4689905fd810c90abfca777324005c02f
Reviewed-on: https://go-review.googlesource.com/5861
Reviewed-by: Russ Cox <rsc@golang.org>
Currently sync.Mutex is fully cooperative. That is, once contention is discovered,
the goroutine calls into the scheduler. This is suboptimal, as the resource can become
free soon after (especially if critical sections are short). Server software
usually runs at ~50% CPU utilization, that is, switching to other goroutines
is not necessarily profitable.
This change adds limited active spinning to sync.Mutex if:
1. running on a multicore machine and
2. GOMAXPROCS>1 and
3. there is at least one other running P and
4. local runq is empty.
As opposed to the runtime mutex, we don't do passive spinning,
because there can be work on the global runq or on other Ps.
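A sketch of the added fast path (helper names assumed; the four conditions above are what canSpin would check, and sync/atomic stands in for the runtime atomics):

    func lockSlow(state *int32) {
        iter := 0
        for {
            if atomic.CompareAndSwapInt32(state, 0, 1) {
                return // acquired the mutex
            }
            if canSpin(iter) { // multicore, GOMAXPROCS>1, another running P, empty runq
                doSpin() // a short burst of PAUSE instructions; stays runnable
                iter++
                continue
            }
            blockOnSemaphore(state) // give up the active spin and park
            iter = 0
        }
    }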
benchmark old ns/op new ns/op delta
BenchmarkMutexNoSpin 1271 1272 +0.08%
BenchmarkMutexNoSpin-2 702 683 -2.71%
BenchmarkMutexNoSpin-4 377 372 -1.33%
BenchmarkMutexNoSpin-8 197 190 -3.55%
BenchmarkMutexNoSpin-16 131 122 -6.87%
BenchmarkMutexNoSpin-32 170 164 -3.53%
BenchmarkMutexSpin 4724 4728 +0.08%
BenchmarkMutexSpin-2 2501 2491 -0.40%
BenchmarkMutexSpin-4 1330 1325 -0.38%
BenchmarkMutexSpin-8 684 684 +0.00%
BenchmarkMutexSpin-16 414 372 -10.14%
BenchmarkMutexSpin-32 559 469 -16.10%
BenchmarkMutex 19.1 19.1 +0.00%
BenchmarkMutex-2 81.6 54.3 -33.46%
BenchmarkMutex-4 143 100 -30.07%
BenchmarkMutex-8 154 156 +1.30%
BenchmarkMutex-16 140 159 +13.57%
BenchmarkMutex-32 141 163 +15.60%
BenchmarkMutexSlack 33.3 31.2 -6.31%
BenchmarkMutexSlack-2 122 97.7 -19.92%
BenchmarkMutexSlack-4 168 158 -5.95%
BenchmarkMutexSlack-8 152 158 +3.95%
BenchmarkMutexSlack-16 140 159 +13.57%
BenchmarkMutexSlack-32 146 162 +10.96%
BenchmarkMutexWork 154 154 +0.00%
BenchmarkMutexWork-2 89.2 89.9 +0.78%
BenchmarkMutexWork-4 139 86.1 -38.06%
BenchmarkMutexWork-8 177 162 -8.47%
BenchmarkMutexWork-16 170 173 +1.76%
BenchmarkMutexWork-32 176 176 +0.00%
BenchmarkMutexWorkSlack 160 160 +0.00%
BenchmarkMutexWorkSlack-2 103 99.1 -3.79%
BenchmarkMutexWorkSlack-4 155 148 -4.52%
BenchmarkMutexWorkSlack-8 176 170 -3.41%
BenchmarkMutexWorkSlack-16 170 173 +1.76%
BenchmarkMutexWorkSlack-32 175 176 +0.57%
"No work" benchmarks are not very interesting (BenchmarkMutex and
BenchmarkMutexSlack), as they are absolutely not realistic.
Fixes #8889
Change-Id: I6f14f42af1fa48f73a776fdd11f0af6dd2bb428b
Reviewed-on: https://go-review.googlesource.com/5430
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
This change deletes the C implementations of
the Go compiler and assembler from the master branch.
The Go implementations are a bit slower right now,
due mainly to garbage generated by taking addresses
of stack variables all over the place (it was C code,
after all). That will be cleaned up (mechanically) over the
next week or so, and things will get faster.
Change-Id: I66b2b3477aec8835f9960d0798f5752dcd98d08f
The slow path of heapBitsForObjects somewhat subtly assumes that the
pointer will not point to the first word of the object and will round
the pointer incorrectly if this assumption is violated. This assumption is
safe because the fast path should always take care of this case, but
there's no benefit to making this assumption; it makes the code more
difficult to experiment with than necessary, and it's trivial to
eliminate.
Change-Id: Iedd336f7d529a27d3abeb83e77dfb32a285ea73a
Reviewed-on: https://go-review.googlesource.com/5636
Reviewed-by: Russ Cox <rsc@golang.org>
The routine mallocgc retrieves objects from freelists. Prefetch
the object that will be returned in the next call to mallocgc.
Experiments indicate that this produces a 1% improvement when using
prefetchnta and less when using prefetcht0, prefetcht1, or prefetcht2.
Benchmark numbers indicate a 1% improvement over no
prefetch, much less over prefetcht0, prefetcht1, and prefetcht2.
These numbers were for the garbage benchmark with MAXPROCS=4
no prefetch >> 5.96 / 5.77 / 5.89
prefetcht0(uintptr(v.ptr().next)) >> 5.88 / 6.17 / 5.84
prefetcht1(uintptr(v.ptr().next)) >> 5.88 / 5.89 / 5.91
prefetcht2(uintptr(v.ptr().next)) >> 5.87 / 6.47 / 5.92
prefetchnta(uintptr(v.ptr().next)) >> 5.72 / 5.84 / 5.85
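A sketch of where the prefetch lands in the allocation fast path (freelist types assumed; prefetchnta is the variant that won above):

    // mallocgc returns v now; the next call will return v.ptr().next,
    // so prefetch it to hide its cache miss behind this allocation.
    v := c.freelist
    c.freelist = v.ptr().next
    prefetchnta(uintptr(v.ptr().next))
    return unsafe.Pointer(v)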
Change-Id: I54e07172081cccb097d5b5ce8789d74daa055ed9
Reviewed-on: https://go-review.googlesource.com/5350
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Makes them compatible with the new asm.
Applied mechanically from vet diagnostics.
Manual edits: the names for arguments in time·now(SB) in runtime/sys_*_arm.s.
Change-Id: Ib295390d9509d306afc67714e3f50dc832256625
Reviewed-on: https://go-review.googlesource.com/5576
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
With a trivial Golang-built program loaded in gdb-7.8.90.20150214-7.fc23.x86_64
I get this error:
(gdb) source ./src/runtime/runtime-gdb.py
Loading Go Runtime support.
Traceback (most recent call last):
File "./src/runtime/runtime-gdb.py", line 230, in <module>
_rctp_type = gdb.lookup_type("struct reflect.rtype").pointer()
gdb.error: No struct type named reflect.rtype.
(gdb) q
Whether or not this struct should be present in every Golang-built binary,
this change should fix the error with no disadvantages.
Change-Id: I0c490d3c9bbe93c65a2183b41bfbdc0c0f405bd1
Reviewed-on: https://go-review.googlesource.com/5521
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The trace command allows visualizing and analyzing traces.
Run as:
$ go tool trace binary trace.file
The command opens a web browser with the main page,
which contains links for trace visualization, the
blocking profiler, the network IO profiler and per-goroutine
traces.
Also move trace parser from runtime/pprof/trace_parser_test.go
to internal/trace/parser.go, so that it can be shared between
tests and the command.
Change-Id: Ic97ed59ad6e4c7e1dc9eca5e979701a2b4aed7cf
Reviewed-on: https://go-review.googlesource.com/3601
Reviewed-by: Andrew Gerrand <adg@golang.org>
Restores stack traces in the android/arm builder.
Change-Id: If637aa2ed6f8886126b77cf9cc8a0535ec7c4369
Reviewed-on: https://go-review.googlesource.com/5453
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
In most cases we pass the return PC to the race detector,
and the race runtime subtracts one from it.
However, in manual instrumentation in the runtime
we pass the function start PC to the race runtime.
The race runtime can't distinguish these cases
and so it does not subtract one from the top PC.
This leads to bogus line numbers in some cases.
Make it consistent and always pass what looks
like a return PC, so that the race runtime can
subtract one and still get a PC in the same function.
Also delete two unused functions.
Update #8053
Change-Id: I4242dec5e055e460c9a8990eaca1d085ae240ed2
Reviewed-on: https://go-review.googlesource.com/4902
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This is a nice split but more importantly it provides a better
way to fit the checkmark phase into the sequencing.
Also factor out common span copying into gcSpanCopy.
Change-Id: Ia058644974e4ed4ac3cf4b017a3446eb2284d053
Reviewed-on: https://go-review.googlesource.com/5333
Reviewed-by: Austin Clements <austin@google.com>
The loop made more sense when gc_m was not its own function.
Change-Id: I71a7f21d777e69c1924e3b534c507476daa4dfdd
Reviewed-on: https://go-review.googlesource.com/5332
Reviewed-by: Austin Clements <austin@google.com>
See the following issue for context:
https://github.com/golang/go/issues/9729#issuecomment-74648287
In short, RDTSC can produce skewed results without preceding LFENCE/MFENCE.
Information on this matter is very scrappy on the internet.
But this is what the Linux kernel does (see rdtsc_barrier).
It also fixes the test program on my machine.
Update #9729
Change-Id: I3c1ffbf129fdfdd388bd5b7911b392b319248e68
Reviewed-on: https://go-review.googlesource.com/5033
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Fix many incorrect FP references and a few other details.
Some errors remain, especially in vlop, but fixing them requires semantics. For another day.
Change-Id: Ib769fb519b465e79fc08d004a51acc5644e8b259
Reviewed-on: https://go-review.googlesource.com/5288
Reviewed-by: Russ Cox <rsc@golang.org>
That is, I accidentally dropped this change of Austin's
when preparing my CL. I blame Git.
Change-Id: I9dd772c84edefad96c4b16785fdd2dea04a4a0d6
Reviewed-on: https://go-review.googlesource.com/5320
Reviewed-by: Austin Clements <austin@google.com>
Move code from malloc1.go, malloc2.go, mem.go, mgc0.go into
appropriate locations.
Factor mgc.go into mgc.go, mgcmark.go, mgcsweep.go, mstats.go.
A lot of this code was in certain files because the right place was in
a C file but it was written in Go, or vice versa. This is one step toward
making things actually well-organized again.
Change-Id: I6741deb88a7cfb1c17ffe0bcca3989e10207968f
Reviewed-on: https://go-review.googlesource.com/5300
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Until recently, struct workbuf had only lfnode and uintptr fields
before the obj array to make it convenient to compute the size of the
obj array. It slowly grew more fields until this became inconvenient
enough that it was restructured to make the size computation easy.
Now the size computation doesn't care what the field types are, so
switch to more natural types.
Change-Id: I966140ba7ebb4aeb41d5c66d9d2a3bdc17dd4bcf
Reviewed-on: https://go-review.googlesource.com/5262
Reviewed-by: Russ Cox <rsc@golang.org>
This converts the garbage collector from directly manipulating work
buffers to using the new gcWork abstraction.
The previous management of work buffers was rather ad hoc. As a
result, switching to the gcWork abstraction changes many details of
work buffer management.
If greyobject fills a work buffer, it can now pull from work.partial
in addition to work.empty.
Previously, gcDrain started with a partial or empty work buffer and
fetched an empty work buffer if it filled its current buffer (in
greyobject). Now, gcDrain starts with a full work buffer and fetches
a partial or empty work buffer if it fills its current buffer (in
greyobject). The original behavior was bad because gcDrain would
immediately drop the empty work buffer returned by greyobject and
fetch a full work buffer, which greyobject was likely to immediately
overflow, fetching another empty work buffer, etc. The new behavior
isn't great at the start because greyobject is likely to immediately
overflow the full buffer, but the steady-state behavior should be more
stable. Both before and after this change, gcDrain fetches a full
work buffer if it drains its current buffer. Basically all of these
choices are bad; the right answer is to use a dual work buffer scheme.
Previously, shade always fetched a work buffer (though usually from
m.currentwbuf), even if the object was already marked. Now it only
fetches a work buffer if it actually greys an object.
Change-Id: I8b880ed660eb63135236fa5d5678f0c1c041881f
Reviewed-on: https://go-review.googlesource.com/5232
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This introduces a producer/consumer abstraction for GC work pointers
that internally handles the details of filling, draining, and
shuffling work buffers.
In addition to simplifying the GC code, this should make it easy for
us to change how we use work buffers, including cleaning up how we use
the work.partial queue, reintroducing a FIFO lookahead cache, adding
prefetching, and using dual buffers to avoid flapping.
This commit doesn't change any existing code. The following commit
will switch the garbage collector from explicit workbuf manipulation
to gcWork.
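A self-contained sketch of the abstraction's surface (names and buffer size assumed; put is the producer side, tryGet the consumer side):

    package main

    import "sync"

    type workbuf struct {
        obj [256]uintptr
        n   int
    }

    // global stands in for the runtime's empty/partial/full buffer lists.
    var global struct {
        sync.Mutex
        bufs []*workbuf
    }

    // gcWork hides the filling, draining, and shuffling of work buffers.
    type gcWork struct{ wbuf *workbuf }

    func (w *gcWork) put(p uintptr) {
        if w.wbuf == nil || w.wbuf.n == len(w.wbuf.obj) {
            w.flush() // current buffer full: hand it to the global lists
            w.wbuf = &workbuf{}
        }
        w.wbuf.obj[w.wbuf.n] = p
        w.wbuf.n++
    }

    func (w *gcWork) tryGet() uintptr {
        if w.wbuf == nil || w.wbuf.n == 0 {
            global.Lock()
            if len(global.bufs) == 0 {
                global.Unlock()
                return 0 // no work anywhere
            }
            w.wbuf = global.bufs[len(global.bufs)-1]
            global.bufs = global.bufs[:len(global.bufs)-1]
            global.Unlock()
        }
        w.wbuf.n--
        return w.wbuf.obj[w.wbuf.n]
    }

    func (w *gcWork) flush() {
        if w.wbuf == nil || w.wbuf.n == 0 {
            return
        }
        global.Lock()
        global.bufs = append(global.bufs, w.wbuf)
        global.Unlock()
        w.wbuf = nil
    }

    func main() {
        var w gcWork
        w.put(42)
        w.flush()
        println(w.tryGet()) // 42
    }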
Change-Id: Ifbfe5fff45bf0362d6d7c3cecb061f0c9874077d
Reviewed-on: https://go-review.googlesource.com/5231
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
References to FP must now have a symbol.
Change-Id: I3f06b99cc48cbd4ccd6f23f2e4b0830af40f7f3d
Reviewed-on: https://go-review.googlesource.com/5281
Reviewed-by: Russ Cox <rsc@golang.org>
Nit. There's no reason to take a uintptr and doing so just requires
casts in annoying places.
Change-Id: Ifeb9638c6d94eae619c490930cf724cc315680ba
Reviewed-on: https://go-review.googlesource.com/5230
Reviewed-by: Russ Cox <rsc@golang.org>
Require a name to be specified when referencing the pseudo-stack.
If you want a real stack offset, use the hardware stack pointer (e.g.,
R13 on arm), not SP.
Fix affected assembly files.
Change-Id: If3545f187a43cdda4acc892000038ec25901132a
Reviewed-on: https://go-review.googlesource.com/5120
Run-TryBot: Rob Pike <r@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Apparently when ARM stops at a GDB breakpoint, it appears to be in
syscall.Syscall. The "info goroutines" test expected it to be in a
runtime function. Since this isn't fundamental to the test, simply
tweak the test's regexp to make sure "info goroutines" prints some
running goroutine with an active M, but don't require it to be in any
particular function.
Change-Id: Iba2618b46d3dc49cef62ffb72484b83ea7b0317d
Reviewed-on: https://go-review.googlesource.com/5060
Reviewed-by: Dave Cheney <dave@cheney.net>
All of the other memory-related source files start with "m". Keep up
the tradition.
Change-Id: Idd88fdbf2a1453374fa12109b949b1c4d149a4f8
Reviewed-on: https://go-review.googlesource.com/4853
Reviewed-by: Minux Ma <minux@golang.org>
Rather than reaching in to slices directly in the slice pretty
printer, use the newly introduced SliceValue wrapper.
Change-Id: Ibb25f8c618c2ffb3fe1a8dd044bb9a6a085df5b7
Reviewed-on: https://go-review.googlesource.com/4936
Reviewed-by: Minux Ma <minux@golang.org>
"info goroutines" is failing because it hasn't kept up with changes in
the 1.5 runtime. This fixes three issues preventing "info goroutines"
from working. allg is no longer a linked list, so switch to using the
allgs slice. The g struct's 'status' field is now called
'atomicstatus', so rename uses of 'status'. Finally, this was trying
to parse str(pc) as an int, but str(pc) can return symbolic
information after the raw hex value; fix this by stripping everything
after the first space.
This also adds a test for "info goroutines" to runtime-gdb_test, which
was previously quite skeletal.
Change-Id: I8ad83ee8640891cdd88ecd28dad31ed9b5833b7a
Reviewed-on: https://go-review.googlesource.com/4935
Reviewed-by: Minux Ma <minux@golang.org>
R15 is the real register. PC is a pseudo-register that we are making
illegal in this context as part of the grand assembly unification.
Change-Id: Ie0ea38ce7ef4d2cf4fcbe23b851a570fd312ce8d
Reviewed-on: https://go-review.googlesource.com/4966
Reviewed-by: Minux Ma <minux@golang.org>
There is currently no way to ignore signals using the os/signal package.
It is possible to catch a signal and do nothing but this is not the same
as ignoring it. The new function Ignore allows a set of signals to be
ignored. The new function Reset allows the initial handlers for a set of
signals to be restored.
Fixes #5572
Change-Id: I5c0f07956971e3a9ff9b9d9631e6e3a08c20df15
Reviewed-on: https://go-review.googlesource.com/3580
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Change 85e7bee introduced a bug:
it marks map buckets as noscan when key and val do not contain pointers.
However, buckets with large/outline key or val do contain pointers.
This change takes key/val size into consideration when
marking buckets as noscan.
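Illustrative cases (the exact out-of-line size threshold is an assumption):

    var a map[int64]float64  // key and val pointer-free: bucket can be noscan
    var b map[string]int     // string key holds a pointer: bucket must be scanned
    var c map[[512]byte]bool // large key stored out of line as a pointer:
                             // bucket must be scanned despite the pointer-free type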
Change-Id: I7172a0df482657be39faa59e2579dd9f209cb54d
Reviewed-on: https://go-review.googlesource.com/4901
Reviewed-by: Keith Randall <khr@golang.org>
Several .s files for ARM had properties the new assembler will not support.
These include:
- mentioning SP or PC as a hardware register
These are always pseudo-registers except that in some contexts
they're not, and it's confusing because the context should not affect
which register you mean. Change the references to the hardware
registers to be explicit: R13 for SP, R15 for PC.
- constant creation using assignment
The files say a=b when they could instead say #define a b.
There is no reason to have both mechanisms.
- R(0) to refer to R0.
Some macros use this to a great extent. Again, it's easy just to
use a #define to rename a register.
Change-Id: I002335ace8e876c5b63c71c2560533eb835346d2
Reviewed-on: https://go-review.googlesource.com/4822
Reviewed-by: Dave Cheney <dave@cheney.net>
MOVQ RARG0, 0(SP) smashes exactly what was saved by PUSHQ R15.
This code managed to work somehow with the current race runtime,
but corrupts caller arguments with new race runtime that I am testing.
Change-Id: I9ffe8b5eee86451db36e99dbf4d11f320192e576
Reviewed-on: https://go-review.googlesource.com/4810
Reviewed-by: Keith Randall <khr@golang.org>
New race runtime is more scrupulous about env flags format.
Change-Id: I2828bc737a8be3feae5288ccf034c52883f224d8
Reviewed-on: https://go-review.googlesource.com/4811
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
drainworkbuf is now gcDrain, since it drains until there's
nothing left to drain. drainobjects is now gcDrainN because it's
the bounded equivalent to gcDrain.
The new names use the Go camel case convention because we have to
start somewhere. The "gc" prefix is because we don't have runtime
packages yet and just "drain" is too ambiguous.
Change-Id: I88dbdf32e8ce4ce6c3b7e1f234664be9b76cb8fd
Reviewed-on: https://go-review.googlesource.com/4785
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
All calls to drainworkbuf now pass true for this argument, so remove
the argument and update the documentation to reflect the simplified
interface.
At a higher level, there are no longer any situations where we drain
"one wbuf" (though drainworkbuf didn't guarantee this anyway). We
either drain everything, or we drain a specific number of objects.
Change-Id: Ib7ee0fde56577eff64232ee1e711ec57c4361335
Reviewed-on: https://go-review.googlesource.com/4784
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
scanblock is only called during _GCscan and _GCmarktermination.
During _GCscan, scanblock didn't call drainworkbuf anyway. During
_GCmarktermination, there's really no point in draining some (largely
arbitrary) amount of work during the scanblock, since the GC is about
to drain everything anyway, so simply eliminate this case.
Change-Id: I7f3c59ce9186a83037c6f9e9b143181acd04c597
Reviewed-on: https://go-review.googlesource.com/4783
Reviewed-by: Russ Cox <rsc@golang.org>
We no longer ever call scanblock with b == 0.
Change-Id: I9b01da39595e0cc251668c24d58748d88f5f0792
Reviewed-on: https://go-review.googlesource.com/4782
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
scanblock(0, 0, nil, nil) was just a confusing way of saying
wbuf = getpartialorempty()
drainworkbuf(wbuf, true)
Make drainworkbuf accept a nil workbuf and perform the
getpartialorempty itself and replace all uses of scanblock(0, 0, nil,
nil) with direct calls to drainworkbuf(nil, true).
Change-Id: I7002a2f8f3eaf6aa85bbf17ccc81d7288acfef1c
Reviewed-on: https://go-review.googlesource.com/4781
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Previously, scanblock called checknocurrentwbuf() after
drainworkbuf(). Move this call into drainworkbuf so that every return
path from drainworkbuf calls checknocurrentwbuf(). This is equivalent
to the previous code because scanblock was the only caller of
drainworkbuf.
Change-Id: I96ef2168c8aa169bfc4d368f296342fa0fbeafb4
Reviewed-on: https://go-review.googlesource.com/4780
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently we always create context objects for closures that capture variables.
However, it is completely unnecessary for direct calls of closures
(whether it is func()(), defer func()() or go func()()).
This change transforms any OCALLFUNC(OCLOSURE) into a normal function call.
Captured variables become function arguments.
This transformation is especially beneficial for go func(),
because we do not need to allocate a context object on the heap.
But it makes direct closure calls a bit faster as well (see BenchmarkClosureCall).
At the implementation level, this required introducing yet another compiler pass.
However, the pass iterates only over xtop, so it should not be an issue.
The transformation consists of two parts: closure transformation and call site
transformation. We can't run these parts on different sides of escape analysis,
because the tree state is inconsistent. We can't do both parts during typecheck,
because we don't yet know how to capture variables and don't have the call site.
We can't do both parts during walk of OCALLFUNC, because we may walk
the OCLOSURE body earlier.
So now the capturevars pass only decides how to capture variables
(this info is required for escape analysis). The new transformclosure
pass, which runs just before order/walk, does all transformations
of a closure. And later, walk of OCALLFUNC(OCLOSURE) transforms the call site.
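The effect of the transformation, sketched at the source level (the compiler does this on the AST, not on source text):

    x, y := 1, 2

    // Before: the closure captures x and y via a context object,
    // which for "go" previously had to be heap allocated.
    go func() {
        println(x + y)
    }()

    // Conceptually after: captured variables become ordinary
    // arguments, so no context object is needed.
    go func(x, y int) {
        println(x + y)
    }(x, y)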
benchmark old ns/op new ns/op delta
BenchmarkClosureCall 4.89 3.09 -36.81%
BenchmarkCreateGoroutinesCapture 1634 1294 -20.81%
benchmark old allocs new allocs delta
BenchmarkCreateGoroutinesCapture 6 2 -66.67%
benchmark old bytes new bytes delta
BenchmarkCreateGoroutinesCapture 176 48 -72.73%
Change-Id: Ic85e1706e18c3235cc45b3c0c031a9c1cdb7a40e
Reviewed-on: https://go-review.googlesource.com/4050
Reviewed-by: Russ Cox <rsc@golang.org>
Consider an interface value i of type I and a concrete value c of type C.
Prior to this CL, i==c was evaluated as
I(c) == i
Evaluating I(c) can allocate.
This CL changes the evaluation of i==c to
x, ok := i.(C); ok && x == c
The new generated code is shorter and does not allocate directly.
If C is small, as it is in every instance in the stdlib,
the new code also uses less stack space
and makes one runtime call instead of two.
If C is very large, the original implementation is used.
The cutoff for "very large" is 1<<16,
following the stack vs heap cutoff used elsewhere.
This kind of comparison occurs in 38 places in the stdlib,
mostly in the net and os packages.
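A runnable illustration of the rewrite:

    package main

    type I interface{ m() }
    type C struct{ a, b int }

    func (C) m() {}

    // eq spells out what the compiler now emits for i == c.
    func eq(i I, c C) bool {
        x, ok := i.(C)
        return ok && x == c
    }

    func main() {
        var i I = C{1, 2}
        println(eq(i, C{1, 2})) // true, with no allocation for I(c)
    }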
benchmark old ns/op new ns/op delta
BenchmarkEqEfaceConcrete 29.5 7.92 -73.15%
BenchmarkEqIfaceConcrete 32.1 7.90 -75.39%
BenchmarkNeEfaceConcrete 29.9 7.90 -73.58%
BenchmarkNeIfaceConcrete 35.9 7.90 -77.99%
Fixes #9370.
Change-Id: I7c4555950bcd6406ee5c613be1f2128da2c9a2b7
Reviewed-on: https://go-review.googlesource.com/2096
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
No code modifications.
This is in preparation for improving the wbuf abstraction.
Change-Id: I719543a345c34d079b7e39b251eccd5dd8a07826
Reviewed-on: https://go-review.googlesource.com/4710
Reviewed-by: Rick Hudson <rlh@golang.org>
Plan 9's sysFree has an optimization where if the object being freed
is the last object allocated, it will roll back the brk to allow the
memory to be reused by sysAlloc. However, it does not zero this
"returned" memory, so as a result, sysAlloc can return non-zeroed
memory after a sysFree. This leads to corruption because the runtime
assumes sysAlloc returns zeroed memory.
Fix this by zeroing the memory returned by sysFree.
Fixes #9846.
Change-Id: Id328c58236eb7c464b31ac1da376a0b757a5dc6a
Reviewed-on: https://go-review.googlesource.com/4700
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: David du Colombier <0intro@gmail.com>
typedslicecopy is another write barrier that is not
understood by racewalk. It seems quite complex to handle it
in the compiler, so instead just instrument it in the runtime.
Update #9796
Change-Id: I0eb6abf3a2cd2491a338fab5f7da22f01bf7e89b
Reviewed-on: https://go-review.googlesource.com/4370
Reviewed-by: Russ Cox <rsc@golang.org>
Support the following conversions in escape analysis:
[]rune("foo")
[]byte("foo")
string([]rune{})
If the result does not escape, allocate a temporary buffer on the stack
and pass it to the runtime functions.
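For example, in a function like the following, the []byte temporary never escapes, so it becomes a candidate for the stack buffer (illustration, not from the CL):

    // The conversion's backing array is only read inside the loop and
    // never escapes, so it can be allocated on the stack.
    func countByte(s string, target byte) int {
        n := 0
        for _, b := range []byte(s) {
            if b == target {
                n++
            }
        }
        return n
    }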
Change-Id: I1d075907eab8b0109ad7ad1878104b02b3d5c690
Reviewed-on: https://go-review.googlesource.com/3590
Reviewed-by: Russ Cox <rsc@golang.org>
Add local workbufs to the m struct in order to reduce contention.
Add consistency checks for workbuf ownership.
Chain workbufs through the call chain to avoid swapping them
to and from the m struct.
Adjust the size of the workbuf so that the mutators can
more frequently pass modifications to the GC, thus shifting
some work from the STW mark termination phase to the concurrent
mark phase.
Change-Id: I557b53af34ad9972265e0ed9f5996e52d548563d
Reviewed-on: https://go-review.googlesource.com/3972
Reviewed-by: Austin Clements <austin@google.com>
Fixes #9791
g.issystem flag setup races with other code wherever we set it.
Even if we set it both in the parent goroutine and in the system goroutine,
it is still possible that some other goroutine crashes
before the flag is set. We could pass the issystem flag to newproc1,
but we start all goroutines with go nowadays.
Instead, look at g.startpc to distinguish system goroutines (similar to topofstack).
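A sketch of the check (the set of start PCs is assumed; the real list covers the runtime's system goroutines):

    // A goroutine is a system goroutine iff it was started from one
    // of a fixed set of runtime entry points.
    func isSystemGoroutine(gp *g) bool {
        pc := gp.startpc
        return pc == funcPC(runfinq) ||
            pc == funcPC(bgsweep) ||
            pc == funcPC(forcegchelper) ||
            pc == funcPC(timerproc)
    }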
Change-Id: Ia3467968dee27fa07d9fecedd4c2b00928f26645
Reviewed-on: https://go-review.googlesource.com/4113
Reviewed-by: Keith Randall <khr@golang.org>
Update #8832
This is probably not the root cause of the issue.
Resolve TODO about setting unusedsince on a wrong span.
Change-Id: I69c87e3d93cb025e3e6fa80a8cffba6ad6ad1395
Reviewed-on: https://go-review.googlesource.com/4390
Reviewed-by: Keith Randall <khr@golang.org>
Container symbols shouldn't be considered as functions in the functab.
Having them present probably messes up function lookup, as you might get
the descriptor of the container instead of the descriptor of the actual
function on the stack. It also messed up the findfunctab because these
entries caused off-by-one errors in how functab entries were counted.
Normal code is not affected - it only changes (& hopefully fixes) the
behavior for libraries linked as a unit, like:
net
runtime/cgo
runtime/race
Fixes #9804
Change-Id: I81e036e897571ac96567d59e1f1d7f058ca75e85
Reviewed-on: https://go-review.googlesource.com/4290
Reviewed-by: Russ Cox <rsc@golang.org>
This CL introduces new methods for the 'context' type, so we can
manipulate its values in an architecture-independent way.
Use the new methods to replace both the 386 and amd64 versions of
dosigprof with a single piece of code.
There is more similar code to be converted in the following CLs.
Also remove os_windows_386.go and os_windows_amd64.go. These
contain unused functions.
Change-Id: I28f76aeb97f6e4249843d30d3d0c33fb233d3f7f
Reviewed-on: https://go-review.googlesource.com/2790
Reviewed-by: Minux Ma <minux@golang.org>
CL 2118 makes the assumption that all references to runtime.tlsg
should be accompanied by a declaration of runtime.tlsg if its type
should be a normal variable, instead of a placeholder for TLS
relocation.
If runtime.tlsg is not declared by the runtime package,
the type of runtime.tlsg will be zero, so fix the check in liblink
to look for 0 instead of STLSBSS (the type will be initialized by
cmd/ld, but cmd/ld doesn't run during assembly).
Change-Id: I691ac5c3faea902f8b9a0b963e781b22e7b269a7
Reviewed-on: https://go-review.googlesource.com/4030
Reviewed-by: David Crawshaw <crawshaw@golang.org>
This change is an implementation of the signal
runtime and os/signal package on Plan 9.
Contrary to Unix, on Plan 9 a signal is called
a note and is represented by a string.
For this reason, the sigsend and signal_recv
functions had to be reimplemented specifically
for Plan 9.
In order to reuse most of the code and internal
interface of the os/signal package, the note
strings are mapped to integers.
Thanks to Russ Cox for the early review.
Change-Id: I95836645efe21942bb1939f43f87fb3c0eaaef1a
Reviewed-on: https://go-review.googlesource.com/2164
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
It turns out the -iex argument is not supported by all gdb versions,
but as we need to add the auto-load safe path before loading the
inferior, test for -iex support first and skip the test if it's not
available.
We should still update our builders though.
Change-Id: I355697de51baf12162ba6cb82f389dad93f93dc5
Reviewed-on: https://go-review.googlesource.com/4070
Reviewed-by: Ian Lance Taylor <iant@golang.org>
On some systems, gdb refuses to load Python plugin from arbitrary
paths, so we have to add $GOROOT/src/runtime to auto-load-safe-path
in the gdb script test.
Change-Id: Icc44baab8d04a65bd21ceac2ab8ddb13c8d083e8
Reviewed-on: https://go-review.googlesource.com/2905
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
eqstring does not need to check the length of the strings.
Other architectures were done in a separate commit.
While we're here, add a pointer equality check.
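A rough Go rendering of the logic (illustrative; eqstring itself is
per-architecture assembly, the compiler only calls it for strings of
equal length, and "unsafe" is assumed imported):

// dataPtr returns the data pointer word of a string header.
func dataPtr(s string) unsafe.Pointer {
	return *(*unsafe.Pointer)(unsafe.Pointer(&s))
}

func eqstringSketch(s1, s2 string) bool {
	// lengths are equal by the caller's contract
	if dataPtr(s1) == dataPtr(s2) {
		return true // same backing array, equal without reading a byte
	}
	for i := 0; i < len(s1); i++ {
		if s1[i] != s2[i] {
			return false
		}
	}
	return true
}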
Change-Id: Id2c8616a03a7da7037c1e9ccd56a549fc952bd98
Reviewed-on: https://go-review.googlesource.com/3956
Reviewed-by: Keith Randall <khr@golang.org>
eqstring does not need to check the length of the strings.
6g
benchmark old ns/op new ns/op delta
BenchmarkCompareStringEqual 7.03 6.14 -12.66%
BenchmarkCompareStringIdentical 3.36 3.04 -9.52%
5g
benchmark old ns/op new ns/op delta
BenchmarkCompareStringEqual 238 232 -2.52%
BenchmarkCompareStringIdentical 90.8 80.7 -11.12%
The equivalent PPC changes are in a separate commit
because I don't have the hardware to test them.
Change-Id: I292874324b9bbd9d24f57a390cfff8b550cdd53c
Reviewed-on: https://go-review.googlesource.com/3955
Reviewed-by: Keith Randall <khr@golang.org>
Only documentation / comment changes. Update references to
point to golang.org permalinks or go.googlesource.com/go.
References in historical release notes under doc are left as is.
Change-Id: Icfc14e4998723e2c2d48f9877a91c5abef6794ea
Reviewed-on: https://go-review.googlesource.com/4060
Reviewed-by: Ian Lance Taylor <iant@golang.org>
In the old code, liblink, cmd/ld, and the runtime all have code to
determine whether runtime.tlsg is an actual variable or a placeholder
for TLS relocation. This change consolidates them into one place:
runtime/tls_arm.s ultimately determines the type of that variable.
Change-Id: I3b3f80791a1db4c2b7318f81a115972cd2237e43
Reviewed-on: https://go-review.googlesource.com/2118
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
In android-L, logging is done through the logd daemon.
If the logd daemon is available, send logging to logd.
Otherwise, fall back to the legacy mechanism (/dev/log files).
This change adds access/socket/connect calls to interact with logd.
Fixes golang/go#9398.
Change-Id: I3c52b81b451f5862107d7c675f799fc85548486d
Reviewed-on: https://go-review.googlesource.com/3350
Reviewed-by: David Crawshaw <crawshaw@golang.org>
The unbounded list-based defer pool can grow infinitely.
This can happen if a goroutine routinely allocates a defer,
then blocks on one P, and is then unblocked, scheduled, and
frees the defer on another P.
The scenario was reported on the golang-nuts list.
We've been here several times: any unbounded local cache
is bad and grows to an infinite size. This change introduces
a central defer pool; local pools become fixed-size,
with the only purpose of amortizing accesses to the
central pool.
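A sketch of the two-level scheme (illustrative; the real local pools
hang off each P and need no mutex, and "sync" is assumed imported):

const localCap = 16 // local caches are fixed-size by construction

type deferRec struct{ /* deferred-call record */ }

type deferPool struct {
	local   []*deferRec // small, bounded; amortizes central accesses
	mu      sync.Mutex
	central []*deferRec // shared by everyone
}

func (p *deferPool) get() *deferRec {
	if n := len(p.local); n > 0 {
		d := p.local[n-1]
		p.local = p.local[:n-1]
		return d
	}
	p.mu.Lock()
	defer p.mu.Unlock()
	if n := len(p.central); n > 0 {
		d := p.central[n-1]
		p.central = p.central[:n-1]
		return d
	}
	return new(deferRec)
}

func (p *deferPool) put(d *deferRec) {
	if len(p.local) < localCap {
		p.local = append(p.local, d)
		return
	}
	p.mu.Lock() // local cache full: spill to the central pool
	p.central = append(p.central, d)
	p.mu.Unlock()
}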
Change-Id: Iadcfb113ccecf912e1b64afc07926f0de9de2248
Reviewed-on: https://go-review.googlesource.com/3741
Reviewed-by: Keith Randall <khr@golang.org>
Using benchmark from the issue:
benchmark old ns/op new ns/op delta
BenchmarkRangeStringCast 2162 1152 -46.72%
benchmark old allocs new allocs delta
BenchmarkRangeStringCast 1 0 -100.00%
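The pattern this helps, for illustration:

func countSpaces(s string) int {
	n := 0
	for _, b := range []byte(s) { // the conversion no longer copies s
		if b == ' ' {
			n++
		}
	}
	return n
}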
Fixes #2204
Change-Id: I92c5edd2adca4a7b6fba00713a581bf49dc59afe
Reviewed-on: https://go-review.googlesource.com/3790
Reviewed-by: Keith Randall <khr@golang.org>
Before 3c0fee1, runtime.gogo was just long enough to align to 64 bytes
on OSs with short get_tls implementations and 80 bytes on OSs with
longer get_tls implementations (Windows, Solaris, and Plan 9).
3c0fee1 added a few instructions, which pushed it to 80 on most OSs,
including Windows and Plan 9, and 96 on Solaris.
Fixes #9770.
Change-Id: Ie84810657c14ab16dce9f0e0a932955251b0bf33
Reviewed-on: https://go-review.googlesource.com/3850
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Use memprofilerate in GODEBUG instead of memprofrate to be
consistent with other uses.
Change-Id: Iaf6bd3b378b1fc45d36ecde32f3ad4e63ca1e86b
Reviewed-on: https://go-review.googlesource.com/3800
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The overflow happens only with -gcflags="-N -l"
and can be reproduced with:
$ go test -gcflags="-N -l" -a -run=none net
runtime.cgocall: nosplit stack overflow
504 assumed on entry to runtime.cgocall
480 after runtime.cgocall uses 24
472 on entry to runtime.cgocall_errno
408 after runtime.cgocall_errno uses 64
400 on entry to runtime.exitsyscall
288 after runtime.exitsyscall uses 112
280 on entry to runtime.exitsyscallfast
152 after runtime.exitsyscallfast uses 128
144 on entry to runtime.writebarrierptr
88 after runtime.writebarrierptr uses 56
80 on entry to runtime.writebarrierptr_nostore1
24 after runtime.writebarrierptr_nostore1 uses 56
16 on entry to runtime.acquirem
-24 after runtime.acquirem uses 40
Move closure creation into a separate function so that the
frames of writebarrierptr_shadow and writebarrierptr_nostore1
are overlapped.
Fixes #9721
Change-Id: I40851f0786763ee964af34814edbc3e3d73cf4e7
Reviewed-on: https://go-review.googlesource.com/3418
Reviewed-by: Russ Cox <rsc@golang.org>
Currently the race detector produces the following reports on the pprof tests:
WARNING: DATA RACE
Read by goroutine 4:
runtime/pprof_test.TestTraceStartStop()
src/runtime/pprof/trace_test.go:38 +0x1da
testing.tRunner()
src/testing/testing.go:448 +0x13a
Previous write by goroutine 5:
bytes.(*Buffer).grow()
src/bytes/buffer.go:102 +0x190
bytes.(*Buffer).Write()
src/bytes/buffer.go:127 +0x75
runtime/pprof.func·002()
src/runtime/pprof/pprof.go:633 +0xae
The trace writer goroutine synchronizes with StopTrace
using the trace.shutdownSema runtime semaphore.
But the race detector does not see that synchronization
and so produces false reports.
Teach the race detector about the synchronization.
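The fix, in sketch form (hypothetical call sites; raceacquire and
racerelease are the runtime's internal race-detector annotations):

in the trace writer goroutine, before waking StopTrace:
	racerelease(unsafe.Pointer(&trace.shutdownSema))
	semrelease(&trace.shutdownSema)

in StopTrace, after the wait returns:
	semacquire(&trace.shutdownSema)
	raceacquire(unsafe.Pointer(&trace.shutdownSema))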
Change-Id: I1219817325d4e16b423f29a0cbee94c929793881
Reviewed-on: https://go-review.googlesource.com/3746
Reviewed-by: Russ Cox <rsc@golang.org>
The test for the framepointer experiment flag is cheaper and more
branch-predictable than the other parts of this conditional, so move
it first. This is also more readable.
(Originally, the flag check required parsing the experiments string,
which is why it was done last. Now that flag is cached.)
Change-Id: I84e00fa7e939e9064f0fa0a4a6fe00576dd61457
Reviewed-on: https://go-review.googlesource.com/3782
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Previously, we checked for a saved frame pointer by looking for a
2*ptrSize gap between the argument pointer and the locals pointer.
The intent of this check was to look for a two stack slot gap (caller
IP and saved frame pointer), but stack slots are regSize, not ptrSize.
Correct this by checking instead for a 2*regSize gap.
On most platforms, this made no difference because ptrSize==regSize.
However, on amd64p32 (nacl), the saved frame pointer check incorrectly
fired when there was no saved frame pointer because the one stack slot
for the caller IP left an 8 byte gap, which is 2*ptrSize (but not
2*regSize) on amd64p32.
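In sketch form (illustrative names; argp and varp are the unwinder's
argument and locals pointers):

	if frame.argp-frame.varp == 2*regSize {
		// two register-sized slots: return PC plus saved frame pointer
	}

On amd64p32 the old 2*ptrSize test matched a frame holding only a
return PC, since regSize is 8 there but ptrSize is 4.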
Fixes #9760.
Change-Id: I6eedcf681fe5bf2bf924dde8a8f2d9860a4d758e
Reviewed-on: https://go-review.googlesource.com/3781
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Add memprofrate as a value recognized in GODEBUG. The
value provided is used as the new setting for
runtime.MemProfileRate, allowing the user to
adjust memory profiling.
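For example (illustrative usage): running a program as
GODEBUG=memprofrate=1 ./prog sets runtime.MemProfileRate to 1 at
startup, sampling every allocation.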
Change-Id: If129a247683263b11e2dd42473cf9b31280543d5
Reviewed-on: https://go-review.googlesource.com/3450
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This adds a "framepointer" GOEXPERIMENT that makes the amd64
toolchain maintain base pointer chains in the same way that gcc
-fno-omit-frame-pointer does. Go doesn't use these saved base
pointers, but this does enable external tools like Linux perf and
VTune to unwind Go stacks when collecting system-wide profiles.
This requires support in the compilers to not clobber BP, support in
liblink for generating the BP-saving function prologue and unwinding
epilogue, and support in the runtime to save BPs across preemption, to
skip saved BPs during stack unwinding, and to adjust saved BPs
during stack moving.
As with other GOEXPERIMENTs, everything from the toolchain to the
runtime must be compiled with this experiment enabled. To do this,
run make.bash (or all.bash) with GOEXPERIMENT=framepointer.
Change-Id: I4024853beefb9539949e5ca381adfdd9cfada544
Reviewed-on: https://go-review.googlesource.com/2992
Reviewed-by: Russ Cox <rsc@golang.org>
Any place that clobbers BP in the runtime can potentially interfere
with frame pointer unwinding with GOEXPERIMENT=framepointer. This
change eliminates uses of BP in the runtime to address this problem.
We have spare registers everywhere this occurs, so there's no downside
to eliminating BP. Where possible, this uses the same new register as
the amd64p32 runtime, which doesn't use BP due to restrictions placed
on it by NaCl.
One nice side effect of this is that it will let perf/VTune unwind the
call stack even through a call to systemstack, which will let us get
really good call graphs from the garbage collector.
Change-Id: I0ffa14cb4dd2b613a7049b8ec59df37c52286212
Reviewed-on: https://go-review.googlesource.com/3390
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
m.gcing has become overloaded to mean "don't preempt this g" in
general. Once the garbage collector is preemptible, the one thing it
*won't* mean is that we're in the garbage collector.
So, rename gcing to "preemptoff" and make it a string giving a reason
that preemption is disabled. gcing was never set to anything but 0 or
1, so we don't have to worry about there being a stack of reasons.
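In sketch form (the reason string here is hypothetical):

mp := acquirem()
mp.preemptoff = "holding worldsema" // non-empty: this g must not be preempted
// ... critical section ...
mp.preemptoff = ""
releasem(mp)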
Change-Id: I4337c29e8e942e7aa4f106fc29597e1b5de4ef46
Reviewed-on: https://go-review.googlesource.com/3660
Reviewed-by: Russ Cox <rsc@golang.org>
Commit 656be31 replaced onM with systemstack, but missed updating a
few comments that still referred to onM. Update these.
Change-Id: I0efb017e9a66ea0adebb6e1da6e518ee11263f69
Reviewed-on: https://go-review.googlesource.com/3664
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
The following line in sysFree:
n += (n + memRound) &^ memRound
doubles the value of n (n += n), which is wrong and can lead
to memory corruption.
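For comparison, the intended rounding is

	n = (n + memRound) &^ memRound

which rounds n up to a page boundary, whereas the buggy += form adds
that rounded size on top of the original n.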
Fixes #9712
Change-Id: I3c141b71da11e38837c09408cf4f1d22e8f7f36e
Reviewed-on: https://go-review.googlesource.com/3602
Reviewed-by: David du Colombier <0intro@gmail.com>
Set gcscanvalid=false after you have CASed to _Grunning.
If you do it before the CAS and the atomicstatus races to a scan state,
the scan will set gcscanvalid=true and we will be _Grunning
with gcscanvalid==true, which is not a good thing.
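The required order, in sketch form:

	casgstatus(gp, _Grunnable, _Grunning)
	gp.gcscanvalid = false // no scan can slip in now and flip it back to true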
Change-Id: Ie53ea744a5600392b47da91159d985fe6fe75961
Reviewed-on: https://go-review.googlesource.com/3510
Reviewed-by: Austin Clements <austin@google.com>
Yet another leftover from C: parfor took a func value for the
callback, cast it to an unsafe.Pointer for storage, and then cast
it back to a func value to call it. This is unnecessary, so just
store the body as a func value. Beyond general cleanup, this also
eliminates the last use of unsafe in parfor.
Change-Id: Ia904af7c6c443ba75e2699835aee8e9a39b26dd8
Reviewed-on: https://go-review.googlesource.com/3396
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Prior to the conversion of the runtime to Go, this void* was
necessary to get closure information into C callbacks. There
are no more C callbacks and parfor is perfectly capable of
invoking a Go closure now, so eliminate ctx and all of its
unsafe-ness. (Plus, the runtime currently doesn't use ctx for
anything.)
Change-Id: I39fc53b7dd3d7f660710abc76b0d831bfc6296d8
Reviewed-on: https://go-review.googlesource.com/3395
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
parfor originally used a tail array for its thread array. This got
replaced with a slice allocation in the conversion to Go, but many of
its gnarlier effects remained. Instead of keeping track of the
pointer to the first element of the slice and using unsafe pointer
math to get at the ith element, just keep the slice around and use
regular slice indexing. There is no longer any need for padding to
64-bit align the tail array (there hasn't been since the Go
conversion), so remove this unnecessary padding from the parfor
struct. Finally, since the slice tracks its own length, replace the
nthrmax field with len(thr).
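The resulting shape, roughly (illustrative field set):

type parforthread struct{ /* per-thread iteration state */ }

type parfor struct {
	body func(*parfor, uint32) // the work callback, stored as a func value
	thr  []parforthread        // a plain slice; len(thr) replaces nthrmax
}

func (p *parfor) thread(i int) *parforthread {
	return &p.thr[i] // ordinary indexing instead of unsafe pointer math
}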
Change-Id: I0020a1815849bca53e3613a8fa46ae4fbae67576
Reviewed-on: https://go-review.googlesource.com/3394
Reviewed-by: Russ Cox <rsc@golang.org>
This cleanup was slated for after the conversion of the runtime to Go.
Also improve type and function documentation.
Change-Id: I55a16b09e00cf701f246deb69e7ce7e3e04b26e7
Reviewed-on: https://go-review.googlesource.com/3393
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Currently, if we do an atomic{load,store}64 of an unaligned address on
386, we'll simply get a non-atomic load/store. This has been the
source of myriad bugs, so add alignment checks to these two
operations. These checks parallel the equivalent checks in
sync/atomic.
The alignment check is not necessary in cas64 because it uses a locked
instruction. The CPU will either execute this atomically or raise an
alignment fault (#AC), depending on the alignment check flag, either
of which is fine.
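The added guard, in sketch form (illustrative; the real code throws
rather than panics, and "unsafe" is assumed imported):

func checkAligned64(p unsafe.Pointer) {
	if uintptr(p)&7 != 0 {
		panic("runtime: unaligned 64-bit atomic operation")
	}
}

atomicload64 and atomicstore64 run this check before touching memory.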
This also fixes the two places in the runtime that trip the new
checks. One is in the runtime self-test and shouldn't have caused
real problems. The other is in tickspersecond and could, in
principle, have caused a misread of the ticks per second during
initialization.
Change-Id: If1796667012a6154f64f5e71d043c7f5fb3dd050
Reviewed-on: https://go-review.googlesource.com/3521
Reviewed-by: Russ Cox <rsc@golang.org>
The language specification says that variables are captured by reference,
and that is what the gc compiler does. However, in lots of cases it is
possible to capture variables by value under the hood without
affecting the visible behavior of programs. For example, consider
the following typical pattern:
func (o *Obj) requestMany(urls []string) []Result {
	wg := new(sync.WaitGroup)
	wg.Add(len(urls))
	res := make([]Result, len(urls))
	for i := range urls {
		i := i
		go func() {
			res[i] = o.requestOne(urls[i])
			wg.Done()
		}()
	}
	wg.Wait()
	return res
}
Currently o, wg, res, and i are captured by reference, causing 3+len(urls)
allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to the heap).
But all of them can be captured by value without changing behavior.
This change implements a simple strategy for capturing by value:
if a captured variable is not addrtaken and never assigned to,
then it is captured by value (it is effectively const).
This simple strategy turned out to be very effective:
~80% of all captures in std lib are turned into value captures.
The remaining 20% are mostly in defers and non-escaping closures,
that is, they do not cause allocations anyway.
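Two illustrative cases under the new rule:

x := 42
go func() { println(x) }() // x is never assigned again and not address-taken: by value

y := 0
go func() { y++ }() // y is assigned inside the closure: still by reference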
benchmark old allocs new allocs delta
BenchmarkCompressedZipGarbage 153 126 -17.65%
BenchmarkEncodeDigitsSpeed1e4 91 69 -24.18%
BenchmarkEncodeDigitsSpeed1e5 178 129 -27.53%
BenchmarkEncodeDigitsSpeed1e6 1510 1051 -30.40%
BenchmarkEncodeDigitsDefault1e4 100 75 -25.00%
BenchmarkEncodeDigitsDefault1e5 193 139 -27.98%
BenchmarkEncodeDigitsDefault1e6 1420 985 -30.63%
BenchmarkEncodeDigitsCompress1e4 100 75 -25.00%
BenchmarkEncodeDigitsCompress1e5 193 139 -27.98%
BenchmarkEncodeDigitsCompress1e6 1420 985 -30.63%
BenchmarkEncodeTwainSpeed1e4 109 81 -25.69%
BenchmarkEncodeTwainSpeed1e5 211 151 -28.44%
BenchmarkEncodeTwainSpeed1e6 1588 1097 -30.92%
BenchmarkEncodeTwainDefault1e4 103 77 -25.24%
BenchmarkEncodeTwainDefault1e5 199 143 -28.14%
BenchmarkEncodeTwainDefault1e6 1324 917 -30.74%
BenchmarkEncodeTwainCompress1e4 103 77 -25.24%
BenchmarkEncodeTwainCompress1e5 190 137 -27.89%
BenchmarkEncodeTwainCompress1e6 1327 919 -30.75%
BenchmarkConcurrentDBExec 16223 16220 -0.02%
BenchmarkConcurrentStmtQuery 17687 16182 -8.51%
BenchmarkConcurrentStmtExec 5191 5186 -0.10%
BenchmarkConcurrentTxQuery 17665 17661 -0.02%
BenchmarkConcurrentTxExec 15154 15150 -0.03%
BenchmarkConcurrentTxStmtQuery 17661 16157 -8.52%
BenchmarkConcurrentTxStmtExec 3677 3673 -0.11%
BenchmarkConcurrentRandom 14000 13614 -2.76%
BenchmarkManyConcurrentQueries 25 22 -12.00%
BenchmarkDecodeComplex128Slice 318 252 -20.75%
BenchmarkDecodeFloat64Slice 318 252 -20.75%
BenchmarkDecodeInt32Slice 318 252 -20.75%
BenchmarkDecodeStringSlice 2318 2252 -2.85%
BenchmarkDecode 11 8 -27.27%
BenchmarkEncodeGray 64 56 -12.50%
BenchmarkEncodeNRGBOpaque 64 56 -12.50%
BenchmarkEncodeNRGBA 67 58 -13.43%
BenchmarkEncodePaletted 68 60 -11.76%
BenchmarkEncodeRGBOpaque 64 56 -12.50%
BenchmarkGoLookupIP 153 139 -9.15%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76%
BenchmarkClientServer 62 59 -4.84%
BenchmarkClientServerParallel4 62 59 -4.84%
BenchmarkClientServerParallel64 62 59 -4.84%
BenchmarkClientServerParallelTLS4 79 76 -3.80%
BenchmarkClientServerParallelTLS64 112 109 -2.68%
BenchmarkCreateGoroutinesCapture 10 6 -40.00%
BenchmarkAfterFunc 1006 1005 -0.10%
Fixes #6632.
Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca
Reviewed-on: https://go-review.googlesource.com/3166
Reviewed-by: Russ Cox <rsc@golang.org>
Set the minimum heap size to 4 Mbytes except when the hash
table code wants to force a GC. In an unrelated change, when a
mutator is asked to assist the GC by marking pointer workbufs,
it will keep working until the requested number of pointers
is processed, even if that means asking for additional workbufs.
Change-Id: I661cfc0a7f2efcf6286b5d37d73e593d9ecd04d5
Reviewed-on: https://go-review.googlesource.com/3392
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
If the result of string(i) does not escape,
allocate a [4]byte temp on the stack for it
(4 bytes is enough to hold any UTF-8 encoded rune).
Change-Id: If31ce9447982929d5b3b963fd0830efae4247c37
Reviewed-on: https://go-review.googlesource.com/3411
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we always allocate string buffers in the heap.
For example, in the following code we allocate a temp string
just for comparison:
if string(byteSlice) == "abc" { ... }
This change extends escape analysis to cover []byte->string
conversions and string concatenation. If the result of the operation
does not escape, the compiler allocates a small buffer
on the stack and passes it to slicebytetostring and concatstrings.
The runtime then uses the buffer if the result fits into it.
The size of the buffer is 32 bytes. There is no fundamental theory
behind this number, just an observation that on std lib
tests/benchmarks the frequency of string allocation is inversely
proportional to string length, and there is a significant number
of allocations up to length 32.
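Roughly how the two halves fit together (a sketch; unsafe.String is
used here as a stand-in for building the string header in place, and
the signature is illustrative):

const tmpStringBufSize = 32

type tmpBuf [tmpStringBufSize]byte

// The compiler passes a non-nil buf only when escape analysis has
// proved the result does not outlive the statement.
func slicebytetostring(buf *tmpBuf, b []byte) string {
	if buf != nil && len(b) <= len(buf) {
		n := copy(buf[:], b)
		return unsafe.String(&buf[0], n) // header points at the stack buffer
	}
	return string(b) // escapes or too large: heap allocation as before
}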
benchmark old allocs new allocs delta
BenchmarkFprintfBytes 2 1 -50.00%
BenchmarkDecodeComplex128Slice 318 316 -0.63%
BenchmarkDecodeFloat64Slice 318 316 -0.63%
BenchmarkDecodeInt32Slice 318 316 -0.63%
BenchmarkDecodeStringSlice 2318 2316 -0.09%
BenchmarkStripTags 11 5 -54.55%
BenchmarkDecodeGray 111 102 -8.11%
BenchmarkDecodeNRGBAGradient 200 188 -6.00%
BenchmarkDecodeNRGBAOpaque 165 152 -7.88%
BenchmarkDecodePaletted 319 309 -3.13%
BenchmarkDecodeRGB 166 157 -5.42%
BenchmarkDecodeInterlacing 279 268 -3.94%
BenchmarkGoLookupIP 153 135 -11.76%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76%
BenchmarkClientServerParallel4 62 61 -1.61%
BenchmarkClientServerParallel64 62 61 -1.61%
BenchmarkClientServerParallelTLS4 79 78 -1.27%
BenchmarkClientServerParallelTLS64 112 111 -0.89%
benchmark old ns/op new ns/op delta
BenchmarkFprintfBytes 381 311 -18.37%
BenchmarkStripTags 2615 2351 -10.10%
BenchmarkDecodeNRGBAGradient 3715887 3635096 -2.17%
BenchmarkDecodeNRGBAOpaque 3047645 2928644 -3.90%
BenchmarkGoLookupIP 153 135 -11.76%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
Change-Id: I9ec01da816945c3329d7be3c7794b520418c3f99
Reviewed-on: https://go-review.googlesource.com/3120
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
During a concurrent GC, stacks are scanned in
an initial scan phase, informing the GC of all
pointers on the stack. The GC only needs to rescan
the stack if it has potentially changed, which can only
happen if the goroutine has run.
This CL tracks whether the goroutine has run
since it was last scanned and thus may have changed
its stack. If necessary, the stack is rescanned.
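In sketch form:

	// when the stack is scanned:
	gp.gcscanvalid = true
	// whenever the goroutine starts running again:
	gp.gcscanvalid = false
	// mark termination rescans only stacks with gcscanvalid == false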
Change-Id: I5fb1c4338d42e3f61ab56c9beb63b7b2da25f4f1
Reviewed-on: https://go-review.googlesource.com/3275
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we allocate a new string during []byte->string conversion
in string comparison expressions. String allocation is unnecessary in
this case, because the comparison does not memorize the strings for later use.
This change uses slicebytetostringtmp to construct a temp string directly
from the []byte buffer and passes it to runtime.eqstring.
Change-Id: If00f1faaee2076baa6f6724d245d5b5e0f59b563
Reviewed-on: https://go-review.googlesource.com/3410
Reviewed-by: Russ Cox <rsc@golang.org>
Coarse-grained test skips to fix bots.
Need to look closer at windows and nacl failures.
Change-Id: I767ef1707232918636b33f715459ee3c0349b45e
Reviewed-on: https://go-review.googlesource.com/3416
Reviewed-by: Aram Hăvărneanu <aram@mgk.ro>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Call frame allocations can account for a significant portion
of all allocations in a program, if the call is executed
in an inner loop (e.g. to process every line in a log).
On the other hand, the allocation is easy to remove
using sync.Pool since the allocation is strictly scoped.
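The pattern, in sketch form (illustrative names; reflect's real cache
is keyed by frame size, and "sync" is assumed imported):

type callFrame struct{ /* argument and result memory */ }

var framePool = sync.Pool{
	New: func() interface{} { return new(callFrame) },
}

func callWithPooledFrame() {
	f := framePool.Get().(*callFrame)
	// marshal arguments into f, make the call, copy results out
	framePool.Put(f) // the frame is strictly scoped, so recycling is safe
}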
benchmark old ns/op new ns/op delta
BenchmarkCall 634 338 -46.69%
BenchmarkCall-4 496 167 -66.33%
benchmark old allocs new allocs delta
BenchmarkCall 1 0 -100.00%
BenchmarkCall-4 1 0 -100.00%
Update #7818
Change-Id: Icf60cce0a9be82e6171f0c0bd80dee2393db54a7
Reviewed-on: https://go-review.googlesource.com/1954
Reviewed-by: Keith Randall <khr@golang.org>
The %61 hack was added when the runtime was in C.
Now the Go compiler does the optimization.
Change-Id: I79c3302ec4b931eaaaaffe75e7101c92bf287fc7
Reviewed-on: https://go-review.googlesource.com/3289
Reviewed-by: Keith Randall <khr@golang.org>
Consider the following code:
s := "(" + string(byteSlice) + ")"
Currently we allocate a new string during the []byte->string conversion,
and pass it to concatstrings. String allocation is unnecessary in
this case, because concatstrings does not memorize the strings for later use.
This change uses slicebytetostringtmp to construct a temp string directly
from the []byte buffer and passes it to concatstrings.
I've found a few such cases in the std lib:
s += string(msg[off:off+c]) + "."
buf.WriteString("Sec-WebSocket-Accept: " + string(c.accept) + "\r\n")
bw.WriteString("Sec-WebSocket-Key: " + string(nonce) + "\r\n")
err = xml.Unmarshal([]byte("<Top>"+string(data)+"</Top>"), &logStruct)
d.err = d.syntaxError("invalid XML name: " + string(b))
return m, ProtocolError("malformed MIME header line: " + string(kv))
But there are much more in our internal code base.
Change-Id: I42f401f317131237ddd0cb9786b0940213af16fb
Reviewed-on: https://go-review.googlesource.com/3163
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Half of the tests currently crash with GODEBUG=wbshadow.
_PageSize is set to 8192, so data can be extended outside
of the actually mapped region during rounding, which leads to a crash
during the initial copying to the shadow.
Use _PhysPageSize instead.
Change-Id: Iaa89992bd57f86dafa16b092b53fdc0606213acb
Reviewed-on: https://go-review.googlesource.com/3286
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we scan maps even if k/v do not contain pointers.
This is required because overflow buckets are hanging off the main table.
This change introduces a separate array that contains pointers to all
overflow buckets and keeps them alive. The buckets themselves are marked
as containing no pointers and are not scanned by the GC (if k/v do not
contain pointers).
This brings maps in line with slices and chans -- GC does not scan
their contents if elements do not contain pointers.
Currently scanning of a map[int]int with 2e8 entries (~8GB heap)
takes ~8 seconds. With this change scanning takes negligible time.
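In sketch form (illustrative; the real map header is more involved,
and "unsafe" is assumed imported):

type bmap struct{ /* bucket whose keys and values are pointer-free */ }

type hmap struct {
	buckets  unsafe.Pointer // bucket array: skipped by the GC when k/v are pointer-free
	overflow []*bmap        // keeps every overflow bucket reachable
}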
Update #9477.
Change-Id: Id8a04066a53d2f743474cad406afb9f30f00eaae
Reviewed-on: https://go-review.googlesource.com/3288
Reviewed-by: Keith Randall <khr@golang.org>
Adjust triggergc so that we trigger a GC when we have used 7/8
of the available heap memory. Do the first collection when we
exceed 4 Mbytes.
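For illustration: with a 4 Mbyte heap goal, the next collection is
triggered once roughly 3.5 Mbytes (7/8 of 4 Mbytes) are in use.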
Change-Id: I467b4335e16dc9cd1521d687fc1f99a51cc7e54b
Reviewed-on: https://go-review.googlesource.com/3149
Reviewed-by: Austin Clements <austin@google.com>
Adjust triggergc so that we trigger a GC when we have used 7/8
of the available memory.
Change-Id: I7ca02546d3084e6a04d60b09479e04a9a9837ae2
Reviewed-on: https://go-review.googlesource.com/3061
Reviewed-by: Russ Cox <rsc@golang.org>
Print out the object holding the reference to the object
that checkmark detects as not being properly marked.
Change-Id: Ieedbb6fddfaa65714504af9e7230bd9424cd0ae0
Reviewed-on: https://go-review.googlesource.com/2744
Reviewed-by: Austin Clements <austin@google.com>