Plan 9 provides a /dev/random device to return a
stream of random numbers. However, the method used
to generate random numbers on Plan 9 is slow and
reading from /dev/random may block.
We don't want our Go programs to be significantly
slowed down just to slightly improve the distribution
of hash values.
So, we do the same thing as NaCl and rely exclusively
on extendRandom to generate pseudo-random numbers.
Fixes#10028.
Change-Id: I7e11a9b109c22f23608eb09c406b7c3dba31f26a
Reviewed-on: https://go-review.googlesource.com/6386
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
issue #10017: TestGdbPython 'print mapvar' is reported to fail on ppc64.
issue #10002: TestGdbPython 'print mapvar' is reported to fail on arm hardfloat.
The testcase now uses plain line number in main. Unwinding issues are
unrelated to the GDB map prettyprinter feature.
Remove arch-specific t.Skip()s from those two issues.
Fixes#10017Fixes#10002
Change-Id: I9d50ffe2f3eb7bf65dd17c8c76a2677571de68ba
Reviewed-on: https://go-review.googlesource.com/6267
Reviewed-by: Minux Ma <minux@golang.org>
mv cmd/new5l cmd/5l and so on.
Minimal changes to cmd/dist and cmd/go to keep things building.
More can be deleted in followup CLs.
Change-Id: I1449eca7654ce2580d1f413a56dc4a75f3d4618b
Reviewed-on: https://go-review.googlesource.com/6361
Reviewed-by: Rob Pike <r@golang.org>
We used to not call traceback from goexit1.
But now tracer does it and crashes on amd64p32:
runtime: unexpected return pc for runtime.getg called from 0x108a4240
goroutine 18 [runnable, locked to thread]:
runtime.traceGoEnd()
src/runtime/trace.go:758 fp=0x10818fe0 sp=0x10818fdc
runtime.goexit1()
src/runtime/proc1.go:1540 +0x20 fp=0x10818fe8 sp=0x10818fe0
runtime.getg(0x0)
src/runtime/asm_386.s:2414 fp=0x10818fec sp=0x10818fe8
created by runtime/pprof_test.TestTraceStress
src/runtime/pprof/trace_test.go:123 +0x500
Return PC from goexit1 points right after goexit (+0x6).
It happens to work most of the time somehow.
This change fixes traceback from goexit1 by adding an additional NOP to goexit.
Fixes#9931
Change-Id: Ied25240a181b0a2d7bc98127b3ed9068e9a1a13e
Reviewed-on: https://go-review.googlesource.com/5460
Reviewed-by: Russ Cox <rsc@golang.org>
This is to be used by an lldb script inside go_darwin_arm_exec to pause
the execution of tests on iOS so the working directory can be adjusted
into something resembling a GOROOT.
Change-Id: I69ea2d4d871800ae56634b23ffa48583559ddbc6
Reviewed-on: https://go-review.googlesource.com/6363
Reviewed-by: Minux Ma <minux@golang.org>
Change-Id: I9b08b74214e5a41a7e98866a993b038030a4c073
Reviewed-on: https://go-review.googlesource.com/6251
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Previously, the typeDead check in greyobject was under a separate
!useCheckmark conditional. Put it with the rest of the !useCheckmark
code. Also move a comment about atomic update of the marked bit to
where we actually do that update now.
Change-Id: Ief5f16401a25739ad57d959607b8d81ffe0bc211
Reviewed-on: https://go-review.googlesource.com/6271
Reviewed-by: Rick Hudson <rlh@golang.org>
Change-Id: I1bb0b8b11e8c7686b85657050fd7cf926afe4d29
Reviewed-on: https://go-review.googlesource.com/6200
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Previously, the memory allocator on Plan 9 did
not free memory properly. It was only able to
free the last allocated block.
This change implements a variant of the
Kernighan & Ritchie memory allocator with
coalescing and splitting.
The most notable differences are:
- no header is prefixing the allocated blocks, since
the size is always specified when calling sysFree,
- the free list is nil-terminated instead of circular.
Fixes#9736.
Fixes#9803.
Fixes#9952.
Change-Id: I00d533714e4144a0012f69820d31cbb0253031a3
Reviewed-on: https://go-review.googlesource.com/5524
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Disable the test properly on nacl systems, tested on nacl/amd64p32.
Change-Id: Iffe210be4f9c426bfc47f2dd3a8f0c6b5a398cc3
Reviewed-on: https://go-review.googlesource.com/6093
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Update #9993
If the physical page size of the machine is larger than the logical
heap size, for example 8k logical, 64k physical, then madvise(2) will
round up the requested amount to a 64k boundary and may discard pages
close to the page being madvised.
This patch disables the scavenger in these situations, which at the moment
is only ppc64 and ppc64le systems. NaCl also uses a 64k page size, but
it's not clear if it is affected by this problem.
Change-Id: Ib897f8d3df5bd915ddc0b510f2fd90a30ef329ca
Reviewed-on: https://go-review.googlesource.com/6091
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Needs the Go tool, which we do not have on iOS. (No Fork.)
Change-Id: Iedf69f5ca81d66515647746546c9b304c8ec10c4
Reviewed-on: https://go-review.googlesource.com/6102
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
There is no sense in trying to netpoll while there is
already a thread blocked in netpoll. And in most cases
there must be a thread blocked in netpoll, because
the first otherwise idle thread does blocking netpoll.
On some program I see that netpoll called from findrunnable
consumes 3% of time.
Change-Id: I0af1a73d637bffd9770ea50cb9278839716e8816
Reviewed-on: https://go-review.googlesource.com/4553
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
This makes Go's CPU profiling code somewhat more idiomatic; e.g.,
using := instead of forward declaring variables, using "int" for
element counts instead of "uintptr", and slices instead of C-style
pointer+length. This makes the code easier to read and eliminates a
lot of type conversion clutter.
Additionally, in sigprof we can collect just maxCPUProfStack stack
frames, as cpuprof won't use more than that anyway.
Change-Id: I0235b5ae552191bcbb453b14add6d8c01381bd06
Reviewed-on: https://go-review.googlesource.com/6072
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
The first call is pointless. It appears to simply be a mistake.
benchmark old ns/op new ns/op delta
BenchmarkComplexAlgMap 90.7 76.1 -16.10%
Change-Id: Id0194c9f09cea8b68f17b2ac751a8e3240e47f19
Reviewed-on: https://go-review.googlesource.com/5284
Reviewed-by: Keith Randall <khr@golang.org>
Gives tests a way to find the bundle that contains their testdata, and
is generally useful for finding resources.
Change-Id: Idfa03e8543af927c17bc8ec8aadc5014ec82df28
Reviewed-on: https://go-review.googlesource.com/6000
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Updates #10002
The gdb test added in 1c82e236f5 is failing on most arm systems.
Temporarily disable this test so that we can return to a working arm build.
Change-Id: Iff96ea8d5a99e1ceacf4979e864ff196e5503535
Reviewed-on: https://go-review.googlesource.com/5902
Reviewed-by: Ian Lance Taylor <iant@golang.org>
We return memory to the kernel with madvise(..., DONTNEED).
Also mark returned memory with NOHUGEPAGE to keep the kernel from
merging this memory into a huge page, effectively reallocating it.
Only known to be a problem on linux/{386,amd64,amd64p32} at the moment.
It may come up on other os/arch combinations in the future.
Fixes#8832
Change-Id: Ifffc6627a0296926e3f189a8a9b6e4bdb54c79eb
Reviewed-on: https://go-review.googlesource.com/5660
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
We need to distinguish pointers to free spans, which indicate bugs in
our pointer analysis, from pointers to never-in-the-heap spans, which
can legitimately arise from sysAlloc/mmap/etc. This normally isn't a
problem because the heap is contiguous, but in some situations (32
bit, particularly) the heap must grow around an already allocated
region.
The bad pointer test is disabled so this fix doesn't actually do
anything, but it removes one barrier from reenabling it.
Fixes#9872.
Change-Id: I0a92db4d43b642c58d2b40af69c906a8d9777f88
Reviewed-on: https://go-review.googlesource.com/5780
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Available darwin/arm devices sporadically have trouble mapping 256M.
I would really appreciate it if anyone could check my working on
this, and make sure sure there aren't obviously bad consequences I
haven't considered.
Change-Id: Id1a8edae104d974fcf5f9333274f958625467f79
Reviewed-on: https://go-review.googlesource.com/5752
Reviewed-by: Keith Randall <khr@golang.org>
Since allglock is held in this function, there's no point to
tip-toeing around allgs. Just use a for-range loop.
Change-Id: I1ee61c7e8cac8b8ebc8107c0c22f739db5db9840
Reviewed-on: https://go-review.googlesource.com/5882
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Previously, we had three loops in the garbage collector that all
cleared the per-G GC flags. Consolidate these into one function.
This one function is designed to work in a concurrent setting. As a
result, it's slightly more expensive than the loops it replaces during
STW phases, but these happen at most twice per GC.
Change-Id: Id1ec0074fd58865eb0112b8a0547b267802d0df1
Reviewed-on: https://go-review.googlesource.com/5881
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The loop in gcMark is redundant with the gcworkdone resetting
performed by markroot, which called a few lines later in gcMark.
Change-Id: Ie0a826a614ecfa79e6e6b866e8d1de40ba515856
Reviewed-on: https://go-review.googlesource.com/5880
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Package runtime's Go code was converted to directly call getcallerpc
and getcallersp in https://golang.org/cl/138740043, but the assembly
implementations were not removed.
Change-Id: Ib2eaee674d594cbbe799925aae648af782a01c83
Reviewed-on: https://go-review.googlesource.com/5901
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
NetBSD's semaphore implementation is derived from OpenBSD's, but has
subsequently diverged due to cleanups that were only applied to the
latter (https://golang.org/cl/137960043, https://golang.org/cl/5563).
This CL applies analogous cleanups for NetBSD.
Notably, we can also remove the scary NetBSD deadlock warning.
NetBSD's manual pages document that lwp_unpark on a not-yet-parked LWP
will cause that LWP's next lwp_park system call to return immediately,
so there's no race hazard.
Change-Id: Ib06844c420d2496ac289748eba13eb4700bbbbb2
Reviewed-on: https://go-review.googlesource.com/5564
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Joel Sing <jsing@google.com>
(gdb) p x
Python Exception <class 'gdb.error'> There is no member named b.:
$2 = map[string]string
->
(gdb) p x
$1 = map[string]string = {["shane"] = "hansen"}
Change-Id: I874d02a029f2ac9afc5ab666afb65760ec2c3177
Reviewed-on: https://go-review.googlesource.com/5522
Reviewed-by: Ian Lance Taylor <iant@golang.org>
OpenBSD's thrsleep system call includes an "abort" parameter, which
specifies a memory address to be tested after being registered on the
sleep channel (i.e., capable of being woken up by thrwakeup). By
passing a pointer to waitsemacount for this parameter, we avoid race
conditions without needing a lock. Instead we just need to use
atomicload, cas, and xadd to mutate the semaphore count.
Change-Id: If9f2ab7cfd682da217f9912783cadea7e72283a8
Reviewed-on: https://go-review.googlesource.com/5563
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Joel Sing <jsing@google.com>
When GODEBUG=gctrace=2 two gcs are preformed. During the first gc
the stack scan sets the g's gcscanvalid and gcworkdone flags to true
indicating that the stacks have to be scanned and do not need to
be rescanned. These need to be reset to false for the second GC so the
stacks are rescanned, otherwise if the only pointer to an object is
on the stack it will not be discovered and the object will be freed.
Typically this will include the object that was just allocated in
the mallocgc call that initiated the GC.
Change-Id: Ic25163f4689905fd810c90abfca777324005c02f
Reviewed-on: https://go-review.googlesource.com/5861
Reviewed-by: Russ Cox <rsc@golang.org>
Currently sync.Mutex is fully cooperative. That is, once contention is discovered,
the goroutine calls into scheduler. This is suboptimal as the resource can become
free soon after (especially if critical sections are short). Server software
usually runs at ~~50% CPU utilization, that is, switching to other goroutines
is not necessary profitable.
This change adds limited active spinning to sync.Mutex if:
1. running on a multicore machine and
2. GOMAXPROCS>1 and
3. there is at least one other running P and
4. local runq is empty.
As opposed to runtime mutex we don't do passive spinning,
because there can be work on global runq on on other Ps.
benchmark old ns/op new ns/op delta
BenchmarkMutexNoSpin 1271 1272 +0.08%
BenchmarkMutexNoSpin-2 702 683 -2.71%
BenchmarkMutexNoSpin-4 377 372 -1.33%
BenchmarkMutexNoSpin-8 197 190 -3.55%
BenchmarkMutexNoSpin-16 131 122 -6.87%
BenchmarkMutexNoSpin-32 170 164 -3.53%
BenchmarkMutexSpin 4724 4728 +0.08%
BenchmarkMutexSpin-2 2501 2491 -0.40%
BenchmarkMutexSpin-4 1330 1325 -0.38%
BenchmarkMutexSpin-8 684 684 +0.00%
BenchmarkMutexSpin-16 414 372 -10.14%
BenchmarkMutexSpin-32 559 469 -16.10%
BenchmarkMutex 19.1 19.1 +0.00%
BenchmarkMutex-2 81.6 54.3 -33.46%
BenchmarkMutex-4 143 100 -30.07%
BenchmarkMutex-8 154 156 +1.30%
BenchmarkMutex-16 140 159 +13.57%
BenchmarkMutex-32 141 163 +15.60%
BenchmarkMutexSlack 33.3 31.2 -6.31%
BenchmarkMutexSlack-2 122 97.7 -19.92%
BenchmarkMutexSlack-4 168 158 -5.95%
BenchmarkMutexSlack-8 152 158 +3.95%
BenchmarkMutexSlack-16 140 159 +13.57%
BenchmarkMutexSlack-32 146 162 +10.96%
BenchmarkMutexWork 154 154 +0.00%
BenchmarkMutexWork-2 89.2 89.9 +0.78%
BenchmarkMutexWork-4 139 86.1 -38.06%
BenchmarkMutexWork-8 177 162 -8.47%
BenchmarkMutexWork-16 170 173 +1.76%
BenchmarkMutexWork-32 176 176 +0.00%
BenchmarkMutexWorkSlack 160 160 +0.00%
BenchmarkMutexWorkSlack-2 103 99.1 -3.79%
BenchmarkMutexWorkSlack-4 155 148 -4.52%
BenchmarkMutexWorkSlack-8 176 170 -3.41%
BenchmarkMutexWorkSlack-16 170 173 +1.76%
BenchmarkMutexWorkSlack-32 175 176 +0.57%
"No work" benchmarks are not very interesting (BenchmarkMutex and
BenchmarkMutexSlack), as they are absolutely not realistic.
Fixes#8889
Change-Id: I6f14f42af1fa48f73a776fdd11f0af6dd2bb428b
Reviewed-on: https://go-review.googlesource.com/5430
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
This change deletes the C implementations of
the Go compiler and assembler from the master branch.
The Go implementations are a bit slower right now,
due mainly to garbage generated by taking addresses
of stack variables all over the place (it was C code,
after all). That will be cleaned up (mechanically) over the
next week or so, and things will get faster.
Change-Id: I66b2b3477aec8835f9960d0798f5752dcd98d08f
The slow path of heapBitsForObjects somewhat subtly assumes that the
pointer will not point to the first word of the object and will round
the pointer wrong if this assumption is violated. This assumption is
safe because the fast path should always take care of this case, but
there's no benefit to making this assumption, it makes the code more
difficult to experiment with than necessary, and it's trivial to
eliminate.
Change-Id: Iedd336f7d529a27d3abeb83e77dfb32a285ea73a
Reviewed-on: https://go-review.googlesource.com/5636
Reviewed-by: Russ Cox <rsc@golang.org>
The routine mallocgc retrieves objects from freelists. Prefetch
the object that will be returned in the next call to mallocgc.
Experiments indicate that this produces a 1% improvement when using
prefetchnta and less when using prefetcht0, prefetcht1, or prefetcht2.
Benchmark numbers indicate a 1% improvement over no
prefetch, much less over prefetcht0, prefetcht1, and prefetcht2.
These numbers were for the garbage benchmark with MAXPROCS=4
no prefetch >> 5.96 / 5.77 / 5.89
prefetcht0(uintptr(v.ptr().next)) >> 5.88 / 6.17 / 5.84
prefetcht1(uintptr(v.ptr().next)) >> 5.88 / 5.89 / 5.91
prefetcht2(uintptr(v.ptr().next)) >> 5.87 / 6.47 / 5.92
prefetchnta(uintptr(v.ptr().next)) >> 5.72 / 5.84 / 5.85
Change-Id: I54e07172081cccb097d5b5ce8789d74daa055ed9
Reviewed-on: https://go-review.googlesource.com/5350
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Makes them compatible with the new asm.
Applied mechanically from vet diagnostics.
Manual edits: the names for arguments in time·now(SB) in runtime/sys_*_arm.s.
Change-Id: Ib295390d9509d306afc67714e3f50dc832256625
Reviewed-on: https://go-review.googlesource.com/5576
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
With a trivial Golang-built program loaded in gdb-7.8.90.20150214-7.fc23.x86_64
I get this error:
(gdb) source ./src/runtime/runtime-gdb.py
Loading Go Runtime support.
Traceback (most recent call last):
File "./src/runtime/runtime-gdb.py", line 230, in <module>
_rctp_type = gdb.lookup_type("struct reflect.rtype").pointer()
gdb.error: No struct type named reflect.rtype.
(gdb) q
No matter if this struct should or should not be in every Golang-built binary
this change should fix that with no disadvantages.
Change-Id: I0c490d3c9bbe93c65a2183b41bfbdc0c0f405bd1
Reviewed-on: https://go-review.googlesource.com/5521
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trace command allows to visualize and analyze traces.
Run as:
$ go tool trace binary trace.file
The commands opens web browser with the main page,
which contains links for trace visualization,
blocking profiler, network IO profiler and per-goroutine
traces.
Also move trace parser from runtime/pprof/trace_parser_test.go
to internal/trace/parser.go, so that it can be shared between
tests and the command.
Change-Id: Ic97ed59ad6e4c7e1dc9eca5e979701a2b4aed7cf
Reviewed-on: https://go-review.googlesource.com/3601
Reviewed-by: Andrew Gerrand <adg@golang.org>
Restores stack traces in the android/arm builder.
Change-Id: If637aa2ed6f8886126b77cf9cc8a0535ec7c4369
Reviewed-on: https://go-review.googlesource.com/5453
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
In most cases we pass return PC to race detector,
and race runtime subtracts one from them.
However, in manual instrumentation in runtime
we pass function start PC to race runtime.
Race runtime can't distinguish these cases
and so it does not subtract one from top PC.
This leads to bogus line numbers in some cases.
Make it consistent and always pass what looks
like a return PC, so that race runtime can
subtract one and still get PC in the same function.
Also delete two unused functions.
Update #8053
Change-Id: I4242dec5e055e460c9a8990eaca1d085ae240ed2
Reviewed-on: https://go-review.googlesource.com/4902
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This is a nice split but more importantly it provides a better
way to fit the checkmark phase into the sequencing.
Also factor out common span copying into gcSpanCopy.
Change-Id: Ia058644974e4ed4ac3cf4b017a3446eb2284d053
Reviewed-on: https://go-review.googlesource.com/5333
Reviewed-by: Austin Clements <austin@google.com>
The loop made more sense when gc_m was not its own function.
Change-Id: I71a7f21d777e69c1924e3b534c507476daa4dfdd
Reviewed-on: https://go-review.googlesource.com/5332
Reviewed-by: Austin Clements <austin@google.com>
See the following issue for context:
https://github.com/golang/go/issues/9729#issuecomment-74648287
In short, RDTSC can produce skewed results without preceding LFENCE/MFENCE.
Information on this matter is very scrappy in the internet.
But this is what linux kernel does (see rdtsc_barrier).
It also fixes the test program on my machine.
Update #9729
Change-Id: I3c1ffbf129fdfdd388bd5b7911b392b319248e68
Reviewed-on: https://go-review.googlesource.com/5033
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Fix many incorrect FP references and a few other details.
Some errors remain, especially in vlop, but fixing them requires semantics. For another day.
Change-Id: Ib769fb519b465e79fc08d004a51acc5644e8b259
Reviewed-on: https://go-review.googlesource.com/5288
Reviewed-by: Russ Cox <rsc@golang.org>
That is, I accidentally dropped this change of Austin's
when preparing my CL. I blame Git.
Change-Id: I9dd772c84edefad96c4b16785fdd2dea04a4a0d6
Reviewed-on: https://go-review.googlesource.com/5320
Reviewed-by: Austin Clements <austin@google.com>
Move code from malloc1.go, malloc2.go, mem.go, mgc0.go into
appropriate locations.
Factor mgc.go into mgc.go, mgcmark.go, mgcsweep.go, mstats.go.
A lot of this code was in certain files because the right place was in
a C file but it was written in Go, or vice versa. This is one step toward
making things actually well-organized again.
Change-Id: I6741deb88a7cfb1c17ffe0bcca3989e10207968f
Reviewed-on: https://go-review.googlesource.com/5300
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Until recently, struct workbuf had only lfnode and uintptr fields
before the obj array to make it convenient to compute the size of the
obj array. It slowly grew more fields until this became inconvenient
enough that it was restructured to make the size computation easy.
Now the size computation doesn't care what the field types are, so
switch to more natural types.
Change-Id: I966140ba7ebb4aeb41d5c66d9d2a3bdc17dd4bcf
Reviewed-on: https://go-review.googlesource.com/5262
Reviewed-by: Russ Cox <rsc@golang.org>
This converts the garbage collector from directly manipulating work
buffers to using the new gcWork abstraction.
The previous management of work buffers was rather ad hoc. As a
result, switching to the gcWork abstraction changes many details of
work buffer management.
If greyobject fills a work buffer, it can now pull from work.partial
in addition to work.empty.
Previously, gcDrain started with a partial or empty work buffer and
fetched an empty work buffer if it filled its current buffer (in
greyobject). Now, gcDrain starts with a full work buffer and fetches
an partial or empty work buffer if it fills its current buffer (in
greyobject). The original behavior was bad because gcDrain would
immediately drop the empty work buffer returned by greyobject and
fetch a full work buffer, which greyobject was likely to immediately
overflow, fetching another empty work buffer, etc. The new behavior
isn't great at the start because greyobject is likely to immediately
overflow the full buffer, but the steady-state behavior should be more
stable. Both before and after this change, gcDrain fetches a full
work buffer if it drains its current buffer. Basically all of these
choices are bad; the right answer is to use a dual work buffer scheme.
Previously, shade always fetched a work buffer (though usually from
m.currentwbuf), even if the object was already marked. Now it only
fetches a work buffer if it actually greys an object.
Change-Id: I8b880ed660eb63135236fa5d5678f0c1c041881f
Reviewed-on: https://go-review.googlesource.com/5232
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This introduces a producer/consumer abstraction for GC work pointers
that internally handles the details of filling, draining, and
shuffling work buffers.
In addition to simplifying the GC code, this should make it easy for
us to change how we use work buffers, including cleaning up how we use
the work.partial queue, reintroducing a FIFO lookahead cache, adding
prefetching, and using dual buffers to avoid flapping.
This commit doesn't change any existing code. The following commit
will switch the garbage collector from explicit workbuf manipulation
to gcWork.
Change-Id: Ifbfe5fff45bf0362d6d7c3cecb061f0c9874077d
Reviewed-on: https://go-review.googlesource.com/5231
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
References to FP must now have a symbol.
Change-Id: I3f06b99cc48cbd4ccd6f23f2e4b0830af40f7f3d
Reviewed-on: https://go-review.googlesource.com/5281
Reviewed-by: Russ Cox <rsc@golang.org>
Nit. There's no reason to take a uintptr and doing so just requires
casts in annoying places.
Change-Id: Ifeb9638c6d94eae619c490930cf724cc315680ba
Reviewed-on: https://go-review.googlesource.com/5230
Reviewed-by: Russ Cox <rsc@golang.org>
Require a name to be specified when referencing the pseudo-stack.
If you want a real stack offset, use the hardware stack pointer (e.g.,
R13 on arm), not SP.
Fix affected assembly files.
Change-Id: If3545f187a43cdda4acc892000038ec25901132a
Reviewed-on: https://go-review.googlesource.com/5120
Run-TryBot: Rob Pike <r@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Apparently when ARM stops at a GDB breakpoint, it appears to be in
syscall.Syscall. The "info goroutines" test expected it to be in a
runtime function. Since this isn't fundamental to the test, simply
tweak the test's regexp to make sure "info goroutines" prints some
running goroutine with an active M, but don't require it to be in any
particular function.
Change-Id: Iba2618b46d3dc49cef62ffb72484b83ea7b0317d
Reviewed-on: https://go-review.googlesource.com/5060
Reviewed-by: Dave Cheney <dave@cheney.net>
All of the other memory-related source files start with "m". Keep up
the tradition.
Change-Id: Idd88fdbf2a1453374fa12109b949b1c4d149a4f8
Reviewed-on: https://go-review.googlesource.com/4853
Reviewed-by: Minux Ma <minux@golang.org>
Rather than reaching in to slices directly in the slice pretty
printer, use the newly introduced SliceValue wrapper.
Change-Id: Ibb25f8c618c2ffb3fe1a8dd044bb9a6a085df5b7
Reviewed-on: https://go-review.googlesource.com/4936
Reviewed-by: Minux Ma <minux@golang.org>
"info goroutines" is failing because it hasn't kept up with changes in
the 1.5 runtime. This fixes three issues preventing "info goroutines"
from working. allg is no longer a linked list, so switch to using the
allgs slice. The g struct's 'status' field is now called
'atomicstatus', so rename uses of 'status'. Finally, this was trying
to parse str(pc) as an int, but str(pc) can return symbolic
information after the raw hex value; fix this by stripping everything
after the first space.
This also adds a test for "info goroutines" to runtime-gdb_test, which
was previously quite skeletal.
Change-Id: I8ad83ee8640891cdd88ecd28dad31ed9b5833b7a
Reviewed-on: https://go-review.googlesource.com/4935
Reviewed-by: Minux Ma <minux@golang.org>
R15 is the real register. PC is a pseudo-register that we are making
illegal in this context as part of the grand assembly unification.
Change-Id: Ie0ea38ce7ef4d2cf4fcbe23b851a570fd312ce8d
Reviewed-on: https://go-review.googlesource.com/4966
Reviewed-by: Minux Ma <minux@golang.org>
There is currently no way to ignore signals using the os/signal package.
It is possible to catch a signal and do nothing but this is not the same
as ignoring it. The new function Ignore allows a set of signals to be
ignored. The new function Reset allows the initial handlers for a set of
signals to be restored.
Fixes#5572
Change-Id: I5c0f07956971e3a9ff9b9d9631e6e3a08c20df15
Reviewed-on: https://go-review.googlesource.com/3580
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Change 85e7bee introduced a bug:
it marks map buckets as noscan when key and val do not contain pointers.
However, buckets with large/outline key or val do contain pointers.
This change takes key/val size into consideration when
marking buckets as noscan.
Change-Id: I7172a0df482657be39faa59e2579dd9f209cb54d
Reviewed-on: https://go-review.googlesource.com/4901
Reviewed-by: Keith Randall <khr@golang.org>
Several .s files for ARM had several properties the new assembler will not support.
These include:
- mentioning SP or PC as a hardware register
These are always pseudo-registers except that in some contexts
they're not, and it's confusing because the context should not affect
which register you mean. Change the references to the hardware
registers to be explicit: R13 for SP, R15 for PC.
- constant creation using assignment
The files say a=b when they could instead say #define a b.
There is no reason to have both mechanisms.
- R(0) to refer to R0.
Some macros use this to a great extent. Again, it's easy just to
use a #define to rename a register.
Change-Id: I002335ace8e876c5b63c71c2560533eb835346d2
Reviewed-on: https://go-review.googlesource.com/4822
Reviewed-by: Dave Cheney <dave@cheney.net>
MOVQ RARG0, 0(SP) smashes exactly what was saved by PUSHQ R15.
This code managed to work somehow with the current race runtime,
but corrupts caller arguments with new race runtime that I am testing.
Change-Id: I9ffe8b5eee86451db36e99dbf4d11f320192e576
Reviewed-on: https://go-review.googlesource.com/4810
Reviewed-by: Keith Randall <khr@golang.org>
New race runtime is more scrupulous about env flags format.
Change-Id: I2828bc737a8be3feae5288ccf034c52883f224d8
Reviewed-on: https://go-review.googlesource.com/4811
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
drainworkbuf is now gcDrain, since it drains until there's
nothing left to drain. drainobjects is now gcDrainN because it's
the bounded equivalent to gcDrain.
The new names use the Go camel case convention because we have to
start somewhere. The "gc" prefix is because we don't have runtime
packages yet and just "drain" is too ambiguous.
Change-Id: I88dbdf32e8ce4ce6c3b7e1f234664be9b76cb8fd
Reviewed-on: https://go-review.googlesource.com/4785
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
All calls to drainworkbuf now pass true for this argument, so remove
the argument and update the documentation to reflect the simplified
interface.
At a higher level, there are no longer any situations where we drain
"one wbuf" (though drainworkbuf didn't guarantee this anyway). We
either drain everything, or we drain a specific number of objects.
Change-Id: Ib7ee0fde56577eff64232ee1e711ec57c4361335
Reviewed-on: https://go-review.googlesource.com/4784
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
scanblock is only called during _GCscan and _GCmarktermination.
During _GCscan, scanblock didn't call drainworkbufs anyway. During
_GCmarktermination, there's really no point in draining some (largely
arbitrary) amount of work during the scanblock, since the GC is about
to drain everything anyway, so simply eliminate this case.
Change-Id: I7f3c59ce9186a83037c6f9e9b143181acd04c597
Reviewed-on: https://go-review.googlesource.com/4783
Reviewed-by: Russ Cox <rsc@golang.org>
We no longer ever call scanblock with b == 0.
Change-Id: I9b01da39595e0cc251668c24d58748d88f5f0792
Reviewed-on: https://go-review.googlesource.com/4782
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
scanblock(0, 0, nil, nil) was just a confusing way of saying
wbuf = getpartialorempty()
drainworkbuf(wbuf, true)
Make drainworkbuf accept a nil workbuf and perform the
getpartialorempty itself and replace all uses of scanblock(0, 0, nil,
nil) with direct calls to drainworkbuf(nil, true).
Change-Id: I7002a2f8f3eaf6aa85bbf17ccc81d7288acfef1c
Reviewed-on: https://go-review.googlesource.com/4781
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Previously, scanblock called checknocurrentwbuf() after
drainworkbuf(). Move this call into drainworkbuf so that every return
path from drainworkbuf calls checknocurrentwbuf(). This is equivalent
to the previous code because scanblock was the only caller of
drainworkbuf.
Change-Id: I96ef2168c8aa169bfc4d368f296342fa0fbeafb4
Reviewed-on: https://go-review.googlesource.com/4780
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently we always create context objects for closures that capture variables.
However, it is completely unnecessary for direct calls of closures
(whether it is func()(), defer func()() or go func()()).
This change transforms any OCALLFUNC(OCLOSURE) to normal function call.
Closed variables become function arguments.
This transformation is especially beneficial for go func(),
because we do not need to allocate context object on heap.
But it makes direct closure calls a bit faster as well (see BenchmarkClosureCall).
On implementation level it required to introduce yet another compiler pass.
However, the pass iterates only over xtop, so it should not be an issue.
Transformation consists of two parts: closure transformation and call site
transformation. We can't run these parts on different sides of escape analysis,
because tree state is inconsistent. We can do both parts during typecheck,
we don't know how to capture variables and don't have call site.
We can't do both parts during walk of OCALLFUNC, because we can walk
OCLOSURE body earlier.
So now capturevars pass only decides how to capture variables
(this info is required for escape analysis). New transformclosure
pass, that runs just before order/walk, does all transformations
of a closure. And later walk of OCALLFUNC(OCLOSURE) transforms call site.
benchmark old ns/op new ns/op delta
BenchmarkClosureCall 4.89 3.09 -36.81%
BenchmarkCreateGoroutinesCapture 1634 1294 -20.81%
benchmark old allocs new allocs delta
BenchmarkCreateGoroutinesCapture 6 2 -66.67%
benchmark old bytes new bytes delta
BenchmarkCreateGoroutinesCapture 176 48 -72.73%
Change-Id: Ic85e1706e18c3235cc45b3c0c031a9c1cdb7a40e
Reviewed-on: https://go-review.googlesource.com/4050
Reviewed-by: Russ Cox <rsc@golang.org>
Consider an interface value i of type I and concrete value c of type C.
Prior to this CL, i==c was evaluated as
I(c) == i
Evaluating I(c) can allocate.
This CL changes the evaluation of i==c to
x, ok := i.(C); ok && x == c
The new generated code is shorter and does not allocate directly.
If C is small, as it is in every instance in the stdlib,
the new code also uses less stack space
and makes one runtime call instead of two.
If C is very large, the original implementation is used.
The cutoff for "very large" is 1<<16,
following the stack vs heap cutoff used elsewhere.
This kind of comparison occurs in 38 places in the stdlib,
mostly in the net and os packages.
benchmark old ns/op new ns/op delta
BenchmarkEqEfaceConcrete 29.5 7.92 -73.15%
BenchmarkEqIfaceConcrete 32.1 7.90 -75.39%
BenchmarkNeEfaceConcrete 29.9 7.90 -73.58%
BenchmarkNeIfaceConcrete 35.9 7.90 -77.99%
Fixes#9370.
Change-Id: I7c4555950bcd6406ee5c613be1f2128da2c9a2b7
Reviewed-on: https://go-review.googlesource.com/2096
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
No code modifications.
This is in preparation for improving the wbuf abstraction.
Change-Id: I719543a345c34d079b7e39b251eccd5dd8a07826
Reviewed-on: https://go-review.googlesource.com/4710
Reviewed-by: Rick Hudson <rlh@golang.org>
Plan 9's sysFree has an optimization where if the object being freed
is the last object allocated, it will roll back the brk to allow the
memory to be reused by sysAlloc. However, it does not zero this
"returned" memory, so as a result, sysAlloc can return non-zeroed
memory after a sysFree. This leads to corruption because the runtime
assumes sysAlloc returns zeroed memory.
Fix this by zeroing the memory returned by sysFree.
Fixes#9846.
Change-Id: Id328c58236eb7c464b31ac1da376a0b757a5dc6a
Reviewed-on: https://go-review.googlesource.com/4700
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: David du Colombier <0intro@gmail.com>
typedslicecopy is another write barrier that is not
understood by racewalk. It seems quite complex to handle it
in the compiler, so instead just instrument it in runtime.
Update #9796
Change-Id: I0eb6abf3a2cd2491a338fab5f7da22f01bf7e89b
Reviewed-on: https://go-review.googlesource.com/4370
Reviewed-by: Russ Cox <rsc@golang.org>
Support the following conversions in escape analysis:
[]rune("foo")
[]byte("foo")
string([]rune{})
If the result does not escape, allocate temp buffer on stack
and pass it to runtime functions.
Change-Id: I1d075907eab8b0109ad7ad1878104b02b3d5c690
Reviewed-on: https://go-review.googlesource.com/3590
Reviewed-by: Russ Cox <rsc@golang.org>
Add local workbufs to the m struct in order to reduce contention.
Add consistency checks for workbuf ownership.
Chain workbufs through call change to avoid swapping them
to and from the m struct.
Adjust the size of the workbuf so that the mutators can
more frequently pass modifications to the GC thus shifting
some work from the STW mark termination phase to the concurrent
mark phase.
Change-Id: I557b53af34ad9972265e0ed9f5996e52d548563d
Reviewed-on: https://go-review.googlesource.com/3972
Reviewed-by: Austin Clements <austin@google.com>
Fixes#9791
g.issystem flag setup races with other code wherever we set it.
Even if we set both in parent goroutine and in the system goroutine,
it is still possible that some other goroutine crashes
before the flag is set. We could pass issystem flag to newproc1,
but we start all goroutines with go nowadays.
Instead look at g.startpc to distinguish system goroutines (similar to topofstack).
Change-Id: Ia3467968dee27fa07d9fecedd4c2b00928f26645
Reviewed-on: https://go-review.googlesource.com/4113
Reviewed-by: Keith Randall <khr@golang.org>
Update #8832
This is probably not the root cause of the issue.
Resolve TODO about setting unusedsince on a wrong span.
Change-Id: I69c87e3d93cb025e3e6fa80a8cffba6ad6ad1395
Reviewed-on: https://go-review.googlesource.com/4390
Reviewed-by: Keith Randall <khr@golang.org>
Container symbols shouldn't be considered as functions in the functab.
Having them present probably messes up function lookup, as you might get
the descriptor of the container instead of the descriptor of the actual
function on the stack. It also messed up the findfunctab because these
entries caused off-by-one errors in how functab entries were counted.
Normal code is not affected - it only changes (& hopefully fixes) the
behavior for libraries linked as a unit, like:
net
runtime/cgo
runtime/race
Fixes#9804
Change-Id: I81e036e897571ac96567d59e1f1d7f058ca75e85
Reviewed-on: https://go-review.googlesource.com/4290
Reviewed-by: Russ Cox <rsc@golang.org>
This CL introduces new methods for 'context' type, so we can
manipulate its values in an architecture independent way.
Use new methods to replace both 386 and amd64 versions of
dosigprof with single piece of code.
There is more similar code to be converted in the following CLs.
Also remove os_windows_386.go and os_windows_amd64.go. These
contain unused functions.
Change-Id: I28f76aeb97f6e4249843d30d3d0c33fb233d3f7f
Reviewed-on: https://go-review.googlesource.com/2790
Reviewed-by: Minux Ma <minux@golang.org>
CL 2118 makes the assumption that all references to runtime.tlsg
should be accompanied by a declaration of runtime.tlsg if its type
should be a normal variable, instead of a placeholder for TLS
relocation.
Because if runtime.tlsg is not declared by the runtime package,
the type of runtime.tlsg will be zero, so fix the check in liblink
to look for 0 instead of STLSBSS (the type will be initialized by
cmd/ld, but cmd/ld doesn't run during assembly).
Change-Id: I691ac5c3faea902f8b9a0b963e781b22e7b269a7
Reviewed-on: https://go-review.googlesource.com/4030
Reviewed-by: David Crawshaw <crawshaw@golang.org>
This change is an implementation of the signal
runtime and os/signal package on Plan 9.
Contrary to Unix, on Plan 9 a signal is called
a note and is represented by a string.
For this reason, the sigsend and signal_recv
functions had to be reimplemented specifically
for Plan 9.
In order to reuse most of the code and internal
interface of the os/signal package, the note
strings are mapped to integers.
Thanks to Russ Cox for the early review.
Change-Id: I95836645efe21942bb1939f43f87fb3c0eaaef1a
Reviewed-on: https://go-review.googlesource.com/2164
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
It turns out -iex argument is not supported by all gdb versions,
but as we need to add the auto-load safe path before loading the
inferior, test -iex support first and skip the test if it's not
available.
We should still update our builders though.
Change-Id: I355697de51baf12162ba6cb82f389dad93f93dc5
Reviewed-on: https://go-review.googlesource.com/4070
Reviewed-by: Ian Lance Taylor <iant@golang.org>
On some systems, gdb refuses to load Python plugin from arbitrary
paths, so we have to add $GOROOT/src/runtime to auto-load-safe-path
in the gdb script test.
Change-Id: Icc44baab8d04a65bd21ceac2ab8ddb13c8d083e8
Reviewed-on: https://go-review.googlesource.com/2905
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
eqstring does not need to check the length of the strings.
Other architectures were done in a separate commit.
While we're here, add a pointer equality check.
Change-Id: Id2c8616a03a7da7037c1e9ccd56a549fc952bd98
Reviewed-on: https://go-review.googlesource.com/3956
Reviewed-by: Keith Randall <khr@golang.org>
eqstring does not need to check the length of the strings.
6g
benchmark old ns/op new ns/op delta
BenchmarkCompareStringEqual 7.03 6.14 -12.66%
BenchmarkCompareStringIdentical 3.36 3.04 -9.52%
5g
benchmark old ns/op new ns/op delta
BenchmarkCompareStringEqual 238 232 -2.52%
BenchmarkCompareStringIdentical 90.8 80.7 -11.12%
The equivalent PPC changes are in a separate commit
because I don't have the hardware to test them.
Change-Id: I292874324b9bbd9d24f57a390cfff8b550cdd53c
Reviewed-on: https://go-review.googlesource.com/3955
Reviewed-by: Keith Randall <khr@golang.org>
Only documentation / comment changes. Update references to
point to golang.org permalinks or go.googlesource.com/go.
References in historical release notes under doc are left as is.
Change-Id: Icfc14e4998723e2c2d48f9877a91c5abef6794ea
Reviewed-on: https://go-review.googlesource.com/4060
Reviewed-by: Ian Lance Taylor <iant@golang.org>
In the old code, liblink, cmd/ld and runtime all have code determine
whether runtime.tlsg is an actual variable or a placeholder for TLS
relocation. This change consolidate them into one: the runtime/tls_arm.s
will ultimately determine the type of that variable.
Change-Id: I3b3f80791a1db4c2b7318f81a115972cd2237e43
Reviewed-on: https://go-review.googlesource.com/2118
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
In android-L, logging is done through the logd daemon.
If logd daemon is available, send logging to logd.
Otherwise, fallback to the legacy mechanism (/dev/log files).
This change adds access/socket/connect calls to interact with the logd.
Fixesgolang/go#9398.
Change-Id: I3c52b81b451f5862107d7c675f799fc85548486d
Reviewed-on: https://go-review.googlesource.com/3350
Reviewed-by: David Crawshaw <crawshaw@golang.org>
The unbounded list-based defer pool can grow infinitely.
This can happen if a goroutine routinely allocates a defer;
then blocks on one P; and then unblocked, scheduled and
frees the defer on another P.
The scenario was reported on golang-nuts list.
We've been here several times. Any unbounded local caches
are bad and grow to infinite size. This change introduces
central defer pool; local pools become fixed-size
with the only purpose of amortizing accesses to the
central pool.
Change-Id: Iadcfb113ccecf912e1b64afc07926f0de9de2248
Reviewed-on: https://go-review.googlesource.com/3741
Reviewed-by: Keith Randall <khr@golang.org>
Using benchmark from the issue:
benchmark old ns/op new ns/op delta
BenchmarkRangeStringCast 2162 1152 -46.72%
benchmark old allocs new allocs delta
BenchmarkRangeStringCast 1 0 -100.00%
Fixes#2204
Change-Id: I92c5edd2adca4a7b6fba00713a581bf49dc59afe
Reviewed-on: https://go-review.googlesource.com/3790
Reviewed-by: Keith Randall <khr@golang.org>
Before 3c0fee1, runtime.gogo was just long enough to align to 64 bytes
on OSs with short get_tls implementations and 80 bytes on OSs with
longer get_tls implementations (Windows, Solaris, and Plan 9).
3c0fee1 added a few instructions, which pushed it to 80 on most OSs,
including Windows and Plan 9, and 96 on Solaris.
Fixes#9770.
Change-Id: Ie84810657c14ab16dce9f0e0a932955251b0bf33
Reviewed-on: https://go-review.googlesource.com/3850
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Use memprofilerate in GODEBUG instead of memprofrate to be
consistent with other uses.
Change-Id: Iaf6bd3b378b1fc45d36ecde32f3ad4e63ca1e86b
Reviewed-on: https://go-review.googlesource.com/3800
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The overflow happens only with -gcflags="-N -l"
and can be reproduced with:
$ go test -gcflags="-N -l" -a -run=none net
runtime.cgocall: nosplit stack overflow
504 assumed on entry to runtime.cgocall
480 after runtime.cgocall uses 24
472 on entry to runtime.cgocall_errno
408 after runtime.cgocall_errno uses 64
400 on entry to runtime.exitsyscall
288 after runtime.exitsyscall uses 112
280 on entry to runtime.exitsyscallfast
152 after runtime.exitsyscallfast uses 128
144 on entry to runtime.writebarrierptr
88 after runtime.writebarrierptr uses 56
80 on entry to runtime.writebarrierptr_nostore1
24 after runtime.writebarrierptr_nostore1 uses 56
16 on entry to runtime.acquirem
-24 after runtime.acquirem uses 40
Move closure creation into separate function so that
frames of writebarrierptr_shadow and writebarrierptr_nostore1
are overlapped.
Fixes#9721
Change-Id: I40851f0786763ee964af34814edbc3e3d73cf4e7
Reviewed-on: https://go-review.googlesource.com/3418
Reviewed-by: Russ Cox <rsc@golang.org>
Currently race detector produces the following reports on pprof tests:
WARNING: DATA RACE
Read by goroutine 4:
runtime/pprof_test.TestTraceStartStop()
src/runtime/pprof/trace_test.go:38 +0x1da
testing.tRunner()
src/testing/testing.go:448 +0x13a
Previous write by goroutine 5:
bytes.(*Buffer).grow()
src/bytes/buffer.go:102 +0x190
bytes.(*Buffer).Write()
src/bytes/buffer.go:127 +0x75
runtime/pprof.func·002()
src/runtime/pprof/pprof.go:633 +0xae
Trace writer goroutine synchronizes with StopTrace
using trace.shutdownSema runtime semaphore.
But race detector does not see that synchronization
and so produces false reports.
Teach race detector about the synchronization.
Change-Id: I1219817325d4e16b423f29a0cbee94c929793881
Reviewed-on: https://go-review.googlesource.com/3746
Reviewed-by: Russ Cox <rsc@golang.org>
The test for the framepointer experiment flag is cheaper and more
branch-predictable than the other parts of this conditional, so move
it first. This is also more readable.
(Originally, the flag check required parsing the experiments string,
which is why it was done last. Now that flag is cached.)
Change-Id: I84e00fa7e939e9064f0fa0a4a6fe00576dd61457
Reviewed-on: https://go-review.googlesource.com/3782
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Previously, we checked for a saved frame pointer by looking for a
2*ptrSize gap between the argument pointer and the locals pointer.
The intent of this check was to look for a two stack slot gap (caller
IP and saved frame pointer), but stack slots are regSize, not ptrSize.
Correct this by checking instead for a 2*regSize gap.
On most platforms, this made no difference because ptrSize==regSize.
However, on amd64p32 (nacl), the saved frame pointer check incorrectly
fired when there was no saved frame pointer because the one stack slot
for the caller IP left an 8 byte gap, which is 2*ptrSize (but not
2*regSize) on amd64p32.
Fixes#9760.
Change-Id: I6eedcf681fe5bf2bf924dde8a8f2d9860a4d758e
Reviewed-on: https://go-review.googlesource.com/3781
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Add memprofrate as a value recognized in GODEBUG. The
value provided is used as the new setting for
runtime.MemProfileRate, allowing the user to
adjust memory profiling.
Change-Id: If129a247683263b11e2dd42473cf9b31280543d5
Reviewed-on: https://go-review.googlesource.com/3450
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This adds a "framepointer" GOEXPERIMENT that that makes the amd64
toolchain maintain base pointer chains in the same way that gcc
-fno-omit-frame-pointer does. Go doesn't use these saved base
pointers, but this does enable external tools like Linux perf and
VTune to unwind Go stacks when collecting system-wide profiles.
This requires support in the compilers to not clobber BP, support in
liblink for generating the BP-saving function prologue and unwinding
epilogue, and support in the runtime to save BPs across preemption, to
skip saved BPs during stack unwinding and, and to adjust saved BPs
during stack moving.
As with other GOEXPERIMENTs, everything from the toolchain to the
runtime must be compiled with this experiment enabled. To do this,
run make.bash (or all.bash) with GOEXPERIMENT=framepointer.
Change-Id: I4024853beefb9539949e5ca381adfdd9cfada544
Reviewed-on: https://go-review.googlesource.com/2992
Reviewed-by: Russ Cox <rsc@golang.org>
Any place that clobbers BP in the runtime can potentially interfere
with frame pointer unwinding with GOEXPERIMENT=framepointer. This
change eliminates uses of BP in the runtime to address this problem.
We have spare registers everywhere this occurs, so there's no downside
to eliminating BP. Where possible, this uses the same new register as
the amd64p32 runtime, which doesn't use BP due to restrictions placed
on it by NaCL.
One nice side effect of this is that it will let perf/VTune unwind the
call stack even through a call to systemstack, which will let us get
really good call graphs from the garbage collector.
Change-Id: I0ffa14cb4dd2b613a7049b8ec59df37c52286212
Reviewed-on: https://go-review.googlesource.com/3390
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
m.gcing has become overloaded to mean "don't preempt this g" in
general. Once the garbage collector is preemptible, the one thing it
*won't* mean is that we're in the garbage collector.
So, rename gcing to "preemptoff" and make it a string giving a reason
that preemption is disabled. gcing was never set to anything but 0 or
1, so we don't have to worry about there being a stack of reasons.
Change-Id: I4337c29e8e942e7aa4f106fc29597e1b5de4ef46
Reviewed-on: https://go-review.googlesource.com/3660
Reviewed-by: Russ Cox <rsc@golang.org>
Commit 656be31 replaced onM with systemstack, but missed updating a
few comments that still referred to onM. Update these.
Change-Id: I0efb017e9a66ea0adebb6e1da6e518ee11263f69
Reviewed-on: https://go-review.googlesource.com/3664
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
The following line in sysFree:
n += (n + memRound) &^ memRound
doubles value of n (n += n).
Which is wrong and can lead to memory corruption.
Fixes#9712
Change-Id: I3c141b71da11e38837c09408cf4f1d22e8f7f36e
Reviewed-on: https://go-review.googlesource.com/3602
Reviewed-by: David du Colombier <0intro@gmail.com>
Set gcscanvalid=false after you have cased to _Grunning.
If you do it before the cas and the atomicstatus races to a scan state,
the scan will set gcscanvalid=true and we will be _Grunning
with gcscanvalid==true which is not a good thing.
Change-Id: Ie53ea744a5600392b47da91159d985fe6fe75961
Reviewed-on: https://go-review.googlesource.com/3510
Reviewed-by: Austin Clements <austin@google.com>
Yet another leftover from C: parfor took a func value for the
callback, casted it to an unsafe.Pointer for storage, and then casted
it back to a func value to call it. This is unnecessary, so just
store the body as a func value. Beyond general cleanup, this also
eliminates the last use of unsafe in parfor.
Change-Id: Ia904af7c6c443ba75e2699835aee8e9a39b26dd8
Reviewed-on: https://go-review.googlesource.com/3396
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Prior to the conversion of the runtime to Go, this void* was
necessary to get closure information in to C callbacks. There
are no more C callbacks and parfor is perfectly capable of
invoking a Go closure now, so eliminate ctx and all of its
unsafe-ness. (Plus, the runtime currently doesn't use ctx for
anything.)
Change-Id: I39fc53b7dd3d7f660710abc76b0d831bfc6296d8
Reviewed-on: https://go-review.googlesource.com/3395
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
parfor originally used a tail array for its thread array. This got
replaced with a slice allocation in the conversion to Go, but many of
its gnarlier effects remained. Instead of keeping track of the
pointer to the first element of the slice and using unsafe pointer
math to get at the ith element, just keep the slice around and use
regular slice indexing. There is no longer any need for padding to
64-bit align the tail array (there hasn't been since the Go
conversion), so remove this unnecessary padding from the parfor
struct. Finally, since the slice tracks its own length, replace the
nthrmax field with len(thr).
Change-Id: I0020a1815849bca53e3613a8fa46ae4fbae67576
Reviewed-on: https://go-review.googlesource.com/3394
Reviewed-by: Russ Cox <rsc@golang.org>
This cleanup was slated for after the conversion of the runtime to Go.
Also improve type and function documentation.
Change-Id: I55a16b09e00cf701f246deb69e7ce7e3e04b26e7
Reviewed-on: https://go-review.googlesource.com/3393
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Currently, if we do an atomic{load,store}64 of an unaligned address on
386, we'll simply get a non-atomic load/store. This has been the
source of myriad bugs, so add alignment checks to these two
operations. These checks parallel the equivalent checks in
sync/atomic.
The alignment check is not necessary in cas64 because it uses a locked
instruction. The CPU will either execute this atomically or raise an
alignment fault (#AC)---depending on the alignment check flag---either
of which is fine.
This also fixes the two places in the runtime that trip the new
checks. One is in the runtime self-test and shouldn't have caused
real problems. The other is in tickspersecond and could, in
principle, have caused a misread of the ticks per second during
initialization.
Change-Id: If1796667012a6154f64f5e71d043c7f5fb3dd050
Reviewed-on: https://go-review.googlesource.com/3521
Reviewed-by: Russ Cox <rsc@golang.org>
Language specification says that variables are captured by reference.
And that is what gc compiler does. However, in lots of cases it is
possible to capture variables by value under the hood without
affecting visible behavior of programs. For example, consider
the following typical pattern:
func (o *Obj) requestMany(urls []string) []Result {
wg := new(sync.WaitGroup)
wg.Add(len(urls))
res := make([]Result, len(urls))
for i := range urls {
i := i
go func() {
res[i] = o.requestOne(urls[i])
wg.Done()
}()
}
wg.Wait()
return res
}
Currently o, wg, res, and i are captured by reference causing 3+len(urls)
allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap).
But all of them can be captured by value without changing behavior.
This change implements simple strategy for capturing by value:
if a captured variable is not addrtaken and never assigned to,
then it is captured by value (it is effectively const).
This simple strategy turned out to be very effective:
~80% of all captures in std lib are turned into value captures.
The remaining 20% are mostly in defers and non-escaping closures,
that is, they do not cause allocations anyway.
benchmark old allocs new allocs delta
BenchmarkCompressedZipGarbage 153 126 -17.65%
BenchmarkEncodeDigitsSpeed1e4 91 69 -24.18%
BenchmarkEncodeDigitsSpeed1e5 178 129 -27.53%
BenchmarkEncodeDigitsSpeed1e6 1510 1051 -30.40%
BenchmarkEncodeDigitsDefault1e4 100 75 -25.00%
BenchmarkEncodeDigitsDefault1e5 193 139 -27.98%
BenchmarkEncodeDigitsDefault1e6 1420 985 -30.63%
BenchmarkEncodeDigitsCompress1e4 100 75 -25.00%
BenchmarkEncodeDigitsCompress1e5 193 139 -27.98%
BenchmarkEncodeDigitsCompress1e6 1420 985 -30.63%
BenchmarkEncodeTwainSpeed1e4 109 81 -25.69%
BenchmarkEncodeTwainSpeed1e5 211 151 -28.44%
BenchmarkEncodeTwainSpeed1e6 1588 1097 -30.92%
BenchmarkEncodeTwainDefault1e4 103 77 -25.24%
BenchmarkEncodeTwainDefault1e5 199 143 -28.14%
BenchmarkEncodeTwainDefault1e6 1324 917 -30.74%
BenchmarkEncodeTwainCompress1e4 103 77 -25.24%
BenchmarkEncodeTwainCompress1e5 190 137 -27.89%
BenchmarkEncodeTwainCompress1e6 1327 919 -30.75%
BenchmarkConcurrentDBExec 16223 16220 -0.02%
BenchmarkConcurrentStmtQuery 17687 16182 -8.51%
BenchmarkConcurrentStmtExec 5191 5186 -0.10%
BenchmarkConcurrentTxQuery 17665 17661 -0.02%
BenchmarkConcurrentTxExec 15154 15150 -0.03%
BenchmarkConcurrentTxStmtQuery 17661 16157 -8.52%
BenchmarkConcurrentTxStmtExec 3677 3673 -0.11%
BenchmarkConcurrentRandom 14000 13614 -2.76%
BenchmarkManyConcurrentQueries 25 22 -12.00%
BenchmarkDecodeComplex128Slice 318 252 -20.75%
BenchmarkDecodeFloat64Slice 318 252 -20.75%
BenchmarkDecodeInt32Slice 318 252 -20.75%
BenchmarkDecodeStringSlice 2318 2252 -2.85%
BenchmarkDecode 11 8 -27.27%
BenchmarkEncodeGray 64 56 -12.50%
BenchmarkEncodeNRGBOpaque 64 56 -12.50%
BenchmarkEncodeNRGBA 67 58 -13.43%
BenchmarkEncodePaletted 68 60 -11.76%
BenchmarkEncodeRGBOpaque 64 56 -12.50%
BenchmarkGoLookupIP 153 139 -9.15%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76%
BenchmarkClientServer 62 59 -4.84%
BenchmarkClientServerParallel4 62 59 -4.84%
BenchmarkClientServerParallel64 62 59 -4.84%
BenchmarkClientServerParallelTLS4 79 76 -3.80%
BenchmarkClientServerParallelTLS64 112 109 -2.68%
BenchmarkCreateGoroutinesCapture 10 6 -40.00%
BenchmarkAfterFunc 1006 1005 -0.10%
Fixes#6632.
Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca
Reviewed-on: https://go-review.googlesource.com/3166
Reviewed-by: Russ Cox <rsc@golang.org>
Set the minimum heap size to 4Mbytes except when the hash
table code wants to force a GC. In an unrelated change when a
mutator is asked to assist the GC by marking pointer workbufs
it will keep working until the requested number of pointers
are processed even if it means asking for additional workbufs.
Change-Id: I661cfc0a7f2efcf6286b5d37d73e593d9ecd04d5
Reviewed-on: https://go-review.googlesource.com/3392
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
If result of string(i) does not escape,
allocate a [4]byte temp on stack for it.
Change-Id: If31ce9447982929d5b3b963fd0830efae4247c37
Reviewed-on: https://go-review.googlesource.com/3411
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we always allocate string buffers in heap.
For example, in the following code we allocate a temp string
just for comparison:
if string(byteSlice) == "abc" { ... }
This change extends escape analysis to cover []byte->string
conversions and string concatenation. If the result of operations
does not escape, compiler allocates a small buffer
on stack and passes it to slicebytetostring and concatstrings.
Then runtime uses the buffer if the result fits into it.
Size of the buffer is 32 bytes. There is no fundamental theory
behind this number. Just an observation that on std lib
tests/benchmarks frequency of string allocation is inversely
proportional to string length; and there is significant number
of allocations up to length 32.
benchmark old allocs new allocs delta
BenchmarkFprintfBytes 2 1 -50.00%
BenchmarkDecodeComplex128Slice 318 316 -0.63%
BenchmarkDecodeFloat64Slice 318 316 -0.63%
BenchmarkDecodeInt32Slice 318 316 -0.63%
BenchmarkDecodeStringSlice 2318 2316 -0.09%
BenchmarkStripTags 11 5 -54.55%
BenchmarkDecodeGray 111 102 -8.11%
BenchmarkDecodeNRGBAGradient 200 188 -6.00%
BenchmarkDecodeNRGBAOpaque 165 152 -7.88%
BenchmarkDecodePaletted 319 309 -3.13%
BenchmarkDecodeRGB 166 157 -5.42%
BenchmarkDecodeInterlacing 279 268 -3.94%
BenchmarkGoLookupIP 153 135 -11.76%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76%
BenchmarkClientServerParallel4 62 61 -1.61%
BenchmarkClientServerParallel64 62 61 -1.61%
BenchmarkClientServerParallelTLS4 79 78 -1.27%
BenchmarkClientServerParallelTLS64 112 111 -0.89%
benchmark old ns/op new ns/op delta
BenchmarkFprintfBytes 381 311 -18.37%
BenchmarkStripTags 2615 2351 -10.10%
BenchmarkDecodeNRGBAGradient 3715887 3635096 -2.17%
BenchmarkDecodeNRGBAOpaque 3047645 2928644 -3.90%
BenchmarkGoLookupIP 153 135 -11.76%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
Change-Id: I9ec01da816945c3329d7be3c7794b520418c3f99
Reviewed-on: https://go-review.googlesource.com/3120
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
During a concurrent GC stacks are scanned in
an initial scan phase informing the GC of all
pointers on the stack. The GC only needs to rescan
the stack if it potentially changes which can only
happen if the goroutine runs.
This CL tracks whether the Goroutine has run
since it was last scanned and thus may have changed
its stack. If necessary the stack is rescanned.
Change-Id: I5fb1c4338d42e3f61ab56c9beb63b7b2da25f4f1
Reviewed-on: https://go-review.googlesource.com/3275
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we allocate a new string during []byte->string conversion
in string comparison expressions. String allocation is unnecessary in
this case, because comparison does memorize the strings for later use.
This change uses slicebytetostringtmp to construct temp string directly
from []byte buffer and passes it to runtime.eqstring.
Change-Id: If00f1faaee2076baa6f6724d245d5b5e0f59b563
Reviewed-on: https://go-review.googlesource.com/3410
Reviewed-by: Russ Cox <rsc@golang.org>
Coarse-grained test skips to fix bots.
Need to look closer at windows and nacl failures.
Change-Id: I767ef1707232918636b33f715459ee3c0349b45e
Reviewed-on: https://go-review.googlesource.com/3416
Reviewed-by: Aram Hăvărneanu <aram@mgk.ro>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Call frame allocations can account for significant portion
of all allocations in a program, if call is executed
in an inner loop (e.g. to process every line in a log).
On the other hand, the allocation is easy to remove
using sync.Pool since the allocation is strictly scoped.
benchmark old ns/op new ns/op delta
BenchmarkCall 634 338 -46.69%
BenchmarkCall-4 496 167 -66.33%
benchmark old allocs new allocs delta
BenchmarkCall 1 0 -100.00%
BenchmarkCall-4 1 0 -100.00%
Update #7818
Change-Id: Icf60cce0a9be82e6171f0c0bd80dee2393db54a7
Reviewed-on: https://go-review.googlesource.com/1954
Reviewed-by: Keith Randall <khr@golang.org>
The %61 hack was added when runtime was is in C.
Now the Go compiler does the optimization.
Change-Id: I79c3302ec4b931eaaaaffe75e7101c92bf287fc7
Reviewed-on: https://go-review.googlesource.com/3289
Reviewed-by: Keith Randall <khr@golang.org>
Consider the following code:
s := "(" + string(byteSlice) + ")"
Currently we allocate a new string during []byte->string conversion,
and pass it to concatstrings. String allocation is unnecessary in
this case, because concatstrings does memorize the strings for later use.
This change uses slicebytetostringtmp to construct temp string directly
from []byte buffer and passes it to concatstrings.
I've found few such cases in std lib:
s += string(msg[off:off+c]) + "."
buf.WriteString("Sec-WebSocket-Accept: " + string(c.accept) + "\r\n")
bw.WriteString("Sec-WebSocket-Key: " + string(nonce) + "\r\n")
err = xml.Unmarshal([]byte("<Top>"+string(data)+"</Top>"), &logStruct)
d.err = d.syntaxError("invalid XML name: " + string(b))
return m, ProtocolError("malformed MIME header line: " + string(kv))
But there are much more in our internal code base.
Change-Id: I42f401f317131237ddd0cb9786b0940213af16fb
Reviewed-on: https://go-review.googlesource.com/3163
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Half of tests currently crash with GODEBUG=wbshadow.
_PageSize is set to 8192. So data can be extended outside
of actually mapped region during rounding. Which leads to crash
during initial copying to shadow.
Use _PhysPageSize instead.
Change-Id: Iaa89992bd57f86dafa16b092b53fdc0606213acb
Reviewed-on: https://go-review.googlesource.com/3286
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we scan maps even if k/v does not contain pointers.
This is required because overflow buckets are hanging off the main table.
This change introduces a separate array that contains pointers to all
overflow buckets and keeps them alive. Buckets themselves are marked
as containing no pointers and are not scanned by GC (if k/v does not
contain pointers).
This brings maps in line with slices and chans -- GC does not scan
their contents if elements do not contain pointers.
Currently scanning of a map[int]int with 2e8 entries (~8GB heap)
takes ~8 seconds. With this change scanning takes negligible time.
Update #9477.
Change-Id: Id8a04066a53d2f743474cad406afb9f30f00eaae
Reviewed-on: https://go-review.googlesource.com/3288
Reviewed-by: Keith Randall <khr@golang.org>
Adjust triggergc so that we trigger when we have used 7/8
of the available heap memory. Do first collection when we
exceed 4Mbytes.
Change-Id: I467b4335e16dc9cd1521d687fc1f99a51cc7e54b
Reviewed-on: https://go-review.googlesource.com/3149
Reviewed-by: Austin Clements <austin@google.com>
Adujst triggergc so that we trigger when we have used 7/8
of the available memory.
Change-Id: I7ca02546d3084e6a04d60b09479e04a9a9837ae2
Reviewed-on: https://go-review.googlesource.com/3061
Reviewed-by: Russ Cox <rsc@golang.org>
Print out the object holding the reference to the object
that checkmark detects as not being properly marked.
Change-Id: Ieedbb6fddfaa65714504af9e7230bd9424cd0ae0
Reviewed-on: https://go-review.googlesource.com/2744
Reviewed-by: Austin Clements <austin@google.com>
The code in mfinal.go is moved from malloc*.go and mgc*.go
and substantially unchanged.
The code in mbitmap.go is also moved from those files, but
cleaned up so that it can be called from those files (in most cases
the code being moved was not already a standalone function).
I also renamed the constants and wrote comments describing
the format. The result is a significant cleanup and isolation of
the bitmap code, but, roughly speaking, it should be treated
and reviewed as new code.
The other files changed only as much as necessary to support
this code movement.
This CL does NOT change the semantics of the heap or type
bitmaps at all, although there are now some obvious opportunities
to do so in followup CLs.
Change-Id: I41b8d5de87ad1d3cd322709931ab25e659dbb21d
Reviewed-on: https://go-review.googlesource.com/2991
Reviewed-by: Keith Randall <khr@golang.org>
I also added new comments at the top of mbarrier.go,
but the rest of the code is just copy-and-paste.
Change-Id: Iaeb2b12f8b1eaa33dbff5c2de676ca902bfddf2e
Reviewed-on: https://go-review.googlesource.com/2990
Reviewed-by: Austin Clements <austin@google.com>
Otherwise, if you mistakenly refer to an undeclared 'shift' variable, you get 52.
Change-Id: I845fb29f23baee1d8e17b37bde0239872eb54316
Reviewed-on: https://go-review.googlesource.com/2909
Reviewed-by: Austin Clements <austin@google.com>
The function is here ONLY for symmetry with package bytes.
This function should be used ONLY if it makes code clearer.
It is not here for performance. Remove any performance benefit.
If performance becomes an issue, the compiler should be fixed to
recognize the three-way compare (for all comparable types)
rather than encourage people to micro-optimize by using this function.
Change-Id: I71f4130bce853f7aef724c6044d15def7987b457
Reviewed-on: https://go-review.googlesource.com/3012
Reviewed-by: Rob Pike <r@golang.org>
This manually reverts 555da73 from #6372 which implies a
minimum FreeBSD version of 8-STABLE.
Updates docs to mention new minimum requirement.
Fixes#9627
Change-Id: I40ae64be3682d79dd55024e32581e3e5e2be8aa7
Reviewed-on: https://go-review.googlesource.com/3020
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The implementation is the same assembly (or Go) routine.
Change-Id: Ib937c461c24ad2d5be9b692b4eed40d9eb031412
Reviewed-on: https://go-review.googlesource.com/2828
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
runtime.rtype was a copy of reflect.rtype - update script to use that directly.
Introduces a basic test which will skip on systems without appropriate GDB.
Fixes#9326
Change-Id: I6ec74e947bd2e1295492ca34b3a8c1b49315a8cb
Reviewed-on: https://go-review.googlesource.com/2821
Reviewed-by: Ian Lance Taylor <iant@golang.org>
6g does not implement dead code elimination for const switches like it
does for const if statements, so the undefined raiseproc() function
was resulting in a link-time failure.
Change-Id: Ie4fcb3716cb4fe6e618033071df9de545ab3e0af
Reviewed-on: https://go-review.googlesource.com/2830
Reviewed-by: Russ Cox <rsc@golang.org>
printf, vprintf, snprintf, gc_m_ptr, gc_g_ptr, gc_itab_ptr, gc_unixnanotime.
These were called from C.
There is no more C.
Now that vprintf is gone, delete roundup, which is unsafe (see CL 2814).
Change-Id: If8a7b727d497ffa13165c0d3a1ed62abc18f008c
Reviewed-on: https://go-review.googlesource.com/2824
Reviewed-by: Austin Clements <austin@google.com>
Moving the "don't really preempt" check up earlier in the function
introduced a race where gp.stackguard0 might change between
the early check and the later one. Since the later one is missing the
"don't really preempt" logic, it could decide to preempt incorrectly.
Pull the result of the check into a local variable and use an atomic
to access stackguard0, to eliminate the race.
I believe this will fix the broken OS X and Solaris builders.
Change-Id: I238350dd76560282b0c15a3306549cbcf390dbff
Reviewed-on: https://go-review.googlesource.com/2823
Reviewed-by: Austin Clements <austin@google.com>
Since CL 2750, the build is broken on Plan 9,
because a new function netpollinited was added
and called from findrunnable in proc1.go.
However, netpoll is not implemented on Plan 9.
Thus, we define netpollinited in netpoll_stub.go.
Fixes#9590
Change-Id: I0895607b86cbc7e94c1bfb2def2b1a368a8efbe6
Reviewed-on: https://go-review.googlesource.com/2759
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
These were fixed in my local commit,
but I forgot that the web Submit button can't see that.
Change-Id: Iec3a70ce3ccd9db2a5394ae2da0b293e45ac2fb5
Reviewed-on: https://go-review.googlesource.com/2822
Reviewed-by: Russ Cox <rsc@golang.org>
During all.bash I got a crash in the GOMAXPROCS=2 runtime test reporting
that the write barrier in the assignment 'c.tiny = add(x, size)' had been
given a pointer pointing into an unexpected span. The problem is that
the tiny allocation was at the end of a span and c.tiny was now pointing
to the end of the allocation and therefore to the end of the span aka
the beginning of the next span.
Rewrite tinyalloc not to do that.
More generally, it's not okay to call add(p, size) unless you know that p
points at > (not just >=) size bytes. Similarly, pretty much any call to
roundup doesn't know how much space p points at, so those are all
broken.
Rewrite persistentalloc not to use add(p, totalsize) and not to use roundup.
There is only one use of roundup left, in vprintf, which is dead code.
I will remove that code and roundup itself in a followup CL.
Change-Id: I211e307d1a656d29087b8fd40b2b71010722fb4a
Reviewed-on: https://go-review.googlesource.com/2814
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
It could happen that mp.printlock++ happens, then on entry to lock,
the goroutine is preempted and then rescheduled onto another m
for the actual call to lock. Now the lock and the printlock++ have
happened on different m's. This can lead to printlock not being
unlocked, which either gives a printing deadlock or a crash when
the goroutine reschedules, because m.locks > 0.
Change-Id: Ib0c08740e1b53de3a93f7ebf9b05f3dceff48b9f
Reviewed-on: https://go-review.googlesource.com/2819
Reviewed-by: Rick Hudson <rlh@golang.org>
Mostly this is using uint32 instead of int32 for unsigned values
like instruction encodings or float32 bit representations,
removal of ternary operations, and removal of #defines.
Delete sched9.c, because it is not compiled (it is still in the history
if we ever need it).
Change-Id: I68579cfea679438a27a80416727a9af932b088ae
Reviewed-on: https://go-review.googlesource.com/2658
Reviewed-by: Austin Clements <austin@google.com>
Normally, a panic/throw only shows the thread stack for the current thread
and all paused goroutines. Goroutines running on other threads, or other threads
running on their system stacks, are opaque. Change that when GODEBUG=crash,
by passing a SIGQUIT around to all the threads when GODEBUG=crash.
If this works out reasonably well, we might make the SIGQUIT relay part of
the standard panic/throw death, perhaps eliding idle m's.
Change-Id: If7dd354f7f3a6e326d17c254afcf4f7681af2f8b
Reviewed-on: https://go-review.googlesource.com/2811
Reviewed-by: Rick Hudson <rlh@golang.org>
There is a small possibility that runtime deadlocks when netpoll is just activated.
Consider the following scenario:
GOMAXPROCS=1
epfd=-1 (netpoll is not activated yet)
A thread is in findrunnable, sets sched.lastpoll=0, calls netpoll(true),
which returns nil. Now the thread is descheduled for some time.
Then sysmon retakes a P from syscall and calls handoffp.
The "If this is the last running P and nobody is polling network" check in handoffp fails,
since the first thread set sched.lastpoll=0. So handoffp decides that there is already
a thread that polls network and so it calls pidleput.
Now the first thread is scheduled again, finds no work and calls stopm.
There is no thread that polls network and so checkdead reports deadlock.
To fix this, don't set sched.lastpoll=0 when netpoll is not activated.
The deadlock can happen if cgo is disabled (-tag=netgo) and only on program startup
(when netpoll is just activated).
The test is from issue 5216 that lead to addition of the
"If this is the last running P and nobody is polling network" check in handoffp.
Update issue 9576.
Change-Id: I9405f627a4d37bd6b99d5670d4328744aeebfc7a
Reviewed-on: https://go-review.googlesource.com/2750
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The old name was too ambiguous (is it a verb? is it a predicate? is
it a constant?) and too close to debug.gccheckmark. Hopefully the new
name conveys that this variable indicates that we are currently doing
mark checking.
Change-Id: I031cd48b0906cdc7774f5395281d3aeeb8ef3ec9
Reviewed-on: https://go-review.googlesource.com/2656
Reviewed-by: Rick Hudson <rlh@golang.org>
1) Move non-preemption check even earlier in newstack.
This avoids a few priority inversion problems.
2) Always use atomic operations to update bitmap for 1-word objects.
This avoids lost mark bits during concurrent GC.
3) Stop using work.nproc == 1 as a signal for being single-threaded.
The concurrent GC runs with work.nproc == 1 but other procs are
running mutator code.
The use of work.nproc == 1 in getfull *is* safe, but remove it anyway,
since it is saving only a single atomic operation per GC round.
Fixes#9225.
Change-Id: I24134f100ad592ea8cb59efb6a54f5a1311093dc
Reviewed-on: https://go-review.googlesource.com/2745
Reviewed-by: Rick Hudson <rlh@golang.org>
Make auxv parsing in linux/arm less of a special case.
* rename setup_auxv to sysargs
* exclude linux/arm from vdso_none.go
* move runtime.checkarm after runtime.sysargs so arm specific
values are properly initialised
Change-Id: I1ca7f5844ad5a162337ff061a83933fc9a2b5ff6
Reviewed-on: https://go-review.googlesource.com/2681
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
In the previous sandbox implementation we read all sandboxed output
from standard output, and so all fake time writes were made to
standard output. Now we have a more sophisticated sandbox server
(see golang.org/x/playground/sandbox) that is capable of recording
both standard output and standard error, so allow fake time writes to
go to either file descriptor.
Change-Id: I79737deb06fd8e0f28910f21f41bd3dc1726781e
Reviewed-on: https://go-review.googlesource.com/2713
Reviewed-by: Minux Ma <minux@golang.org>
Previously, gccheckmark could only be enabled or disabled by calling
runtime.GCcheckmarkenable/GCcheckmarkdisable. This was a necessary
hack because GODEBUG was broken.
Now that GODEBUG works again, move control over gccheckmark to a
GODEBUG variable and remove these runtime functions. Currently,
gccheckmark is enabled by default (and will probably remain so for
much of the 1.5 development cycle).
Change-Id: I2bc6f30c21b795264edf7dbb6bd7354b050673ab
Reviewed-on: https://go-review.googlesource.com/2603
Reviewed-by: Rick Hudson <rlh@golang.org>
Also fix one unaligned stack size for nacl that is caught
by this change.
Fixes#9539.
Change-Id: Ib696a573d3f1f9bac7724f3a719aab65a11e04d3
Reviewed-on: https://go-review.googlesource.com/2600
Reviewed-by: Keith Randall <khr@golang.org>
Recognize loops of the form
for i := range a {
a[i] = zero
}
in which the evaluation of a is free from side effects.
Replace these loops with calls to memclr.
This occurs in the stdlib in 18 places.
The motivating example is clearing a byte slice:
benchmark old ns/op new ns/op delta
BenchmarkGoMemclr5 3.31 3.26 -1.51%
BenchmarkGoMemclr16 13.7 3.28 -76.06%
BenchmarkGoMemclr64 50.8 4.14 -91.85%
BenchmarkGoMemclr256 157 6.02 -96.17%
Update #5373.
Change-Id: I99d3e6f5f268e8c6499b7e661df46403e5eb83e4
Reviewed-on: https://go-review.googlesource.com/2520
Reviewed-by: Keith Randall <khr@golang.org>
Fixes#9541.
Change-Id: I5d659ad50d7c3d1c92ed9feb86cda4c1a6e62054
Reviewed-on: https://go-review.googlesource.com/2584
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Random is bad, it can block and prevent binaries from starting.
Use urandom instead. We'd rather have bad random bits than no
random bits.
Change-Id: I360e1cb90ace5518a1b51708822a1dae27071ebd
Reviewed-on: https://go-review.googlesource.com/2582
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Minux Ma <minux@golang.org>
In 32-bit worlds, 8-byte objects are only aligned to 4-byte boundaries.
Change-Id: I91469a9a67b1ee31dd508a4e105c39c815ecde58
Reviewed-on: https://go-review.googlesource.com/2581
Reviewed-by: Keith Randall <khr@golang.org>
For a non-zero-sized struct with a final zero-sized field,
add a byte to the size (before rounding to alignment). This
change ensures that taking the address of the zero-sized field
will not incorrectly leak the following object in memory.
reflect.funcLayout also needs this treatment.
Fixes#9401
Change-Id: I1dc503dc5af4ca22c8f8c048fb7b4541cc957e0f
Reviewed-on: https://go-review.googlesource.com/2452
Reviewed-by: Russ Cox <rsc@golang.org>
run GC in its own background goroutine making the
caller runnable if resources are available. This is
critical in single goroutine applications.
Allow goroutines that allocate a lot to help out
the GC and in doing so throttle their own allocation.
Adjust test so that it only detects that a GC is run
during init calls and not whether the GC is memory
efficient. Memory efficiency work will happen later
in 1.5.
Change-Id: I4306f5e377bb47c69bda1aedba66164f12b20c2b
Reviewed-on: https://go-review.googlesource.com/2349
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This improves the printing of GC times to be both more human-friendly
and to provide enough information for the construction of MMU curves
and other statistics. The new times look like:
GC: #8 72413852ns @143036695895725 pause=622900 maxpause=427037 goroutines=11 gomaxprocs=4
GC: sweep term: 190584ns max=190584 total=275001 procs=4
GC: scan: 260397ns max=260397 total=902666 procs=1
GC: install wb: 5279ns max=5279 total=18642 procs=4
GC: mark: 71530555ns max=71530555 total=186694660 procs=1
GC: mark term: 427037ns max=427037 total=1691184 procs=4
This prints gomaxprocs and the number of procs used in each phase for
the benefit of analyzing mutator utilization during concurrent phases.
This also means the analysis doesn't have to hard-code which phases
are STW.
This prints the absolute start time only for the GC cycle. The other
start times can be derived from the phase durations. This declutters
the view for humans readers and doesn't pose any additional complexity
for machine readers.
This removes the confusing "cycle" terminology. Instead, this places
the phase duration after the phase name and adds a "ns" unit, which
both makes it implicitly clear that this is the duration of that phase
and indicates the units of the times.
This adds a "GC:" prefix to all lines for easier identification.
Finally, this generally cleans up the code as well as the placement of
spaces in the output and adds print locking so the statistics blocks
are never interrupted by other prints.
Change-Id: Ifd056db83ed1b888de7dfa9a8fc5732b01ccc631
Reviewed-on: https://go-review.googlesource.com/2542
Reviewed-by: Rick Hudson <rlh@golang.org>
The equal algorithm used to take the size
equal(p, q *T, size uintptr) bool
With this change, it does not
equal(p, q *T) bool
Similarly for the hash algorithm.
The size is rarely used, as most equal functions know the size
of the thing they are comparing. For instance f32equal already
knows its inputs are 4 bytes in size.
For cases where the size is not known, we allocate a closure
(one for each size needed) that points to an assembly stub that
reads the size out of the closure and calls generic code that
has a size argument.
Reduces the size of the go binary by 0.07%. Performance impact
is not measurable.
Change-Id: I6e00adf3dde7ad2974adbcff0ee91e86d2194fec
Reviewed-on: https://go-review.googlesource.com/2392
Reviewed-by: Russ Cox <rsc@golang.org>
Use a lookup table to find the function which contains a pc. It is
faster than the old binary search. findfunc is used primarily for
stack copying and garbage collection.
benchmark old ns/op new ns/op delta
BenchmarkStackCopy 294746596 255400980 -13.35%
(findfunc is one of several tasks done by stack copy, the findfunc
time itself is about 2.5x faster.)
The lookup table is built at link time. The table grows the binary
size by about 0.5% of the text segment.
We impose a lower limit of 16 bytes on any function, which should not
have much of an impact. (The real constraint required is <=256
functions in every 4096 bytes, but 16 bytes/function is easier to
implement.)
Change-Id: Ic315b7a2c83e1f7203cd2a50e5d21a822e18fdca
Reviewed-on: https://go-review.googlesource.com/2097
Reviewed-by: Russ Cox <rsc@golang.org>
This implements support for calls to and from C in the ppc64 C ABI, as
well as supporting functionality such as an entry point from the
dynamic linker.
Change-Id: I68da6df50d5638cb1a3d3fef773fb412d7bf631a
Reviewed-on: https://go-review.googlesource.com/2009
Reviewed-by: Russ Cox <rsc@golang.org>
Cgo will need this for calls from C to Go and for handling signals
that may occur in C code.
Change-Id: I50cc4caf17cd142bff501e7180a1e27721463ada
Reviewed-on: https://go-review.googlesource.com/2008
Reviewed-by: Russ Cox <rsc@golang.org>
Cache 2KB, 4KB, 8KB, and 16KB stacks. Larger stacks
will be allocated directly. There is no point in cacheing
32KB+ stacks as we ask for and return 32KB at a time
from the allocator.
Note that the minimum stack is 8K on windows/64bit and 4K on
windows/32bit and plan9. For these os/arch combinations,
the number of stack orders is less so that we have the same
maximum cached size.
Fixes#9045
Change-Id: Ia4195dd1858fb79fc0e6a91ae29c374d28839e44
Reviewed-on: https://go-review.googlesource.com/2098
Reviewed-by: Russ Cox <rsc@golang.org>
The ones at the end of M and G are just used to compute
their size for use in assembly. Generate the size explicitly.
The one at the end of itab is variable-sized, and at least one.
The ones at the end of interfacetype and uncommontype are not
needed, as the preceding slice references them (the slice was
originally added for use by reflect?).
The one at the end of stackmap is already accessed correctly,
and the runtime never allocates one.
Update #9401
Change-Id: Ia75e3aaee38425f038c506868a17105bd64c712f
Reviewed-on: https://go-review.googlesource.com/2420
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Fold in some startup randomness to make the hash vary across
different runs. This helps prevent attackers from choosing
keys that all map to the same bucket.
Also, reorganize the hash a bit. Move the *m1 multiply to after
the xor of the current hash and the message. For hash quality
it doesn't really matter, but for DDOS resistance it helps a lot
(any processing done to the message before it is merged with the
random seed is useless, as it is easily inverted by an attacker).
Update #9365
Change-Id: Ib19968168e1bbc541d1d28be2701bb83e53f1e24
Reviewed-on: https://go-review.googlesource.com/2344
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This CL only fixes the build, there are two failing tests:
RaceMapBigValAccess1 and RaceMapBigValAccess2
in runtime/race tests. I haven't investigated why yet.
Updates #9516.
Change-Id: If5bd2f0bee1ee45b1977990ab71e2917aada505f
Reviewed-on: https://go-review.googlesource.com/2401
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
sysReserve doesn't actually reserve the full amount requested on
64-bit systems, because of problems with ulimit. Instead it checks
that it can get the first 64 kB and assumes it can grab the rest as
needed. This doesn't work well with the "let the kernel pick an address"
mode, so don't do that. Pick a high address instead.
Change-Id: I4de143a0e6fdeb467fa6ecf63dcd0c1c1618a31c
Reviewed-on: https://go-review.googlesource.com/2345
Reviewed-by: Rick Hudson <rlh@golang.org>
The line 'mp.schedlink = mnext' has an implicit write barrier call,
which needs a valid g. Move it above the setg(nil).
Change-Id: If3e86c948e856e10032ad89f038bf569659300e0
Reviewed-on: https://go-review.googlesource.com/2347
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
First, call clearcheckmarks immediately after changing checkmark,
so that there is less time when the checkmark flag and the bitmap
are inconsistent. The tiny gap between the two lines is fine, because
the world is stopped. Before, the gap was much larger and included
such code as "go bgsweep()", which allocated.
Second, modify gcphase only when the world is stopped.
As written, gcscan_m was changing gcphase from 0 to GCscan
and back to 0 while other goroutines were running.
Another goroutine running at the same time might decide to
sleep, see GCscan, call gcphasework, and start "helping" by
scanning its stack. That's fine, except that if gcphase flips back
to 0 as the goroutine calls scanblock, it will start draining the
work buffers prematurely.
Both of these were found wbshadow=2 (and a lot of hard work).
Eventually that will run automatically, but right now it still
doesn't quite work for all.bash, due to mmap conflicts with
pthread-created threads.
Change-Id: I99aa8210cff9c6e7d0a1b62c75be32a23321897b
Reviewed-on: https://go-review.googlesource.com/2340
Reviewed-by: Rick Hudson <rlh@golang.org>
Use typedmemmove, typedslicecopy, and adjust reflect.call
to execute the necessary write barriers.
Found with GODEBUG=wbshadow=2 mode.
Eventually that will run automatically, but right now
it still detects other missing write barriers.
Change-Id: Iec5b5b0c1be5589295e28e5228e37f1a92e07742
Reviewed-on: https://go-review.googlesource.com/2312
Reviewed-by: Keith Randall <khr@golang.org>
A side effect of this change is that when assertI2T writes to the
memory for the T being extracted, it can use typedmemmove
for write barriers.
There are other ways we could have done this, but this one
finishes a TODO in package runtime.
Found with GODEBUG=wbshadow=2 mode.
Eventually that will run automatically, but right now
it still detects other missing write barriers.
Change-Id: Icbc8aabfd8a9b1f00be2e421af0e3b29fa54d01e
Reviewed-on: https://go-review.googlesource.com/2279
Reviewed-by: Keith Randall <khr@golang.org>
Found with GODEBUG=wbshadow=2 mode.
Eventually that will run automatically, but right now
it still detects other missing write barriers.
Change-Id: Iea83d693480c2f3008b4e80d55821acff65970a6
Reviewed-on: https://go-review.googlesource.com/2277
Reviewed-by: Keith Randall <khr@golang.org>
Preparation for replacing many memmove calls in runtime
with typedmemmove, which is a clearer description of what
the routine is doing.
For the same reason, rename writebarriercopy to typedslicecopy.
Change-Id: I6f23bef2c2215509fefba175b16908f76dc7538c
Reviewed-on: https://go-review.googlesource.com/2276
Reviewed-by: Rick Hudson <rlh@golang.org>
Add write barrier to atomic operations manipulating pointers.
In general an atomic write of a pointer word may indicate racy accesses,
so there is no strictly safe way to attempt to keep the shadow copy
in sync with the real one. Instead, mark the shadow copy as not used.
Redirect sync/atomic pointer routines back to the runtime ones,
so that there is only one copy of the write barrier and shadow logic.
In time we might consider doing this for most of the sync/atomic
functions, but for now only the pointer routines need that treatment.
Found with GODEBUG=wbshadow=1 mode.
Eventually that will run automatically, but right now
it still detects other missing write barriers.
Change-Id: I852936b9a111a6cb9079cfaf6bd78b43016c0242
Reviewed-on: https://go-review.googlesource.com/2066
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The Gobuf.g goroutine pointer is almost always updated by assembly code.
In one of the few places it is updated by Go code - func save - it must be
treated as a uintptr to avoid a write barrier being emitted at a bad time.
Instead of figuring out how to emit the write barriers missing in the
assembly manipulation, change the type of the field to uintptr, so that
it does not require write barriers at all.
Goroutine structs are published in the allg list and never freed.
That will keep the goroutine structs from being collected.
There is never a time that Gobuf.g's contain the only references
to a goroutine: the publishing of the goroutine in allg comes first.
Goroutine pointers are also kept in non-GC-visible places like TLS,
so I can't see them ever moving. If we did want to start moving data
in the GC, we'd need to allocate the goroutine structs from an
alternate arena. This CL doesn't make that problem any worse.
Found with GODEBUG=wbshadow=1 mode.
Eventually that will run automatically, but right now
it still detects other missing write barriers.
Change-Id: I85f91312ec3e0ef69ead0fff1a560b0cfb095e1a
Reviewed-on: https://go-review.googlesource.com/2065
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Found with GODEBUG=wbshadow=1 mode.
Eventually that will run automatically, but right now
it still detects other missing write barriers.
Change-Id: Ic8624401d7c8225a935f719f96f2675c6f5c0d7c
Reviewed-on: https://go-review.googlesource.com/2064
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This is the detection code. It works well enough that I know of
a handful of missing write barriers. However, those are subtle
enough that I'll address them in separate followup CLs.
GODEBUG=wbshadow=1 checks for a write that bypassed the
write barrier at the next write barrier of the same word.
If a bug can be detected in this mode it is typically easy to
understand, since the crash says quite clearly what kind of
word has missed a write barrier.
GODEBUG=wbshadow=2 adds a check of the write barrier
shadow copy during garbage collection. Bugs detected at
garbage collection can be difficult to understand, because
there is no context for what the found word means.
Typically you have to reproduce the problem with allocfreetrace=1
in order to understand the type of the badly updated word.
Change-Id: If863837308e7c50d96b5bdc7d65af4969bf53a6e
Reviewed-on: https://go-review.googlesource.com/2061
Reviewed-by: Austin Clements <austin@google.com>
Noticed while investigating the speed of the runtime tests, as part
of debugging while Plan 9's runtime tests are timing out on GCE.
Change-Id: I95f5a3d967a0b45ec1ebf10067e193f51db84e26
Reviewed-on: https://go-review.googlesource.com/2283
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This reverts commit ab0535ae3f.
I think it will remain useful to distinguish code that must
run on a system stack from code that can run on either stack,
even if that distinction is no
longer based on the implementation language.
That is, I expect to add a //go:systemstack comment that,
in terms of the old implementation, tells the compiler,
to pretend this function was written in C.
Change-Id: I33d2ebb2f99ae12496484c6ec8ed07233d693275
Reviewed-on: https://go-review.googlesource.com/2275
Reviewed-by: Russ Cox <rsc@golang.org>
Shell out to `uname -r` this time, so that the test will compile
even if the platform doesn't have syscall.Sysctl.
Change-Id: I3a19ab5d820bdb94586a97f4507b3837d7040525
Reviewed-on: https://go-review.googlesource.com/2271
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The test program requires static constructor, which in turn needs
external linking to work, but external linking never works on 10.6.
This should fix the darwin-{386,amd64} builders.
Change-Id: I714fdd3e35f9a7e5f5659cf26367feec9412444f
Reviewed-on: https://go-review.googlesource.com/2235
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Fixes build on plan9 and windows.
Change-Id: Ic9b02c641ab84e4f6d8149de71b9eb495e3343b2
Reviewed-on: https://go-review.googlesource.com/2233
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
I missed this one in golang.org/cl/2232 and only tested the patch
on openbsd/amd64.
Change-Id: I4ff437ae0bfc61c989896c01904b6d33f9bdf0ec
Reviewed-on: https://go-review.googlesource.com/2234
Reviewed-by: Minux Ma <minux@golang.org>
This is a genuine bug exposed by our test for issue 9456: our wrapper
for pthread_create is not initialized until we initialize cgo itself,
but it is possible that a static constructor could call pthread_create,
and in that case, it will be calling a nil function pointer.
Fix that by also initializing the sys_pthread_create function pointer
inside our pthread_create wrapper function, and use a pthread_once to
make sure it is only initialized once.
Fix build for openbsd.
Change-Id: Ica4da2c21fcaec186fdd3379128ef46f0e767ed7
Reviewed-on: https://go-review.googlesource.com/2232
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Some libraries, for example, OpenBLAS, create work threads in a global constructor.
If we're doing cpu profiling, it's possible that SIGPROF might come to some of the
worker threads before we make our first cgo call. Cgocallback used to terminate the
process when that happens, but it's better to miss a couple profiling signals than
to abort in this case.
Fixes#9456.
Change-Id: I112b8e1a6e10e6cc8ac695a4b518c0f577309b6b
Reviewed-on: https://go-review.googlesource.com/2141
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Following change 2154, the goatoi function
was renamed atoi.
However, this definition conflicts with the
atoi function defined in the Plan 9 runtime,
which takes a []byte instead of a string.
This change fixes the build on Plan 9.
Change-Id: Ia0f7ca2f965bd5e3cce3177bba9c806f64db05eb
Reviewed-on: https://go-review.googlesource.com/2165
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
They are no longer needed now that C is gone.
goatoi -> atoi
gofuncname/funcname -> funcname/cfuncname
goroundupsize -> already existing roundupsize
Change-Id: I278bc33d279e1fdc5e8a2a04e961c4c1573b28c7
Reviewed-on: https://go-review.googlesource.com/2154
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Now that we've removed all the C code in runtime and the C compilers,
there is no need to have a separate stackguard field to check for C
code on Go stack.
Remove field g.stackguard1 and rename g.stackguard0 to g.stackguard.
Adjust liblink and cmd/ld as necessary.
Change-Id: I54e75db5a93d783e86af5ff1a6cd497d669d8d33
Reviewed-on: https://go-review.googlesource.com/2144
Reviewed-by: Keith Randall <khr@golang.org>
The goalg function was a holdover from when we had algorithm
tables in both C and Go. It is no longer needed.
Change-Id: Ia0c1af35bef3497a899f22084a1a7b42daae72a0
Reviewed-on: https://go-review.googlesource.com/2099
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Rename "gothrow" to "throw" now that the C version of "throw"
is no longer needed.
This change is purely mechanical except in panic.go where the
old version of "throw" has been deleted.
sed -i "" 's/[[:<:]]gothrow[[:>:]]/throw/g' runtime/*.go
Change-Id: Icf0752299c35958b92870a97111c67bcd9159dc3
Reviewed-on: https://go-review.googlesource.com/2150
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Currently we do very a complex rebalancing of runnable goroutines
between queues, which tries to preserve scheduling fairness.
Besides being complex and error-prone, it also destroys all locality
of scheduling.
This change uses simpler scheme: leave runnable goroutines where
they are, during starttheworld start all Ps with local work,
plus start one additional P in case we have excessive runnable
goroutines in local queues or in the global queue.
The schedler must be able to operate efficiently w/o the rebalancing,
because garbage collections do not have to happen frequently.
The immediate need is execution tracing support: handling of
garabage collection which does stoptheworld/starttheworld several
times becomes exceedingly complex if the current execution can
jump between Ps during starttheworld.
Change-Id: I4fdb7a6d80ca4bd08900d0c6a0a252a95b1a2c90
Reviewed-on: https://go-review.googlesource.com/1951
Reviewed-by: Rick Hudson <rlh@golang.org>
Add a nil byte at the end of the itoa buffer,
before calling gostringnocopy. This prevents
gostringnocopy to read past the buffer size.
Change-Id: I87494a8dd6ea45263882536bf6c0f294eda6866d
Reviewed-on: https://go-review.googlesource.com/2033
Reviewed-by: Aram Hăvărneanu <aram@mgk.ro>
Replace with uses of //go:linkname in Go files, direct use of name in .s files.
The only one that really truly needs a jump is reflect.call; the jump is now
next to the runtime.reflectcall assembly implementations.
Change-Id: Ie7ff3020a8f60a8e4c8645fe236e7883a3f23f46
Reviewed-on: https://go-review.googlesource.com/1962
Reviewed-by: Austin Clements <austin@google.com>
These signals are used by glibc to broadcast setuid/setgid to all
threads and to send pthread cancellations. Unlike other signals, the
Go runtime does not intercept these because they must invoke the libc
handlers (see issues #3871 and #6997). However, because 1) these
signals may be issued asynchronously by a thread running C code to
another thread running Go code and 2) glibc does not set SA_ONSTACK
for its handlers, glibc's signal handler may be run on a Go stack.
Signal frames range from 1.5K on amd64 to many kilobytes on ppc64, so
this may overflow the Go stack and corrupt heap (or other stack) data.
Fix this by ensuring that these signal handlers have the SA_ONSTACK
flag (but not otherwise taking over the handler).
This has been a problem since Go 1.1, but it's likely that people
haven't encountered it because it only affects setuid/setgid and
pthread_cancel.
Fixes#9600.
Change-Id: I6cf5f5c2d3aa48998d632f61f1ddc2778dcfd300
Reviewed-on: https://go-review.googlesource.com/1887
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Calls to goproc/deferproc used to push & pop two extra arguments,
the argument size and the function to call. Now, we allocate space
for those arguments in the outargs section so we don't have to
modify the SP.
Defers now use the stack pointer (instead of the argument pointer)
to identify which frame they are associated with.
A followon CL might simplify funcspdelta and some of the stack
walking code.
Fixes issue #8641
Change-Id: I835ec2f42f0392c5dec7cb0fe6bba6f2aed1dad8
Reviewed-on: https://go-review.googlesource.com/1601
Reviewed-by: Russ Cox <rsc@golang.org>
For arm and powerpc, as well as x86 without aes instructions.
Contains a mixture of ideas from cityhash and xxhash.
Compared to our old fallback on ARM, it's ~no slower on
small objects and up to ~50% faster on large objects. More
importantly, it is a much better hash function and thus has
less chance of bad behavior.
Fixes#8737
benchmark old ns/op new ns/op delta
BenchmarkHash5 173 181 +4.62%
BenchmarkHash16 252 212 -15.87%
BenchmarkHash64 575 419 -27.13%
BenchmarkHash1024 7173 3995 -44.31%
BenchmarkHash65536 516940 313173 -39.42%
BenchmarkHashStringSpeed 300 279 -7.00%
BenchmarkHashBytesSpeed 478 424 -11.30%
BenchmarkHashInt32Speed 217 207 -4.61%
BenchmarkHashInt64Speed 262 231 -11.83%
BenchmarkHashStringArraySpeed 609 631 +3.61%
Change-Id: I0a9335028f32b10ad484966e3019987973afd3eb
Reviewed-on: https://go-review.googlesource.com/1360
Reviewed-by: Russ Cox <rsc@golang.org>
Pointers to zero-sized values may end up pointing to the next
object in memory, and possibly off the end of a span. This
can cause memory leaks and/or confuse the garbage collector.
By putting the overflow pointer at the end of the bucket, we
make sure that pointers to any zero-sized keys or values don't
accidentally point to the next object in memory.
fixes#9384
Change-Id: I5d434df176984cb0210b4d0195dd106d6eb28f73
Reviewed-on: https://go-review.googlesource.com/1869
Reviewed-by: Russ Cox <rsc@golang.org>
with uintptr, the check for < 0 will never succeed in mem_plan9.go's
sbrk() because the brk_ syscall returns -1 on failure. fixes the plan9/amd64 build.
this failed on plan9/amd64 because of the attempt to allocate 136GB in mallocinit(),
which failed. it was just by chance that on plan9/386 allocations never failed.
Change-Id: Ia3059cf5eb752e20d9e60c9619e591b80e8fb03c
Reviewed-on: https://go-review.googlesource.com/1590
Reviewed-by: Anthony Martin <ality@pbrane.org>
Reviewed-by: David du Colombier <0intro@gmail.com>
Reviewed-by: Aram Hăvărneanu <aram@mgk.ro>
"x*41" computes the same value as "x*31 + x*7 + x*3" and (when
compiled by gc) requires just one multiply instruction instead of
three.
Alternatively, the expression could be written as "(x<<2+x)<<3 + x" to
use shifts instead of multiplies (which is how GCC optimizes "x*41").
But gc currently emits suboptimal instructions for this expression
anyway (e.g., separate SHL+ADD instructions rather than LEA on
386/amd64). Also, if such an optimization was worthwhile, it would
seem better to implement it as part of gc's strength reduction logic.
Change-Id: I7156b793229d723bbc9a52aa9ed6111291335277
Reviewed-on: https://go-review.googlesource.com/1830
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
It shouldn't semacquire() inside an acquirem(), the runtime
thinks that means deadlock. It actually isn't a deadlock, but it
looks like it because acquirem() does m.locks++.
Candidate for inclusion in 1.4.1. runtime.Stack with all=true
is pretty unuseable in GOMAXPROCS>1 environment.
fixes#9321
Change-Id: Iac6b664217d24763b9878c20e49229a1ecffc805
Reviewed-on: https://go-review.googlesource.com/1600
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Most types are reflexive (k == k for all k of type t), so don't
bother calling equal(k, k) when the key type is reflexive.
Change-Id: Ia716b4198b8b298687843b94b878dbc5e8fc2c65
Reviewed-on: https://go-review.googlesource.com/1480
Reviewed-by: Russ Cox <rsc@golang.org>
//go:nowritebarrier can only be used in package runtime.
It does not disable write barriers; it is an assertion, checked
by the compiler, that the following function needs no write
barriers.
Change-Id: Id7978b779b66dc1feea39ee6bda9fd4d80280b7c
Reviewed-on: https://go-review.googlesource.com/1224
Reviewed-by: Rick Hudson <rlh@golang.org>
I tried to submit this in Go 1.4 as cl/107540044 but tripped over the
changes for getting C off the G stack. This is a rewritten version that
avoids cgo and works directly with the underlying log device.
Change-Id: I14c227dbb4202690c2c67c5a613d6c6689a6662a
Reviewed-on: https://go-review.googlesource.com/1285
Reviewed-by: Keith Randall <khr@golang.org>
It could only handle one finalizer before it raised an out-of-bounds error.
Fixes issue #9172
Change-Id: Ibb4d0c8aff2d78a1396e248c7129a631176ab427
Reviewed-on: https://go-review.googlesource.com/1201
Reviewed-by: Russ Cox <rsc@golang.org>
needm used to print an error before exiting when it was called too
early, but this error was lost in the transition to Go. Bring back
the error so we don't silently exit(1) when this happens.
Change-Id: I8086932783fd29a337d7dea31b9d6facb64cb5c1
Reviewed-on: https://go-review.googlesource.com/1226
Reviewed-by: Russ Cox <rsc@golang.org>
Avoids a potential O(n^2) performance problem when dequeueing
from very popular channels.
benchmark old ns/op new ns/op delta
BenchmarkChanPopular 2563782 627201 -75.54%
Change-Id: I231aaeafea0ecd93d27b268a0b2128530df3ddd6
Reviewed-on: https://go-review.googlesource.com/1200
Reviewed-by: Russ Cox <rsc@golang.org>
If the symbol table isn't sorted, we print it and abort. However, we
were missing the line break after each symbol, resulting in one
gigantic line instead of a nicely formatted table.
Change-Id: Ie5c6f3c256d0e648277cb3db4496512a79d266dd
Reviewed-on: https://go-review.googlesource.com/1182
Reviewed-by: Russ Cox <rsc@golang.org>
When we start work on Gerrit, ppc64 and garbage collection
work will continue in the master branch, not the dev branches.
(We may still use dev branches for other things later, but
these are ready to be merged, and doing it now, before moving
to Git means we don't have to have dev branches working
in the Gerrit workflow on day one.)
TBR=rlh
CC=golang-codereviews
https://golang.org/cl/183140043
640 bytes ought to be enough for anybody.
We'll bring this back down before Go 1.5. That's issue 9214.
TBR=rlh
CC=golang-codereviews
https://golang.org/cl/188730043
This is going to hurt a bit but we'll make it better later.
Now the race detector can be run again.
I added the write barrier optimizations from
CL 183020043 to try to make it hurt a little less.
TBR=rlh
CC=golang-codereviews
https://golang.org/cl/185070043
This is the last system-dependent file written by cmd/dist.
They are all now written by go generate.
cmd/dist is not needed to start building package runtime
for a different system anymore.
Now all the generated files can be assumed generated, so
delete the clumsy hacks in cmd/api.
Re-enable api check in run.bash.
LGTM=bradfitz
R=bradfitz
CC=golang-codereviews
https://golang.org/cl/185040044
During garbage collection, after scanning a stack, we think about
shrinking it to reclaim some memory. The shrinking code (called
while the world is stopped) checked that the status was Gwaiting
or Grunnable and then changed the state to Gcopystack, to essentially
lock the stack so that no other GC thread is scanning it.
The same locking happens for stack growth (and is more necessary there).
oldstatus = runtime·readgstatus(gp);
oldstatus &= ~Gscan;
if(oldstatus == Gwaiting || oldstatus == Grunnable)
runtime·casgstatus(gp, oldstatus, Gcopystack); // oldstatus is Gwaiting or Grunnable
else
runtime·throw("copystack: bad status, not Gwaiting or Grunnable");
Unfortunately, "stop the world" doesn't stop everything. It stops all
normal goroutine execution, but the network polling thread is still
blocked in epoll and may wake up. If it does, and it chooses a goroutine
to mark runnable, and that goroutine is the one whose stack is shrinking,
then it can happen that between readgstatus and casgstatus, the status
changes from Gwaiting to Grunnable.
casgstatus assumes that if the status is not what is expected, it is a
transient change (like from Gwaiting to Gscanwaiting and back, or like
from Gwaiting to Gcopystack and back), and it loops until the status
has been restored to the expected value. In this case, the status has
changed semi-permanently from Gwaiting to Grunnable - it won't
change again until the GC is done and the world can continue, but the
GC is waiting for the status to change back. This wedges the program.
To fix, call a special variant of casgstatus that accepts either Gwaiting
or Grunnable as valid statuses.
Without the fix bug with the extra check+throw in casgstatus, the
program below dies in a few seconds (2-10) with GOMAXPROCS=8
on a 2012 Retina MacBook Pro. With the fix, it runs for minutes
and minutes.
package main
import (
"io"
"log"
"net"
"runtime"
)
func main() {
const N = 100
for i := 0; i < N; i++ {
l, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
log.Fatal(err)
}
ch := make(chan net.Conn, 1)
go func() {
var err error
c1, err := net.Dial("tcp", l.Addr().String())
if err != nil {
log.Fatal(err)
}
ch <- c1
}()
c2, err := l.Accept()
if err != nil {
log.Fatal(err)
}
c1 := <-ch
l.Close()
go netguy(c1, c2)
go netguy(c2, c1)
c1.Write(make([]byte, 100))
}
for {
runtime.GC()
}
}
func netguy(r, w net.Conn) {
buf := make([]byte, 100)
for {
bigstack(1000)
_, err := io.ReadFull(r, buf)
if err != nil {
log.Fatal(err)
}
w.Write(buf)
}
}
var g int
func bigstack(n int) {
var buf [100]byte
if n > 0 {
bigstack(n - 1)
}
g = int(buf[0]) + int(buf[99])
}
Fixes#9186.
LGTM=rlh
R=austin, rlh
CC=dvyukov, golang-codereviews, iant, khr, r
https://golang.org/cl/179680043
Otherwise both zgoos_linux.go and zgoos_android.go will be compiled
for GOOS=android.
LGTM=crawshaw, rsc
R=rsc, crawshaw
CC=golang-codereviews
https://golang.org/cl/178110043
We don't know what we need yet, so add them all.
Add them even on x86 architectures (as no-ops) so that
the GC can refer to them unconditionally.
Eventually we'll know what we want and probably
have just one 'prefetch' with an appropriate meaning
on each architecture.
LGTM=rlh
R=rlh
CC=golang-codereviews
https://golang.org/cl/179160043
Thanks to Aram Hăvărneanu, Nick Owens
and Russ Cox for the early reviews.
LGTM=aram, rsc
R=rsc, lucio.dere, aram, ality
CC=golang-codereviews, mischief
https://golang.org/cl/175370043
Race detector runtime does not tolerate operations on addresses
that was not previously declared with __tsan_map_shadow
(namely, data, bss and heap). The corresponding address
checks for atomic operations were removed in
https://golang.org/cl/111310044
Restore these checks.
It's tricker than just not calling into race runtime,
because it is the race runtime that makes the atomic
operations themselves (if we do not call into race runtime
we skip the atomic operation itself as well). So instead we call
__tsan_go_ignore_sync_start/end around the atomic operation.
This forces race runtime to skip all other processing
except than doing the atomic operation itself.
Fixes#9136.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/179030043
The assumption can be violated by external linkers reordering them or
inserting non-Go sections in between them. I looked briefly at trying
to write out the _go_.o in external linking mode in a way that forced
the ordering, but no matter what there's no way to force Go's data
and Go's bss to be next to each other. If there is any data or bss from
non-Go objects, it's very likely to get stuck in between them.
Instead, rewrite the two places we know about that make the assumption.
I grepped for noptrdata to look for more and didn't find any.
The added race test (os/exec in external linking mode) fails without
the changes in the runtime. It crashes with an invalid pointer dereference.
Fixes#9133.
LGTM=dneil
R=dneil
CC=dvyukov, golang-codereviews, iant
https://golang.org/cl/179980043
struct siginfo_t's si_addr field is part of a union.
Previously, we represented this union in Go using an opaque
byte array and accessed the si_addr field using unsafe (and
wrong on 386 and arm!) pointer arithmetic. Since si_addr is
the only field we use from this union, this replaces the
opaque byte array with an explicit declaration of the si_addr
field and accesses it directly.
LGTM=minux, rsc
R=rsc, minux
CC=golang-codereviews
https://golang.org/cl/179970044
Previously, this used the top 8 bits of an instruction as a
sort-of opcode and ignored the top two bits of the relative
PC. This worked because these jumps are always negative and
never big enough for the top two bits of the relative PC (also
the bottom 2 bits of the sort-of opcode) to be anything other
than 0b11, but the code is confusing because it doesn't match
the actual structure of the instruction.
Instead, use the real 6 bit opcode and use all 24 bits of
relative PC.
LGTM=rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/179960043
Previously, lfstack assumed Linux limited user space addresses
to 43 bits on Power64 based on a paper from 2001. It turns
out the limit is now 46 bits, so lfstack was truncating
pointers.
Raise the limit to 48 bits (for some future proofing and to
make it match amd64) and add a self-test that will fail in a
useful way if ever unpack(pack(x)) != x.
With this change, dev.cc passes all.bash on power64le.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/174430043
This is the power64 component of CL 174950043.
With this, dev.cc compiles on power64 and power64le and passes
most tests if GOGC=off (but crashes in go_bootstrap if GC is
on).
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/175290043
Fix a constant conversion error. Add set_{sec,nsec} for
timespec and set_usec for timeval. Fix type of
sigaltstackt.ss_size.
LGTM=rsc
R=rsc, bradfitz
CC=golang-codereviews
https://golang.org/cl/180840043
Eventually I'd like almost everything cmd/dist generates
to be done with 'go generate' and checked in, to simplify
the bootstrap process. The only thing cmd/dist really needs
to do is write things like the current experiment info and
the current version.
This is a first step toward that. It replaces the _NaCl etc
constants with generated ones goos_nacl, goos_darwin,
goarch_386, and so on.
LGTM=dave, austin
R=austin, dave, bradfitz
CC=golang-codereviews, iant, r
https://golang.org/cl/174290043
The SudoG used to sit on the stack, so it was cheap to allocated
and didn't need to be cleaned up when finished.
For the conversion to Go, we had to move sudog off the stack
for a few reasons, so we added a cache of recently used sudogs
to keep allocation cheap. But we didn't add any of the necessary
cleanup before adding a SudoG to the new cache, and so the cached
SudoGs had stale pointers inside them that have caused all sorts
of awful, hard to debug problems.
CL 155760043 made sure SudoG.elem is cleaned up.
CL 150520043 made sure SudoG.selectdone is cleaned up.
This CL makes sure SudoG.next, SudoG.prev, and SudoG.waitlink
are cleaned up. I should have done this when I did the other two
fields; instead I wasted a week tracking down a leak they caused.
A dangling SudoG.waitlink can point into a sudogcache list that
has been "forgotten" in order to let the GC collect it, but that
dangling .waitlink keeps the list from being collected.
And then the list holding the SudoG with the dangling waitlink
can find itself in the same situation, and so on. We end up
with lists of lists of unusable SudoGs that are still linked into
the object graph and never collected (given the right mix of
non-trivial selects and non-channel synchronization).
More details in golang.org/issue/9110.
Fixes#9110.
LGTM=r
R=r
CC=dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/177870043
I just created that redirect, so we can change
it once the wiki moves.
LGTM=bradfitz, khr
R=khr, bradfitz
CC=golang-codereviews
https://golang.org/cl/177780043
The garbage collector is now written in Go.
There is plenty to clean up (just like on dev.cc).
all.bash passes on darwin/amd64, darwin/386, linux/amd64, linux/386.
TBR=rlh
R=austin, rlh, bradfitz
CC=golang-codereviews
https://golang.org/cl/173250043
* _sfloat dispatches to runtime._sfloat2 with the Go calling convention, so the seecond argument is a [15]uint32, not a *[15]uint32.
* adjust _sfloat2 to return the new pc in 68(R13) as expected.
LGTM=rsc
R=minux, austin, rsc
CC=golang-codereviews
https://golang.org/cl/174160043
It's rather unsporting of the kernel to give us a pointer to unaligned memory.
This fixes one crash, the next crash occurs in the soft float emulation.
LGTM=minux, rsc, austin
R=minux, rsc, austin
CC=golang-codereviews
https://golang.org/cl/177730043
This is to reduce the delta between dev.cc and dev.garbage to just garbage collector changes.
These are the files that had merge conflicts and have been edited by hand:
malloc.go
mem_linux.go
mgc.go
os1_linux.go
proc1.go
panic1.go
runtime1.go
LGTM=austin
R=austin
CC=golang-codereviews
https://golang.org/cl/174180043
Now the only difference between dev.cc and dev.garbage
is the runtime conversion on the one side and the
garbage collection on the other. They both have the
same set of changes from default and dev.power64.
LGTM=austin
R=austin
CC=golang-codereviews
https://golang.org/cl/172570043
This was originally done to the C port in rev 17d3b45534b5 and
seemingly got lost during the conversion.
LGTM=bradfitz
R=rsc, bradfitz
CC=golang-codereviews
https://golang.org/cl/167700043
Memory management was consolitated with the BSD ports, since
it was almost identical.
Assembly thunks are gone, being replaced by the new //go:linkname
feature.
This change supersedes CL 138390043 (runtime: convert solaris
netpoll to Go), which was previously reviewed and tested.
This change is only the first step, the port now builds,
but doesn't run. Binaries fail to exec:
ld.so.1: 6.out: fatal: 6.out: TLS requirement failure : TLS support is unavailable
Killed
This seems to happen because binaries don't link with libc.so
anymore. We will have to solve that in a different CL.
Also this change is just a rough translation of the original
C code, cleanup will come in a different CL.
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
LGTM=rsc
R=rsc, dave
CC=golang-codereviews, iant, khr, minux, r, rlh
https://golang.org/cl/174960043
Scalararg and ptrarg are not "signal safe".
Go code filling them out can be interrupted by a signal,
and then the signal handler runs, and if it also ends up
in Go code that uses scalararg or ptrarg, now the old
values have been smashed.
For the pieces of code that do need to run in a signal handler,
we introduced onM_signalok, which is really just onM
except that the _signalok is meant to convey that the caller
asserts that scalarg and ptrarg will be restored to their old
values after the call (instead of the usual behavior, zeroing them).
Scalararg and ptrarg are also untyped and therefore error-prone.
Go code can always pass a closure instead of using scalararg
and ptrarg; they were only really necessary for C code.
And there's no more C code.
For all these reasons, delete scalararg and ptrarg, converting
the few remaining references to use closures.
Once those are gone, there is no need for a distinction between
onM and onM_signalok, so replace both with a single function
equivalent to the current onM_signalok (that is, it can be called
on any of the curg, g0, and gsignal stacks).
The name onM and the phrase 'm stack' are misnomers,
because on most system an M has two system stacks:
the main thread stack and the signal handling stack.
Correct the misnomer by naming the replacement function systemstack.
Fix a few references to "M stack" in code.
The main motivation for this change is to eliminate scalararg/ptrarg.
Rick and I have already seen them cause problems because
the calling sequence m.ptrarg[0] = p is a heap pointer assignment,
so it gets a write barrier. The write barrier also uses onM, so it has
all the same problems as if it were being invoked by a signal handler.
We worked around this by saving and restoring the old values
and by calling onM_signalok, but there's no point in keeping this nice
home for bugs around any longer.
This CL also changes funcline to return the file name as a result
instead of filling in a passed-in *string. (The *string signature is
left over from when the code was written in and called from C.)
That's arguably an unrelated change, except that once I had done
the ptrarg/scalararg/onM cleanup I started getting false positives
about the *string argument escaping (not allowed in package runtime).
The compiler is wrong, but the easiest fix is to write the code like
Go code instead of like C code. I am a bit worried that the compiler
is wrong because of some use of uninitialized memory in the escape
analysis. If that's the reason, it will go away when we convert the
compiler to Go. (And if not, we'll debug it the next time.)
LGTM=khr
R=r, khr
CC=austin, golang-codereviews, iant, rlh
https://golang.org/cl/174950043
Also include onM_signalok fix from issue 8995.
Fixes linux/arm build.
Fixes#8995.
LGTM=r
R=r, dave
CC=golang-codereviews
https://golang.org/cl/168580043
This was recorded as an hg mv instead of an hg cp.
For now a C version is needed for the Go compiler.
TBR=r
CC=golang-codereviews
https://golang.org/cl/174020043
The conversion was done with an automated tool and then
modified only as necessary to make it compile and run.
vlrt.c was only called from C. Pure delete.
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
LGTM=r
R=r, austin
CC=dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/174860043
The conversion was done with an automated tool and then
modified only as necessary to make it compile and run.
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
LGTM=r
R=r
CC=austin, dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/174830044
The conversion was done with an automated tool and then
modified only as necessary to make it compile and run.
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
LGTM=r
R=r, daniel.morsing
CC=austin, dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/172260043
The conversion was done with an automated tool and then
modified only as necessary to make it compile and run.
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
LGTM=r
R=r
CC=austin, dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/172250044
The conversion was done with an automated tool and then
modified only as necessary to make it compile and run.
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
LGTM=r
R=r, austin
CC=dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/172250043
The conversion was done with an automated tool and then
modified only as necessary to make it compile and run.
In a few cases, defs_$GOOS_$GOARCH.go already existed,
so the target here is defs1_$GOOS_$GOARCH.go.
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
LGTM=r
R=r
CC=austin, dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/171490043
float.c held bit patterns for special float64 values,
hiding from the real uses. Rewrite Go code not to
refer to those values directly.
Convert library routines in runtime.c and string.c.
LGTM=r
R=r, dave
CC=austin, dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/170330043
The main change is that #include "zasm_GOOS_GOARCH.h"
is now #include "go_asm.h" and/or #include "go_tls.h".
Also, because C StackGuard is now Go _StackGuard,
the assembly name changes from const_StackGuard to
const__StackGuard.
In asm_$GOARCH.s, add new function getg, formerly
implemented in C.
The renamed atomics now have Go wrappers, to get
escape analysis annotations right. Those wrappers
are in CL 174860043.
LGTM=r, aram
R=r, aram
CC=austin, dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/168510043
This code overused macros and could not be
converted automatically. Instead a new sigctxt
type had to be defined for each os/arch combination,
with a common (implicit) interface used by the
arch-specific signal handler code.
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
LGTM=r
R=r
CC=austin, dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/168500044
The conversion was done with an automated tool and then
modified only as necessary to make it compile and run.
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
LGTM=r
R=r
CC=austin, dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/168500043
The conversion was done with an automated tool and then
modified only as necessary to make it compile and run.
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
LGTM=r
R=r, austin
CC=dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/167550043
The conversion was done with an automated tool and then
modified only as necessary to make it compile and run.
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
LGTM=r
R=r
CC=austin, dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/167540043
The conversion was done with an automated tool and then
modified only as necessary to make it compile and run.
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
LGTM=r
R=r, dave
CC=austin, dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/166520043
Add missing write barrier when initializing state
for newly created goroutine. Add write barrier for
same slot when preempting a goroutine.
Disable write barrier during goroutine death,
because dopanic does pointer writes.
With concurrent mark enabled (not in this CL), all.bash passed once.
The second time, TestGoexitCrash-2 failed.
LGTM=rlh
R=rlh
CC=golang-codereviews
https://golang.org/cl/167610043
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
- Remove references to C compiler directories.
- Remove generation of special header files.
- Remove generation of Go source files from C declarations.
- Compile Go sources before rest of package (was after),
so that Go compiler can write go_asm.h for use in assembly.
- Move TLS information from cmd/dist (was embedding in output)
to src/runtime/go_tls.h, which it can be maintained directly.
LGTM=r
R=r, dave
CC=austin, golang-codereviews, iant, khr
https://golang.org/cl/172960043
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
Adjustments for changes made in CL 169360043.
This change is already present in the dev.garbage branch.
LGTM=r
R=r
CC=austin, golang-codereviews, iant, khr
https://golang.org/cl/167520044
To turn concurrent gc on alter the if false in func gogc
currently at line 489 in malloc.go
LGTM=rsc
R=rsc
CC=golang-codereviews, rlh
https://golang.org/cl/172190043
Manifested as increased memory usage in a Google production system.
Not an unbounded leak, but can significantly increase the number
of sudogs allocated between garbage collections.
I checked all the other calls to acquireSudog.
This is the only one that was missing a releaseSudog.
LGTM=r, dneil
R=dneil, r
CC=golang-codereviews
https://golang.org/cl/169260043
These are being built into the runtime/cgo for every
operating system. It doesn't seem to matter, but
restore the Go 1.3 behavior anyway.
LGTM=r
R=r, dave
CC=golang-codereviews
https://golang.org/cl/171290043
Stack bitmaps need to be scanned past any BitsDead entries.
Object bitmaps will not have any BitsDead in them (bitmap extraction stops at
the first BitsDead entry in makeheapobjbv). data/bss bitmaps also have no BitsDead entries.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/168270043
Gentraceback may grow the stack.
One of the gentraceback wrappers may grow the stack.
One of the gentraceback callback calls may grow the stack.
Various stack pointers are stored in various stack locations
as type uintptr during the execution of these calls.
If the stack does grow, these stack pointers will not be
updated and will start trying to decode stack memory that
is no longer valid.
It may be possible to change the type of the stack pointer
variables to be unsafe.Pointer, but that's pretty subtle and
may still have problems, even if we catch every last one.
An easier, more obviously correct fix is to require that
gentraceback of the currently running goroutine must run
on the g0 stack, not on the goroutine's own stack.
Not doing this causes faults when you set
StackFromSystem = 1
StackFaultOnFree = 1
The new check in gentraceback will catch future lapses.
The more general problem is calling getcallersp but then
calling a function that might relocate the stack, which would
invalidate the result of getcallersp. Add note to stubs.go
declaration of getcallersp explaining the problem, and
check all existing calls to getcallersp. Most needed fixes.
This affects Callers, Stack, and nearly all the runtime
profiling routines. It does not affect stack copying directly
nor garbage collection.
LGTM=khr
R=khr, bradfitz
CC=golang-codereviews, r
https://golang.org/cl/167060043
Now each C printf, Go print, or Go println is guaranteed
not to be interleaved with other calls of those functions.
This should help when debugging concurrent failures.
LGTM=rlh
R=rlh
CC=golang-codereviews
https://golang.org/cl/169120043
- Some sequencing issues with stopping the first gc_m round
at the right place to set up correctly for the second round.
- atomicxor8 is not idempotent; avoid xor.
- Maintain BitsDead type bits correctly; see long comment added.
- Enable checkmark phase by default for now.
LGTM=rlh
R=rlh
CC=golang-codereviews
https://golang.org/cl/171090043
This adds an independent mark phase to the GC that can be used to
verify the the default concurrent mark phase has found all reachable
objects. It uses the upper 2 bits of the boundary nibble to encode
the mark leaving the lower bits to encode the boundary and the
normal mark bit.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/167130043
Previously, the flags argument to mallocgc was an int in Go,
but a uint32 in C. Change the Go type to use uint32 so these
agree. The largest flag value is 2 (and of course no flag
values are negative), so this won't change anything on little
endian architectures, but it matters on big endian.
LGTM=rsc
R=khr, rsc
CC=golang-codereviews
https://golang.org/cl/169920043
The GC info masks for slices and strings were changed in
commit caab29a25f68, but the reference masks used by
gcinfo_test for power64x hadn't caught up. Now they're
identical to amd64, so this CL fixes this test by combining
the reference masks for these platforms.
LGTM=rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/162620044
fastrand1 depends on testing the high bit of its uint32 state.
For efficiency, all of the architectures implement this as a
sign bit test. However, on power64, fastrand1 was using a
64-bit sign test on the zero-extended 32-bit state. This
always failed, causing fastrand1 to have very short periods
and often decay to 0 and get stuck.
Fix this by using a 32-bit signed compare instead of a 64-bit
compare. This fixes various tests for the randomization of
select of map iteration.
LGTM=rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/166990043
All three cases of clearfat were wrong on power64x.
The cases that handle 1032 bytes and up and 32 bytes and up
both use MOVDU (one directly generated in a loop and the other
via duffzero), which leaves the pointer register pointing at
the *last written* address. The generated code was not
accounting for this, so the byte fill loop was re-zeroing the
last zeroed dword, rather than the bytes following the last
zeroed dword. Fix this by simply adding an additional 8 byte
offset to the byte zeroing loop.
The case that handled under 32 bytes was also wrong. It
didn't update the pointer register at all, so the byte zeroing
loop was simply re-zeroing the beginning of region. Again,
the fix is to add an offset to the byte zeroing loop to
account for this.
LGTM=dave, bradfitz
R=rsc, dave, bradfitz
CC=golang-codereviews
https://golang.org/cl/168870043
Apparently I had already moved on to fixing another problem
when I submitted CL 169790043.
LGTM=dave
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/165210043
No real problems found. Just lots of argument names that
didn't quite match up.
LGTM=rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/169790043
This adds a test to runtime·check to ensure CAS of large
unsigned 32-bit numbers does not accidentally sign-extend its
arguments.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/162490044
Previously, the power64x runtime assembly was sloppy about
using sign-extending versus zero-extending moves of arguments
and return values. I think all of the cases that actually
mattered have been fixed in recent CLs; this CL fixes up the
few remaining mismatches.
LGTM=rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/162480043
This CL implements the many multiword write barriers by calling
writebarrierptr, so that only writebarrierptr needs the actual barrier.
In lieu of an actual barrier, writebarrierptr checks that the value
being copied is not a small non-zero integer. This is enough to
shake out bugs where the barrier is being called when it should not
(for non-pointer values). It also found a few tests in sync/atomic
that were being too clever.
This CL adds a write barrier for the memory moved during the
builtin copy function, which I forgot when inserting barriers for Go 1.4.
This CL re-enables some write barriers that were disabled for Go 1.4.
Those were disabled because it is possible to change the generated
code so that they are unnecessary most of the time, but we have not
changed the generated code yet. For safety they must be enabled.
None of this is terribly efficient. We are aiming for correct first.
LGTM=rlh
R=rlh
CC=golang-codereviews
https://golang.org/cl/168770043
If you get a stack of PCs from Callers, it would be expected
that every PC is immediately after a call instruction, so to find
the line of the call, you look up the line for PC-1.
CL 163550043 now explicitly documents that.
The most common exception to this is the top-most return PC
on the stack, which is the entry address of the runtime.goexit
function. Subtracting 1 from that PC will end up in a different
function entirely.
To remove this special case, make the top-most return PC
goexit+PCQuantum and then implement goexit in assembly
so that the first instruction can be skipped.
Fixes#7690.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/170720043
Originally traceback was only used for printing the stack
when an unexpected signal came in. In that case, the
initial PC is taken from the signal and should be used
unaltered. For the callers, the PC is the return address,
which might be on the line after the call; we subtract 1
to get to the CALL instruction.
Traceback is now used for a variety of things, and for
almost all of those the initial PC is a return address,
whether from getcallerpc, or gp->sched.pc, or gp->syscallpc.
In those cases, we need to subtract 1 from this initial PC,
but the traceback code had a hard rule "never subtract 1
from the initial PC", left over from the signal handling days.
Change gentraceback to take a flag that specifies whether
we are tracing a trap.
Change traceback to default to "starting with a return PC",
which is the overwhelmingly common case.
Add tracebacktrap, like traceback but starting with a trap PC.
Use tracebacktrap in signal handlers.
Fixes#7690.
LGTM=iant, r
R=r, iant
CC=golang-codereviews
https://golang.org/cl/167810044
Attempt to clear up confusion about how to turn
the PCs reported by Callers into the file and line
number people actually want.
Fixes#7690.
LGTM=r, chris.cs.guy
R=r, chris.cs.guy
CC=golang-codereviews
https://golang.org/cl/163550043
The goal here is to get the big-endian fixes so that
in some upcoming code movement for write barriers
I don't make them unmergeable.
LGTM=rlh
R=rlh
CC=golang-codereviews
https://golang.org/cl/166890043
goprintf is a printf-like print for Go.
It is used in the code generated by 'defer print(...)' and 'go print(...)'.
Normally print(1, 2, 3) turns into
printint(1)
printint(2)
printint(3)
but defer and go need a single function call to give the runtime;
they give the runtime something like goprintf("%d%d%d", 1, 2, 3).
Variadic functions like goprintf cannot be described in the new
type information world, so we have to replace it.
Replace with a custom function, so that defer print(1, 2, 3) turns
into
defer func(a1, a2, a3 int) {
print(a1, a2, a3)
}(1, 2, 3)
(and then the print becomes three different printints as usual).
Fixes#8614.
LGTM=austin
R=austin
CC=golang-codereviews, r
https://golang.org/cl/159700043
I removed support for jumping between functions years ago,
as part of doing the instruction layout for each function separately.
Given that, it makes sense to treat labels as function-scoped.
This lets each function have its own 'loop' label, for example.
Makes the assembly much cleaner and removes the last
reason anyone would reach for the 123(PC) form instead.
Note that this is on the dev.power64 branch, but it changes all
the assemblers. The change will ship in Go 1.5 (perhaps after
being ported into the new assembler).
Came up as part of CL 167730043.
LGTM=r
R=r
CC=austin, dave, golang-codereviews, minux
https://golang.org/cl/159670043
Power64 servers do not currently support sub-word size atomic
memory access, so atomicor8 uses word size atomic access.
However, previously atomicor8 made no attempt to align this
access, resulting in errors. Fix this by aligning the pointer
to a word boundary and shifting the value appropriately.
Since atomicor8 is used in GC, add a test to runtime·check to
make sure this doesn't break in the future.
This also fixes an incorrect branch label, an incorrectly
sized argument move, and adds argument names to help go vet.
LGTM=rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/165820043
Fix include paths that got moved in the great pkg/ rename. Add
missing runtime/arch_* files for power64. Port changes that
happened on default since branching to
runtime/{asm,atomic,sys_linux}_power64x.s (precise stacks,
calling convention change, various new and deleted functions.
Port struct renaming and fix some bugs in
runtime/defs_linux_power64.h.
LGTM=rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/161450043
The ftab ends with a half functab record consisting only of
the 'entry' field followed by a uint32 giving the offset of
the next table. Previously, symtabinit assumed it could read
this uint32 as a uintptr. Since this is unsafe on big endian,
explicitly read the offset as a uint32.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/157660043
Routines and logic to preform a concurrent stack scan of go-routines.
This CL excersizes most of the functionality needed. The
major exception being that it does not scan running goroutines.
After doing the scans it relies on a STW to finish the GC, including
rescanning the stacks. It is intended to achieve correctness,
performance will follow.
LGTM=rsc
R=golang-codereviews, rsc
CC=dvyukov, golang-codereviews
https://golang.org/cl/156580043
The changes got rid of the problems we were seeing.
We suspect the pushcnt field has a race.
LGTM=rsc
R=dvyukov, rsc
CC=golang-codereviews
https://golang.org/cl/159330043
Also updated defs3_linux.go but had to manually edit defs_linux_power64le.h. Will regenerate the file when cgo is working natively on ppc64.
LGTM=austin
R=rsc, austin
CC=golang-codereviews
https://golang.org/cl/158360043
go_bootstrap was panicking during runtime initialization
(under runtime.main) because Defer objects were being
prematurely GC'd. This happened because of an incorrect
change to runtime·unrollgcprog_m to make it endian-agnostic
during the conversion of runtime bitmaps to byte arrays.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/161960044
This brings dev.power64 up-to-date with the current tip of
default. go_bootstrap is still panicking with a bad defer
when initializing the runtime (even on amd64).
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/152570049
The earlier dev.power64 merge missed the port of
runtime/noasm.goc to runtime/noasm_arm.go. This CL fixes this
by moving noasm_arm.go to noasm.go and adding a +build to
share the file between arm and power64.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/158350043
This also removes pkg/runtime/traceback_lr.c, which was ported
to Go in an earlier commit and then moved to
runtime/traceback.go.
Reviewer: rsc@golang.org
rsc: LGTM
Fixes#8348.
Trying to work around clang's dodgy support for .arch by reverting to the external assembler didn't work out so well. Minux had a much better solution to encode the instructions we need as .word directives which avoids .arch altogether.
I've confirmed with gdb that this form produces the expected machine code
Dump of assembler code for function crosscall_arm1:
0x00000000 <+0>: push {r4, r5, r6, r7, r8, r9, r10, r11, r12, lr}
0x00000004 <+4>: mov r4, r0
0x00000008 <+8>: mov r5, r1
0x0000000c <+12>: mov r0, r2
0x00000010 <+16>: blx r5
0x00000014 <+20>: blx r4
0x00000018 <+24>: pop {r4, r5, r6, r7, r8, r9, r10, r11, r12, pc}
There is another compilation failure that blocks building Go with clang on arm
# ../misc/cgo/test
# _/home/dfc/go/misc/cgo/test
/tmp/--407b12.s: Assembler messages:
/tmp/--407b12.s:59: Error: selected processor does not support ARM mode `blx r0'
clang: error: assembler command failed with exit code 1 (use -v to see invocation)
FAIL _/home/dfc/go/misc/cgo/test [build failed]
I'll open a new issue for that
LGTM=iant
R=iant, minux
CC=golang-codereviews
https://golang.org/cl/158180047
Get rid of gocputicks(), it is no longer used.
LGTM=bradfitz, dave
R=golang-codereviews, bradfitz, dave, minux
CC=golang-codereviews
https://golang.org/cl/161110044
It has been failing periodically on Solaris/x64.
Change blockevent so it always records an event if we called
SetBlockProfileRate(1), even if the time delta is negative or zero.
Hopefully this will fix the test on Solaris.
Caveat: I don't actually know what the Solaris problem is, this
is just an educated guess.
LGTM=dave
R=dvyukov, dave
CC=golang-codereviews
https://golang.org/cl/159150043
Russ Cox pointed out that environment strings are not
required to be nil-terminated on Plan 9.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/159130044
select {
case <- c:
case <- c:
}
In this case, c.recvq lists two SudoGs which have the same G.
So we can't use the G as the key to dequeue the correct SudoG,
as that key is ambiguous. Dequeueing the wrong SudoG ends up
freeing a SudoG that is still in c.recvq.
The fix is to use the actual SudoG pointer as the key.
LGTM=dvyukov
R=rsc, bradfitz, dvyukov, khr
CC=austin, golang-codereviews
https://golang.org/cl/159040043
Don't use cmd/pprof as it is not necessary installed
and does not work on nacl and plan9.
Instead just look at the raw profile.
LGTM=crawshaw, rsc
R=golang-codereviews, crawshaw, 0intro, rsc
CC=golang-codereviews
https://golang.org/cl/159010043
gogo called from GC is okay
for the same reasons that
gogo called from System or ExternalCode is okay.
All three are fake stack traces.
Fixes#8408.
LGTM=dvyukov, r
R=r, dvyukov
CC=golang-codereviews
https://golang.org/cl/152580043
Dmitriy believes this broke Windows.
It looks like build.golang.org stopped before that,
but it's worth a shot.
««« original CL description
runtime: make pprof a little nicer
Update #8942
This does not fully address issue 8942 but it does make
the profiles much more useful, until that issue can be
fixed completely.
LGTM=dvyukov
R=r, dvyukov
CC=golang-codereviews
https://golang.org/cl/159990043
»»»
TBR=dvyukov
CC=golang-codereviews
https://golang.org/cl/160030043
hg was unable to create a CL on the code review server for this,
so I am submitting the merge by hand.
The only manual edits are in mgc0.c, to reapply the
removal of cached/ncached to the new code.
It cannot run 'go tool pprof'. There is no guarantee that's installed.
It needs to build a temporary pprof binary and run that.
It also needs to skip the test on systems that can't build and
run binaries, namely android and nacl.
See src/cmd/nm/nm_test.go's TestNM for a template.
Update #8867
Status: Accepted
TBR=dvyukov
CC=golang-codereviews
https://golang.org/cl/153710043
Update #8942
This does not fully address issue 8942 but it does make
the profiles much more useful, until that issue can be
fixed completely.
LGTM=dvyukov
R=r, dvyukov
CC=golang-codereviews
https://golang.org/cl/159990043
There are 3 issues:
1. Skip argument of callers is off by 3,
so that all allocations are deep inside of memory profiler.
2. Memory profiling statistics are not updated after runtime.GC.
3. Testing package does not update memory profiling statistics
before capturing the profile.
Also add an end-to-end test.
Fixes#8867.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/148710043
A Go prototype can be used instead now, and the compiler
will do a better job than we will doing it by hand.
(We got it wrong in amd64p32, causing the current build
breakage.)
The auto-prototype-matching only applies to functions
without an explicit package path, so the TEXT lines for
reflectcall and callXX are s/runtime·/·/.
LGTM=khr
R=khr
CC=golang-codereviews, iant, r
https://golang.org/cl/153600043
The racewalk code was not updated for the new write barriers.
Make it more future-proof.
The new write barrier code assumed that +1 pointer would
be aligned properly for any type that might follow, but that's
not true on 32-bit systems where some types are 64-bit aligned.
The only system like that today is nacl/amd64p32.
Insert a dummy pointer so that the ambiguously typed
value is at +2 pointers, which is always max-aligned.
LGTM=r
R=r
CC=golang-codereviews, iant, khr
https://golang.org/cl/158890046
includes undo of 22318cd31d7d and also:
- always use SetUnhandledExceptionFilter on windows-386;
- crash when receive EXCEPTION_BREAKPOINT in exception handler.
Fixes#8006.
LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/155360043
Assignments of 2-, 3-, and 4-word values were handled
by individual MOV instructions (and for scalars still are).
But if there are pointers involved, those assignments now
go through the write barrier routine. Before this CL, they
went to writebarrierfat, which calls memmove.
Memmove is too much overhead for these small
amounts of data.
Instead, call writebarrierfat{2,3,4}, which are specialized
for the specific amount of data being copied.
Today the write barrier does not care which words are
pointers, so size alone is enough to distinguish the cases.
If we keep these distinctions in Go 1.5 we will need to
expand them for all the pointer-vs-scalar possibilities,
so the current 3 functions will become 3+7+15 = 25,
still not a large burden (we deleted more morestack
functions than that when we dropped segmented stacks).
BenchmarkBinaryTree17 3250972583 3123910344 -3.91%
BenchmarkFannkuch11 3067605223 2964737839 -3.35%
BenchmarkFmtFprintfEmpty 101 96.0 -4.95%
BenchmarkFmtFprintfString 267 235 -11.99%
BenchmarkFmtFprintfInt 261 253 -3.07%
BenchmarkFmtFprintfIntInt 444 402 -9.46%
BenchmarkFmtFprintfPrefixedInt 374 346 -7.49%
BenchmarkFmtFprintfFloat 472 449 -4.87%
BenchmarkFmtManyArgs 1537 1476 -3.97%
BenchmarkGobDecode 13986528 12432985 -11.11%
BenchmarkGobEncode 13120323 12537420 -4.44%
BenchmarkGzip 451925758 437500578 -3.19%
BenchmarkGunzip 113267612 110053644 -2.84%
BenchmarkHTTPClientServer 103151 77100 -25.26%
BenchmarkJSONEncode 25002733 23435278 -6.27%
BenchmarkJSONDecode 94213717 82568789 -12.36%
BenchmarkMandelbrot200 4804246 4713070 -1.90%
BenchmarkGoParse 4646114 4379456 -5.74%
BenchmarkRegexpMatchEasy0_32 163 158 -3.07%
BenchmarkRegexpMatchEasy0_1K 433 391 -9.70%
BenchmarkRegexpMatchEasy1_32 154 138 -10.39%
BenchmarkRegexpMatchEasy1_1K 1481 1132 -23.57%
BenchmarkRegexpMatchMedium_32 282 270 -4.26%
BenchmarkRegexpMatchMedium_1K 92421 86149 -6.79%
BenchmarkRegexpMatchHard_32 5209 4718 -9.43%
BenchmarkRegexpMatchHard_1K 158141 147921 -6.46%
BenchmarkRevcomp 699818791 642222464 -8.23%
BenchmarkTemplate 132402383 108269713 -18.23%
BenchmarkTimeParse 509 478 -6.09%
BenchmarkTimeFormat 462 456 -1.30%
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/156200043
Comments lay out the concurrent GC algorithms.
This CL implements parts of the algorithm.
The acknowledgement code has been removed from this CL
LGTM=rsc, dvyukov
R=dvyukov, rsc
CC=golang-codereviews
https://golang.org/cl/151540043
That was complete failure - builders are broken,
but original cl worked fine on my system.
I will need access to builders
to test this change properly.
««« original CL description
runtime: handle all windows exception
Fixes#8006.
LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/145150043
»»»
TBR=rsc
R=golang-codereviews
CC=golang-codereviews
https://golang.org/cl/154180043
In channels, zeroing of gp.waiting is missed on a closed channel panic.
m.morebuf.g is not zeroed.
I don't expect the latter causes any problems, but just in case.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/151610043
This change prevents confusion in the garbage collector.
The collector wants to make sure that every pointer it finds
isn't junk. Its criteria for junk is (among others) points
to a "free" span.
Because the stack shrinker modifies pointers in the heap,
there is a race condition between the GC scanner and the
shrinker. The GC scanner can see old pointers (pointers to
freed stacks). In particular this happens with SudoG.elem
pointers.
Normally this is not a problem, as pointers into stack spans
are ok. But if the freed stack is the last one in its span,
the span is marked as "free" instead of "contains stacks".
This change makes sure that even if the GC scanner sees
an old pointer, the span into which it points is still
marked as "contains stacks", and thus the GC doesn't
complain about it.
This change will make the GC pause a tiny bit slower, as
the stack freeing now happens in serial with the mark pause.
We could delay the freeing until the mutators start back up,
but this is the simplest change for now.
TBR=dvyukov
CC=golang-codereviews
https://golang.org/cl/158750043
The change contains 3 spot optimizations to scan loop:
1. Don't use byte vars, use uintptr's instead.
This seems to alleviate some codegen issue,
and alone accounts to a half of speedup.
2. Remove bitmap cache. Currently we cache only 1 byte,
so caching is not particularly effective anyway.
Removal of the cache simplifies code and positively affects regalloc.
3. Replace BitsMultiword switch with if and
do debug checks only in Debug mode.
I've benchmarked changes separately and ensured that
each of them provides speedup on top of the previous one.
This change as a whole fixes the unintentional regressions
of scan loop that were introduced during development cycle.
Fixes#8625.
Fixes#8565.
On go.benchmarks/garbage benchmark:
GOMAXPROCS=1
time: -3.13%
cputime: -3.22%
gc-pause-one: -15.71%
gc-pause-total: -15.71%
GOMAXPROCS=32
time: -1.96%
cputime: -4.43%
gc-pause-one: -6.22%
gc-pause-total: -6.22%
LGTM=khr, rsc
R=golang-codereviews, khr
CC=golang-codereviews, rlh, rsc
https://golang.org/cl/153990043
Out of stack space due to new 2-word call in freedefer.
Go back to smaller function calls.
TBR=brainman
CC=golang-codereviews
https://golang.org/cl/152340043
It appears to be an opaque bit pattern more than a pointer.
The Go garbage collector has discovered that for m0
it is set to 0x4c.
Should fix Windows build.
TBR=brainman
CC=golang-codereviews
https://golang.org/cl/149640043
Another dangling stack pointer in a cached structure.
Same as SudoG.elem and SudoG.selectdone.
Definitely a fix, and the new test in freedefer makes the
crash reproducible, but probably not a complete fix.
I have seen one dangling pointer in a Defer.panic even
after this fix; I cannot see where it could be coming from.
I think this will fix the solaris build.
I do not think this will fix the occasional failure on the darwin build.
TBR=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/155080043
I have a CL which at every gc looks through data and bss
sections for nonpointer data (according to gc maps) that
looks like a pointer. These are potential missing roots.
The only thing it finds are begnign, storing stack pointers
into m0.scalararg[1] and never cleaning them up. Let's
clean them up now so the test CL passes all.bash cleanly.
The test CL can't be checked in because we might store
pointer-looking things in nonpointer data by accident.
LGTM=iant
R=golang-codereviews, iant, khr
CC=golang-codereviews
https://golang.org/cl/153210043
We no longer have full type information in the heap, so
we can't dump that any more. Instead we dump ptr/noptr
maps so at least we can compute graph connectivity.
In addition, we still dump Iface/Eface types so together
with dwarf type info we might be able to reconstruct
types of most things in the heap.
LGTM=dvyukov
R=golang-codereviews, dvyukov, rsc, khr
CC=golang-codereviews
https://golang.org/cl/155940043
This will help find bugs during the release freeze.
It's not clear it should be kept for the release itself.
That's issue 8861.
The most likely thing that would trigger this is stale
pointers that previously were ignored or caused memory
leaks. These were allowed due to the use of conservative
collection. Now that everything is precise, we should not
see them anymore.
The small number check reinforces what the stack copier
is already doing, catching the storage of integers in pointers.
It caught issue 8864.
The check is disabled if _cgo_allocate is linked into the binary,
which is to say if the binary is using SWIG to allocate untyped
Go memory. In that case, there are invalid pointers and there's
nothing we can do about it.
LGTM=rlh
R=golang-codereviews, dvyukov, rlh
CC=golang-codereviews, iant, khr, r
https://golang.org/cl/148470043
Depending on flags&KindGCProg,
gc[0] and gc[1] are either pointers or inlined bitmap bits.
That's not compatible with a precise garbage collector:
it needs to be always pointers or never pointers.
Change the inlined bitmap case to store a pointer to an
out-of-line bitmap in gc[0]. The out-of-line bitmaps are
dedup'ed, so that for example all pointer types share the
same out-of-line bitmap.
Fixes#8864.
LGTM=r
R=golang-codereviews, dvyukov, r
CC=golang-codereviews, iant, khr, rlh
https://golang.org/cl/155820043