The chanrecv funcs don't use it at all. The chansend ones do, but the
element type is now part of the hchan struct, which is already a
parameter.
hchan can be nil in chansend when sending to a nil channel, so when
instrumenting we must copy to the stack to be able to read the channel
type.
name old time/op new time/op delta
ChanUncontended 6.42µs ± 1% 6.22µs ± 0% -3.06% (p=0.000 n=19+18)
Initially found by github.com/mvdan/unparam.
Fixes#19591.
Change-Id: I3a5e8a0082e8445cc3f0074695e3593fd9c88412
Reviewed-on: https://go-review.googlesource.com/38351
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Removing stray xori that came from big endian copy/paste.
Adding atomicand8 check to runtime.check() that would have revealed
this error.
Might fix#19396.
Change-Id: If8d6f25d3e205496163541eb112548aa66df9c2a
Reviewed-on: https://go-review.googlesource.com/38257
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
I would like to use BenchmarkRunningGoProgram to measure
changes for issue #15588. So the program in the benchmark
should import "os" package.
It is also reasonable that basic Go program includes
"os" package.
For #15588.
Change-Id: Ida6712eab22c2e79fbe91b6fdd492eaf31756852
Reviewed-on: https://go-review.googlesource.com/37914
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This is a workaround for a FreeBSD kernel bug. It can be removed when
we are confident that all people are using the fixed kernel. See #15658.
Updates #15658.
Change-Id: I0ecdccb77ddd0c270bdeac4d3a5c8abaf0449075
Reviewed-on: https://go-review.googlesource.com/38325
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Mallocs and panics in the scavenge path are particularly nasty because
they're likely to silently self-deadlock on the mheap.lock. Avoid
sinking lots of time into debugging these issues in the future by
turning these into immediate throws.
Change-Id: Ib36fdda33bc90b21c32432b03561630c1f3c69bc
Reviewed-on: https://go-review.googlesource.com/38293
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The lfstack API is still a C-style API: lfstacks all have unhelpful
type uint64 and the APIs are package-level functions. Make the code
more readable and Go-style by creating an lfstack type with methods
for push, pop, and empty.
Change-Id: I64685fa3be0e82ae2d1a782a452a50974440a827
Reviewed-on: https://go-review.googlesource.com/38290
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Implement math/bits.TrailingZerosX using intrinsics.
Generally reorganize the intrinsic spec a bit.
The instrinsics data structure is now built at init time.
This will make doing the other functions in math/bits easier.
Update sys.CtzX to return int instead of uint{64,32} so it
matches math/bits.TrailingZerosX.
Improve the intrinsics a bit for amd64. We don't need the CMOV
for <64 bit versions.
Update #18616
Change-Id: Ic1c5339c943f961d830ae56f12674d7b29d4ff39
Reviewed-on: https://go-review.googlesource.com/38155
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Currently, when printing tracebacks of other threads during
GOTRACEBACK=crash, if the thread is on the system stack we print only
the header for the user goroutine and fail to print its stack. This
happens because we passed the g0 to traceback instead of curg. The g0
never has anything set in its gobuf, so traceback doesn't print
anything.
Fix this by passing _g_.m.curg to traceback instead of the g0.
Fixes#19494.
Change-Id: Idfabf94d6a725e9cdf94a3923dead6455ef3b217
Reviewed-on: https://go-review.googlesource.com/38012
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
GOTRACEBACK=crash works by bouncing a SIGQUIT around the process
sched.mcount times. However, sched.mcount includes the extra Ms
allocated by oneNewExtraM for cgo callbacks. Hence, if there are any
extra Ms that don't have real OS threads, we'll try to send SIGQUIT
more times than there are threads to catch it. Since nothing will
catch these extra signals, we'll fall back to blocking for five
seconds before aborting the process.
Avoid this five second delay by subtracting out the number of extra Ms
when sending SIGQUITs.
Of course, in a cgo binary, it's still possible for the SIGQUIT to go
to a cgo thread and cause some other failure mode. This does not fix
that.
Change-Id: I4fbf3c52dd721812796c4c1dcb2ab4cb7026d965
Reviewed-on: https://go-review.googlesource.com/38182
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
CL 32219 added precomputed sizeclass tables.
Remove the unused sizeToClass method which was previously only
called from initSizes.
Change-Id: I907bf9ed78430ecfaabbec7fca77ef2375010081
Reviewed-on: https://go-review.googlesource.com/38113
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
In FreeBSD when run Go proc under a given sub-list of
processors(e.g. 'cpuset -l 0 ./a.out' in multi-core system),
runtime.NumCPU() still return all physical CPUs from sysctl
hw.ncpu instead of account from sub-list.
Fix by use syscall cpuset_getaffinity to account the number of sub-list.
Fixes#15206
Change-Id: If87c4b620e870486efa100685db5debbf1210a5b
Reviewed-on: https://go-review.googlesource.com/29341
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Found by github.com/mvdan/unparam.
Change-Id: Iabcdfec2ae42c735aa23210b7183080d750682ca
Reviewed-on: https://go-review.googlesource.com/38030
Reviewed-by: Peter Weinberger <pjw@google.com>
Run-TryBot: Peter Weinberger <pjw@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
A typo in the previous revision ("act" instead of "oldact") caused us
to return the sa_flags from the new (or zeroed) sigaction rather than
the old one.
In the presence of a signal handler registered before
runtime.libpreinit, this caused setsigstack to erroneously zero out
important sa_flags (such as SA_SIGINFO) in its attempt to re-register
the existing handler with SA_ONSTACK.
Change-Id: I3cd5152a38ec0d44ae611f183bc1651d65b8a115
Reviewed-on: https://go-review.googlesource.com/37852
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
There are a few problems from change 35494, discovered during testing
of change 37852.
1. I was confused about the usage of n.key in the sema variant, so we
were looping on the wrong condition. The error was not caught by
the TryBots (presumably due to missing TSAN coverage in the BSD and
darwin builders?).
2. The sysmon goroutine sometimes skips notetsleep entirely, using
direct usleep syscalls instead. In that case, we were not calling
_cgo_yield, leading to missed signals under TSAN.
3. Some notetsleep calls have long finite timeouts. They should be
broken up into smaller chunks with a yield at the end of each
chunk.
updates #18717
Change-Id: I91175af5dea3857deebc686f51a8a40f9d690bcc
Reviewed-on: https://go-review.googlesource.com/37867
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
After benchmarking with a compiler modified to have better
spill location, it became clear that this method of checking
was actually faster on (at least) two different architectures
(ppc64 and amd64) and it also provides more timely interruption
of loops.
This change adds a modified FOR loop node "FORUNTIL" that
checks after executing the loop body instead of before (i.e.,
always at least once). This ensures that a pointer past the
end of a slice or array is not made visible to the garbage
collector.
Without the rescheduling checks inserted, the restructured
loop from this change apparently provides a 1% geomean
improvement on PPC64 running the go1 benchmarks; the
improvement on AMD64 is only 0.12%.
Inserting the rescheduling check exposed some peculiar bug
with the ssa test code for s390x; this was updated based on
initial code actually generated for GOARCH=s390x to use
appropriate OpArg, OpAddr, and OpVarDef.
NaCl is disabled in testing.
Change-Id: Ieafaa9a61d2a583ad00968110ef3e7a441abca50
Reviewed-on: https://go-review.googlesource.com/36206
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This helps systems that maintain an external database mapping
build ID to symbol information for the given binary, especially
in the case where /proc/self/maps lists many different files
(for example, many shared libraries).
Avoid importing debug/elf to avoid dragging in that whole
package (and its dependencies like debug/dwarf) into the
build of every program that generates a profile.
Fixes#19431.
Change-Id: I6d4362a79fe23e4f1726dffb0661d20bb57f766f
Reviewed-on: https://go-review.googlesource.com/37855
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently selectgo is just a wrapper around selectgoImpl. This keeps
the hard-coded frame skip counts for tracing the same between the
channel implementation and the select implementation.
However, this is fragile and confusing, so pass a skip parameter to
send and recv, join selectgo and selectgoImpl into one function, and
use decrease all of the skips in selectgo by one.
Change-Id: I11b8cbb7d805b55f5dc6ab4875ac7dde79412ff2
Reviewed-on: https://go-review.googlesource.com/37860
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
If the bad pointer is on a stack, this makes it possible to find the
frame containing the bad pointer.
Change-Id: Ieda44e054aa9ebf22d15d184457c7610b056dded
Reviewed-on: https://go-review.googlesource.com/37858
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This commit reworks multiway select statements to use normal control
flow primitives instead of the previous setjmp/longjmp-like behavior.
This simplifies liveness analysis and should prevent issues around
"returns twice" function calls within SSA passes.
test/live.go is updated because liveness analysis's CFG is more
representative of actual control flow. The case bodies are the only
real successors of the selectgo call, but previously the selectsend,
selectrecv, etc. calls were included in the successors list too.
Updates #19331.
Change-Id: I7f879b103a4b85e62fc36a270d812f54c0aa3e83
Reviewed-on: https://go-review.googlesource.com/37661
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
It's only ever called with the value it was using, but the code was
counterintuitive. Use the parameter instead, like the other funcs near
it.
Found by github.com/mvdan/unparam.
Change-Id: I45855e11d749380b9b2a28e6dd1d5dedf119a19b
Reviewed-on: https://go-review.googlesource.com/37893
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
For historical reasons, it's still commonplace to iterate over the
slice returned by runtime.Callers and call FuncForPC on each PC. This
is broken in gccgo and somewhat broken in gc and will become more
broken in gc with mid-stack inlining.
In Go 1.7, we introduced runtime.CallersFrames to deal with these
problems, but didn't strongly direct people toward using it. Reword
the documentation on runtime.Callers to more strongly encourage people
to use CallersFrames and explicitly discourage them from iterating
over the PCs or using FuncForPC on the results.
Fixes#19426.
Change-Id: Id0d14cb51a0e9521c8fdde9612610f2c2b9383c4
Reviewed-on: https://go-review.googlesource.com/37726
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently almost every function that deals with a *_func has to first
look up the *moduledata for the module containing the function's entry
point. This means we almost always do at least two identical module
lookups whenever we deal with a *_func (one to get the *_func and
another to get something from its module data) and sometimes several
more.
Fix this by making findfunc return a new funcInfo type that embeds
*_func, but also includes the *moduledata, and making all of the
functions that currently take a *_func instead take a funcInfo and use
the already-found *moduledata.
This transformation is trivial for the most part, since the *_func
type is usually inferred. The annoying part is that we can no longer
use nil to indicate failure, so this introduces a funcInfo.valid()
method and replaces nil checks with calls to valid.
Change-Id: I9b8075ef1c31185c1943596d96dec45c7ab5100f
Reviewed-on: https://go-review.googlesource.com/37331
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Currently we acquire a global lock for every newMarkBits call. This is
unfortunate since every span sweep operation calls newMarkBits.
However, most allocations are simply linear allocations from the
current arena. Take advantage of this to add a lock-free fast path for
allocating from the current arena. With this change, the global lock
only protects the lists of arenas, not the free offset in the current
arena.
Change-Id: I6cf6182af8492c8bfc21276114c77275fe3d7826
Reviewed-on: https://go-review.googlesource.com/34595
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, newArena holds the gcBitsArenas lock across allocating
memory from the OS for a new gcBits arena. This is a global lock and
allocating physical memory can be expensive, so this has the potential
to cause high lock contention, especially since every single span
sweep operation calls newArena (via newMarkBits).
Improve the situation by temporarily dropping the lock across
allocation. This means the caller now has to revalidate its
assumptions after the lock is dropped, so this also factors out that
code path and reinvokes it after the lock is acquired.
Change-Id: I1113200a954ab4aad16b5071512583cfac744bdc
Reviewed-on: https://go-review.googlesource.com/34594
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Change-Id: I6343c162e27e2e492547c96f1fc504909b1c03c0
Reviewed-on: https://go-review.googlesource.com/37793
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently ReadMemStats stops the world for ~1.7 ms/GB of heap because
it collects statistics from every single span. For large heaps, this
can be quite costly. This is particularly unfortunate because many
production infrastructures call this function regularly to collect and
report statistics.
Fix this by tracking the necessary cumulative statistics in the
mcaches. ReadMemStats still has to stop the world to stabilize these
statistics, but there are only O(GOMAXPROCS) mcaches to collect
statistics from, so this pause is only 25µs even at GOMAXPROCS=100.
Fixes#13613.
Change-Id: I3c0a4e14833f4760dab675efc1916e73b4c0032a
Reviewed-on: https://go-review.googlesource.com/34937
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The gcstats structure is no longer consumed by anything and no longer
tracks statistics that are particularly relevant to the concurrent
garbage collector. Remove it. (Having statistics is probably a good
idea, but these aren't the stats we need these days and we don't have
a way to get them out of the runtime.)
In preparation for #13613.
Change-Id: Ib63e2f9067850668f9dcbfd4ed89aab4a6622c3f
Reviewed-on: https://go-review.googlesource.com/34936
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
The iOS test harness was recently changed in response to lldb bugs
to replace breakpoints with the SIGUSR2 signal (CL 34926), and to
pass the current directory in the test binary arguments (CL 35152).
Both the signal sending and working directory setup is done from
the go test driver.
However, the new method doesn't work with tests where a C program is
the test driver instead of go test: the current working directory
will not be changed and SIGUSR2 is not raised.
Instead of copying that logic into any C test program, rework the
test harness (again) to move the setup logic to the early runtime
cgo setup code. That way, the harness will run even in the library
build modes.
Then, use the app Info.plist file to pass the working
directory, removing the need to alter the arguments after running.
Finally, use the SIGINT signal instead of SIGUSR2 to avoid
manipulating the signal masks or handlers.
Fixes the testcarchive tests on iOS.
With this CL, both darwin/arm and darwin/arm64 passes all.bash.
This CL replaces CL 34926, CL 35152 as well as the fixup CL
35123 and CL 35255. They are reverted in CLs earlier in the
relation chain.
Change-Id: I8485c7db1404fbd8daa261efd1ea89e905121a3e
Reviewed-on: https://go-review.googlesource.com/36090
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
In order to generate accurate tracebacks, the runtime needs to know the
inlined call stack for a given PC. This creates two tables per function
for this purpose. The first table is the inlining tree (stored in the
function's funcdata), which has a node containing the file, line, and
function name for every inlined call. The second table is a PC-value
table that maps each PC to a node in the inlining tree (or -1 if the PC
is not the result of inlining).
To give the appearance that inlining hasn't happened, the runtime also
needs the original source position information of inlined AST nodes.
Previously the compiler plastered over the line numbers of inlined AST
nodes with the line number of the call. This meant that the PC-line
table mapped each PC to line number of the outermost call in its inlined
call stack, with no way to access the innermost line number.
Now the compiler retains line numbers of inlined AST nodes and writes
the innermost source position information to the PC-line and PC-file
tables. Some tools and tests expect to see outermost line numbers, so we
provide the OutermostLine function for displaying line info.
To keep track of the inlined call stack for an AST node, we extend the
src.PosBase type with an index into a global inlining tree. Every time
the compiler inlines a call, it creates a node in the global inlining
tree for the call, and writes its index to the PosBase of every inlined
AST node. The parent of this node is the inlining tree index of the
call. -1 signifies no parent.
For each function, the compiler creates a local inlining tree and a
PC-value table mapping each PC to an index in the local tree. These are
written to an object file, which is read by the linker. The linker
re-encodes these tables compactly by deduplicating function names and
file names.
This change increases the size of binaries by 4-5%. For example, this is
how the go1 benchmark binary is impacted by this change:
section old bytes new bytes delta
.text 3.49M ± 0% 3.49M ± 0% +0.06%
.rodata 1.12M ± 0% 1.21M ± 0% +8.21%
.gopclntab 1.50M ± 0% 1.68M ± 0% +11.89%
.debug_line 338k ± 0% 435k ± 0% +28.78%
Total 9.21M ± 0% 9.58M ± 0% +4.01%
Updates #19348.
Change-Id: Ic4f180c3b516018138236b0c35e0218270d957d3
Reviewed-on: https://go-review.googlesource.com/37231
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
There are two accesses to mheap_.busy that are guarded by checks
against len(mheap_.free). This works because both lists are (and must
be) the same length, but it makes the code less clear. Change these to
use len(mheap_.busy) so the access more clearly parallels the check.
Fixes#18944.
Change-Id: I9bacbd3663988df351ed4396ae9018bc71018311
Reviewed-on: https://go-review.googlesource.com/36354
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently sweep counts the number of allocated objects, computes the
number of free objects from that, then re-computes the number of
allocated objects from that. Simplify and clean this up by skipping
these intermediate steps.
Change-Id: I3ed98e371eb54bbcab7c8530466c4ab5fde35f0a
Reviewed-on: https://go-review.googlesource.com/34935
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently we scan the finalizers queue both during concurrent mark and
during mark termination. This costs roughly 20ns per queued finalizer
and about 1ns per unused finalizer queue slot (allocated queue length
never decreases), which can drive up STW time if there are many
finalizers.
However, we only add finalizers to this queue during sweeping, which
means that the second scan will never find anything new. Hence, we can
fix this by simply not scanning the finalizers queue during mark
termination. This brings the STW time under the 100µs goal even with
1,000,000 queued finalizers.
Fixes#18869.
Change-Id: I4ce5620c66fb7f13ebeb39ca313ce57047d1d0fb
Reviewed-on: https://go-review.googlesource.com/36013
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Since workbuf is now marked go:notinheap, the write barrier-preventing
wrapper type wbufptr is no longer necessary. Remove it.
Change-Id: I3e5b5803a1547d65de1c1a9c22458a38e08549b7
Reviewed-on: https://go-review.googlesource.com/35971
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Some debugging code was recently added to:
1) provide more detail for the stale reason when it is
determined that a package is stale
2) provide file and package time and date information when
it is determined that runtime.a is stale
This backs out those those debugging messages.
Fixes#19116
Change-Id: I8dd0cbe29324820275b481d8bbb78ff2c5fbc362
Reviewed-on: https://go-review.googlesource.com/37382
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The comments in cmd/internal/obj/funcdata.go are identical to the
comments in runtime/funcdata.h, but the majority of the definitions
they refer to don't apply to Go sources and have been stripped out of
funcdata.go.
Remove these stale comments from funcdata.go and clean up the
references to other copies of the PCDATA and FUNCDATA indexes.
Change-Id: I5d6e49a6e586cc9aecd7c3ce1567679f2a605884
Reviewed-on: https://go-review.googlesource.com/37330
Reviewed-by: Keith Randall <khr@golang.org>
These functions are not defined and are not used.
Fixes#19290
Change-Id: I2978147220af83cf319f7439f076c131870fb9ee
Reviewed-on: https://go-review.googlesource.com/37448
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
If the caller passes a large number to Profile.Add,
the list of pcs is empty, which results in junk
(a nil pc) being recorded. Check for that explicitly,
and replace such stack traces with a lostProfileEvent.
Fixes#18836.
Change-Id: I99c96aa67dd5525cd239ea96452e6e8fcb25ce02
Reviewed-on: https://go-review.googlesource.com/36891
Reviewed-by: Russ Cox <rsc@golang.org>
The profiles are self-contained now.
Check that they work by themselves in the tests that invoke pprof,
but also keep checking that the old command lines work.
Change-Id: I24c74b5456f0b50473883c3640625c6612f72309
Reviewed-on: https://go-review.googlesource.com/37166
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
The existing code builds a full profile in memory.
Then it translates that profile into a data structure (in memory).
Then it marshals that data structure into a protocol buffer (in memory).
Then it gzips that marshaled form into the underlying writer.
So there are three copies of the full profile data in memory
at the same time before we're done. This is obviously dumb.
This CL implements a fully streaming conversion from
the original in-memory profile to the underlying writer.
There is now only one copy of the profile in memory.
For the non-CPU profiles, this is optimal, since we have to
have a full copy in memory to start with.
For the CPU profiles, we could still try to bound the profile
size stored in memory and stream fragments out during
the actual profiling, as Go 1.7 did (with a simpler format),
but so far that hasn't been necessary.
Change-Id: Ic36141021857791bf0cd1fce84178fb5e744b989
Reviewed-on: https://go-review.googlesource.com/37164
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
The old hash table was a place holder that allocates memory
during every lookup for key generation, even for keys that hit
in the the table.
Change-Id: I4f601bbfd349f0be76d6259a8989c9c17ccfac21
Reviewed-on: https://go-review.googlesource.com/37163
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
This doesn't change the functionality of the current code,
but it sets us up for exporting the profiling labels into the profile.
The old code had a hash table of profile samples maintained
during the signal handler, with evictions going into a log.
The new code just logs every sample directly, leaving the
hash-based deduplication to an ordinary goroutine.
The new code also avoids storing the entire profile in two
forms in memory, an unfortunate regression introduced
when binary profile support was added. After this CL the
entire profile is only stored once in memory. We'd still like
to get back down to storing it zero times (streaming it to
the underlying io.Writer).
Change-Id: I0893a1788267c564aa1af17970d47377b2a43457
Reviewed-on: https://go-review.googlesource.com/36712
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
It's common for some goroutines to loop calling time.Sleep.
Allocate once per goroutine, not every time.
This comes up in runtime/pprof's background reader.
Change-Id: I89d17dc7379dca266d2c9cd3aefc2382f5bdbade
Reviewed-on: https://go-review.googlesource.com/37162
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The existing CPU profiling buffer is a slice of uintptr, but we want to
start including profiling label data in the profiles, and those labels need
to be pointers in order to let them describe rich information.
This CL implements a new profBuf type that holds both a slice of uint64
for data and a slice of unsafe.Pointer for profiling labels (aka tags).
Making the runtime use these buffers will happen in followup CLs.
Change-Id: I9ff16b532d8edaf4ce0cbba1098229a561834efc
Reviewed-on: https://go-review.googlesource.com/36713
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This updates the testcase to display the timestamps for the
runtime.a, it dependent packages atomic.a and sys.a, and
source files.
Change-Id: Id2901b4e8aa8eb9775c4f404ac01cc07b394ba91
Reviewed-on: https://go-review.googlesource.com/37332
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Suggested by Dmitry in CL 36792 review.
Clearly safe since there are many different semaRoots
that could all have profiled sudogs calling mutexevent.
Change-Id: I45eed47a5be3e513b2dad63b60afcd94800e16d1
Reviewed-on: https://go-review.googlesource.com/37104
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
We have seen one instance of a production job suddenly spinning to
100% CPU and becoming unresponsive. In that one instance, a SIGQUIT
was sent after 328 minutes of spinning, and the stacks showed a single
goroutine in "IO wait (scan)" state.
Looking for things that might get stuck if a goroutine got stuck in
scanning a stack, we found that injectglist does:
lock(&sched.lock)
var n int
for n = 0; glist != nil; n++ {
gp := glist
glist = gp.schedlink.ptr()
casgstatus(gp, _Gwaiting, _Grunnable)
globrunqput(gp)
}
unlock(&sched.lock)
and that casgstatus spins on gp.atomicstatus until the _Gscan bit goes
away. Essentially, this code locks sched.lock and then while holding
sched.lock, waits to lock gp.atomicstatus.
The code that is doing the scan is:
if castogscanstatus(gp, s, s|_Gscan) {
if !gp.gcscandone {
scanstack(gp, gcw)
gp.gcscandone = true
}
restartg(gp)
break loop
}
More analysis showed that scanstack can, in a rare case, end up
calling back into code that acquires sched.lock. For example:
runtime.scanstack at proc.go:866
calls runtime.gentraceback at mgcmark.go:842
calls runtime.scanstack$1 at traceback.go:378
calls runtime.scanframeworker at mgcmark.go:819
calls runtime.scanblock at mgcmark.go:904
calls runtime.greyobject at mgcmark.go:1221
calls (*runtime.gcWork).put at mgcmark.go:1412
calls (*runtime.gcControllerState).enlistWorker at mgcwork.go:127
calls runtime.wakep at mgc.go:632
calls runtime.startm at proc.go:1779
acquires runtime.sched.lock at proc.go:1675
This path was found with an automated deadlock-detecting tool.
There are many such paths but they all go through enlistWorker -> wakep.
The evidence strongly suggests that one of these paths is what caused
the deadlock we observed. We're running those jobs with
GOTRACEBACK=crash now to try to get more information if it happens
again.
Further refinement and analysis shows that if we drop the wakep call
from enlistWorker, the remaining few deadlock cycles found by the tool
are all false positives caused by not understanding the effect of calls
to func variables.
The enlistWorker -> wakep call was intended only as a performance
optimization, it rarely executes, and if it does execute at just the
wrong time it can (and plausibly did) cause the deadlock we saw.
Comment it out, to avoid the potential deadlock.
Fixes#19112.
Unfixes #14179.
Change-Id: I6f7e10b890b991c11e79fab7aeefaf70b5d5a07b
Reviewed-on: https://go-review.googlesource.com/37093
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This changes the os package to use the runtime poller for file I/O
where possible. When a system call blocks on a pollable descriptor,
the goroutine will be blocked on the poller but the thread will be
released to run other goroutines. When using a non-pollable
descriptor, the os package will continue to use thread-blocking system
calls as before.
For example, on GNU/Linux, the runtime poller uses epoll. epoll does
not support ordinary disk files, so they will continue to use blocking
I/O as before. The poller will be used for pipes.
Since this means that the poller is used for many more programs, this
modifies the runtime to only block waiting for the poller if there is
some goroutine that is waiting on the poller. Otherwise, there is no
point, as the poller will never make any goroutine ready. This
preserves the runtime's current simple deadlock detection.
This seems to crash FreeBSD systems, so it is disabled on FreeBSD.
This is issue 19093.
Using the poller on Windows requires opening the file with
FILE_FLAG_OVERLAPPED. We should only do that if we can remove that
flag if the program calls the Fd method. This is issue 19098.
Update #6817.
Update #7903.
Update #15021.
Update #18507.
Update #19093.
Update #19098.
Change-Id: Ia5197dcefa7c6fbcca97d19a6f8621b2abcbb1fe
Reviewed-on: https://go-review.googlesource.com/36800
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Since we're no longer stealing space for the stack barrier array from
the stack allocation, the stack allocation is simply
g.stack.hi-g.stack.lo.
Updates #17503.
Change-Id: Id9b450ae12c3df9ec59cfc4365481a0a16b7c601
Reviewed-on: https://go-review.googlesource.com/36621
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Now that we don't rescan stacks, stack barriers are unnecessary. This
removes all of the code and structures supporting them as well as
tests that were specifically for stack barriers.
Updates #17503.
Change-Id: Ia29221730e0f2bbe7beab4fa757f31a032d9690c
Reviewed-on: https://go-review.googlesource.com/36620
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
With the hybrid barrier, rescanning stacks is no longer necessary so
the rescan list is no longer necessary. Remove it.
This leaves the gcrescanstacks GODEBUG variable, since it's useful for
debugging, but changes it to simply walk all of the Gs to rescan
stacks rather than using the rescan list.
We could also remove g.gcscanvalid, which is effectively a distributed
rescan list. However, it's still useful for gcrescanstacks mode and it
adds little complexity, so we'll leave it in.
Fixes#17099.
Updates #17503.
Change-Id: I776d43f0729567335ef1bfd145b75c74de2cc7a9
Reviewed-on: https://go-review.googlesource.com/36619
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The wbshadow implementation was removed a year and a half ago in
1635ab7dfe, but the GODEBUG setting remained. Remove the GODEBUG
setting since it doesn't do anything.
Change-Id: I19cde324a79472aff60acb5cc9f7d4aa86c0c0ed
Reviewed-on: https://go-review.googlesource.com/36618
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
For vet. There are more. This is a start.
Change-Id: Ibbbb2b20b5db60ee3fac4a1b5913d18fab01f6b9
Reviewed-on: https://go-review.googlesource.com/36939
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Extend period of fastrand from (1<<31)-1 to (1<<32)-1 by
choosing other polynom and reacting on high bit before shift.
Polynomial is taken at https://users.ece.cmu.edu/~koopman/lfsr/index.html
from 32.dat.gz . It is referred as F7711115 cause this list of
polynomials is for LFSR with shift to right (and fastrand uses shift to
left). (old polynomial is referred in 31.dat.gz as 7BB88888).
There were couple of places with conversation of fastrand to int, which
leads to negative values on 32bit platforms. They are fixed.
Change-Id: Ibee518a3f9103e0aea220ada494b3aec77babb72
Reviewed-on: https://go-review.googlesource.com/36875
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This will make it possible to use the poller with the os package.
This is a lot of code movement but the behavior is intended to be
unchanged.
Update #6817.
Update #7903.
Update #15021.
Update #18507.
Change-Id: I1413685928017c32df5654ded73a2643820977ae
Reviewed-on: https://go-review.googlesource.com/36799
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
When doing i.(T) for non-empty-interface i and concrete type T,
there's no need to read the type out of the itab. Just compare the
itab to the itab we expect for that interface/type pair.
Also optimize type switches by putting the type hash of the
concrete type in the itab. That way we don't need to load the
type pointer out of the itab.
Update #18492
Change-Id: I49e280a21e5687e771db5b8a56b685291ac168ce
Reviewed-on: https://go-review.googlesource.com/34810
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Based on sample code from iant.
Fixes#18788.
Change-Id: I6bb33ed05af2538fbde42ddcac629280ef7c00a6
Reviewed-on: https://go-review.googlesource.com/36892
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
If there are many goroutines contending for two different locks
and both locks hash to the same semaRoot, the scans to find the
goroutines for a particular lock can end up being O(n), making
n lock acquisitions quadratic.
As long as only one actively-used lock hashes to each semaRoot
there's no problem, since the list operations in that case are O(1).
But when the second actively-used lock hits the same semaRoot,
then scans for entries with for a given lock have to scan over the
entries for the other lock.
Fix this problem by changing the semaRoot to hold only one sudog
per unique address. In the running example, this drops the length of
that list from O(n) to 2. Then attach other goroutines waiting on the
same address to a separate list headed by the sudog in the semaRoot list.
Those "same address list" operations are still O(1), so now the
example from above works much better.
There is still an assumption here that in real programs you don't have
many many goroutines queueing up on many many distinct addresses.
If we end up with that problem, we can replace the top-level list with
a treap.
Fixes#17953.
Change-Id: I78c5b1a5053845275ab31686038aa4f6db5720b2
Reviewed-on: https://go-review.googlesource.com/36792
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
So it could be inlined.
Using bit-tricks it could be implemented without condition
(improved trick version by Minux Ma).
Simple benchmark shows it is faster on i386 and x86_64, though
I don't know will it be faster on other architectures?
benchmark old ns/op new ns/op delta
BenchmarkFastrand-3 2.79 1.48 -46.95%
BenchmarkFastrandHashiter-3 25.9 24.9 -3.86%
Change-Id: Ie2eb6d0f598c0bb5fac7f6ad0f8b5e3eddaa361b
Reviewed-on: https://go-review.googlesource.com/34782
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
If the user is calling SetGCPercent(-1), they intend to disable GC.
They probably don't intend to run one. If they do, they can call
runtime.GC themselves.
Change-Id: I40ef40dfc7e15193df9ff26159cd30e56b666f73
Reviewed-on: https://go-review.googlesource.com/34013
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
During the mark phase of garbage collection, goroutines that allocate
may be recruited to assist. This change creates trace events for mark
assists and displays them similarly to sweep assists in the trace
viewer.
Mark assists are different than sweeps in that they can be preempted, so
displaying them in the trace viewer is a little tricky -- we may need to
synthesize multiple slices for one mark assist. This could have been
done in the parser instead, but I thought it might be preferable to keep
the parser as true to the event stream as possible.
Change-Id: I381dcb1027a187a354b1858537851fa68a620ea7
Reviewed-on: https://go-review.googlesource.com/36015
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
These are very tightly coupled, and internal/protopprof is small.
There's no point to having a separate package.
Change-Id: I2c8aa49c9e18a7128657bf2b05323860151b5606
Reviewed-on: https://go-review.googlesource.com/36711
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The gcCompat mode was introduced to match the new parser's node position
setup exactly with the positions used by the original parser. Some of the
gcCompat adjustments were required to satisfy syntax error test cases,
and the rest were required to make toolstash cmp pass.
This change removes the former gcCompat adjustments and instead adjusts
the respective test cases as necessary. In some cases this makes the error
lines consistent with the ones reported by gccgo.
Where it has changed, the position associated with a given syntactic construct
is the position (line/col number) of the left-most token belonging to the
construct.
Change-Id: I5b60c00c5999a895c4d6d6e9b383c6405ccf725c
Reviewed-on: https://go-review.googlesource.com/36695
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This ensures that SIGPROF is handled correctly when using
runtime/pprof in a c-archive or c-shared library.
Separate profiler handling into pre-process changes and per-thread
changes. Simplify the Windows code slightly accordingly.
Fixes#18220.
Change-Id: I5060f7084c91ef0bbe797848978bdc527c312777
Reviewed-on: https://go-review.googlesource.com/34018
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Fetch both monotonic and wall time together when possible.
Avoids skew and is cheaper.
Also shave a few ns off in conversion in package time.
Compared to current implementation (after monotonic changes):
name old time/op new time/op delta
Now 19.6ns ± 1% 9.7ns ± 1% -50.63% (p=0.000 n=41+49) darwin/amd64
Now 23.5ns ± 4% 10.6ns ± 5% -54.61% (p=0.000 n=30+28) windows/amd64
Now 54.5ns ± 5% 29.8ns ± 9% -45.40% (p=0.000 n=27+29) windows/386
More importantly, compared to Go 1.8:
name old time/op new time/op delta
Now 9.5ns ± 1% 9.7ns ± 1% +1.94% (p=0.000 n=41+49) darwin/amd64
Now 12.9ns ± 5% 10.6ns ± 5% -17.73% (p=0.000 n=30+28) windows/amd64
Now 15.3ns ± 5% 29.8ns ± 9% +94.36% (p=0.000 n=30+29) windows/386
This brings time.Now back in line with Go 1.8 on darwin/amd64 and windows/amd64.
It's not obvious why windows/386 is still noticeably worse than Go 1.8,
but it's better than before this CL. The windows/386 speed is not too
important; the changes just keep the two architectures similar.
Change-Id: If69b94970c8a1a57910a371ee91e0d4e82e46c5d
Reviewed-on: https://go-review.googlesource.com/36428
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The fwdSig array is accessed by the signal handler, which may run in
parallel with other threads manipulating it via the os/signal package.
Use atomic accesses to ensure that there are no problems.
Move the _SigHandling flag out of the sigtable array. This makes sigtable
immutable and safe to read from the signal handler.
Change-Id: Icfa407518c4ebe1da38580920ced764898dfc9ad
Reviewed-on: https://go-review.googlesource.com/36321
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently both _MaxMem and _MaxArena32 represent the maximum arena
size on 32-bit hosts (except on MIPS32 where _MaxMem is confusingly
smaller than _MaxArena32).
Clean up sysAlloc so that it always uses _MaxMem, which is the maximum
arena size on both 32- and 64-bit architectures and is the arena size
we allocate auxiliary structures for. This lets us simplify and unify
some code paths and eliminate _MaxArena32.
Fixes#18651. mheap.sysAlloc currently assumes that if the arena is
small, we must be on a 32-bit machine and can therefore grow the arena
to _MaxArena32. This breaks down on darwin/arm64, where _MaxMem is
only 2 GB. As a result, on darwin/arm64, we only reserve spans and
bitmap space for a 2 GB heap, and if the application tries to allocate
beyond that, sysAlloc takes the 32-bit path, tries to grow the arena
beyond 2 GB, and panics when it tries to grow the spans array
allocation past its reserved size. This has probably been a problem
for several releases now, but was only noticed recently because
mapSpans didn't check the bounds on the span reservation until
recently. Most likely it corrupted the bitmap before. By using _MaxMem
consistently, we avoid thinking that we can grow the arena larger than
we have auxiliary structures for.
Change-Id: Ifef28cb746a3ead4b31c1d7348495c2242fef520
Reviewed-on: https://go-review.googlesource.com/35253
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Elias Naur <elias.naur@gmail.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
mallocinit has evolved organically. Make a pass to clean it up in
various ways:
1. Merge the computation of spansSize and bitmapSize. These were
computed on every loop iteration of two different loops, but always
have the same value, which can be derived directly from _MaxMem.
This also avoids over-reserving these on MIPS, were _MaxArena32 is
larger than _MaxMem.
2. Remove the ulimit -v logic. It's been disabled for many releases
and the dead code paths to support it are even more wrong now than
they were when it was first disabled, since now we *must* reserve
spans and bitmaps for the full address space.
3. Make it clear that we're using a simple linear allocation to lay
out the spans, bitmap, and arena spaces. Previously there were a
lot of redundant pointer computations. Now we just bump p1 up as we
reserve the spaces.
In preparation for #18651.
Updates #5049 (respect ulimit).
Change-Id: Icbe66570d3a7a17bea227dc54fb3c4978b52a3af
Reviewed-on: https://go-review.googlesource.com/35252
Reviewed-by: Russ Cox <rsc@golang.org>
Currently _MaxMem is a uintptr, which is going to complicate some
further changes. Make it untyped so we'll be able to do untyped math
on it before truncating it to a uintptr.
The runtime assembly is identical before and after this change on
{linux,windows}/{amd64,386}.
Updates #18651.
Change-Id: I0f64511faa9e0aa25179a556ab9f185ebf8c9cf8
Reviewed-on: https://go-review.googlesource.com/35251
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
This change defines runtime/pprof.SetGoroutineLabels and runtime/pprof.Do, which
are used to set profiler labels on goroutines. The change defines functions
in the runtime for setting and getting profile labels, and sets and unsets
profile labels when goroutines are created and deleted. The change also adds
the package runtime/internal/proflabel, which defines the structure the runtime
uses to store profile labels.
Change-Id: I747a4400141f89b6e8160dab6aa94ca9f0d4c94d
Reviewed-on: https://go-review.googlesource.com/34198
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-on: https://go-review.googlesource.com/35010
This change defines WithLabels, Labels, Label, and ForLabels.
This is the first step of the profile labels implemention for go 1.9.
Updates #17280
Change-Id: I2dfc9aae90f7a4aa1ff7080d5747f0a1f0728e75
Reviewed-on: https://go-review.googlesource.com/34198
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
It's not used, it's never been used, and it doesn't do what its doc
comment says it does.
Fixes#18941.
Change-Id: Ia89d97fb87525f5b861d7701f919e0d6b7cbd376
Reviewed-on: https://go-review.googlesource.com/36322
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Before this CL, Go programs in c-archive or c-shared buildmodes
would not handle SIGPIPE. That leads to surprising behaviour where
writes on a closed pipe or socket would raise SIGPIPE and terminate
the program. This CL changes the Go runtime to handle
SIGPIPE regardless of buildmode. In addition, SIGPIPE from non-Go
code is forwarded.
This is a refinement of CL 32796 that fixes the case where a non-default
handler for SIGPIPE is installed by the host C program.
Fixes#17393
Change-Id: Ia41186e52c1ac209d0a594bae9904166ae7df7de
Reviewed-on: https://go-review.googlesource.com/35960
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
sigtramp was calling sigtrampgo and depending on the fact that
the 3rd argument slot will not be modified on return. Our calling
convention doesn't guarantee that. Avoid that assumption.
There's no actual bug here, as sigtrampgo does not in fact modify its
argument slots. But I found this while working on the dead stack slot
clobbering tool. https://go-review.googlesource.com/c/23924/
Change-Id: Ia7e791a2b4c1c74fff24cba8169e7840b4b06ffc
Reviewed-on: https://go-review.googlesource.com/36216
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
It seems the problem is on gdb and the dynamic linker. Skip the
test for now until we figure out what's going on with the system.
Updates #18784.
Change-Id: Ic9320ffd463f6c231b2c4192652263b1cf7f4231
Reviewed-on: https://go-review.googlesource.com/36250
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The existing darwin/amd64 implementation of runtime.nanotime returns the
wallclock time, which results in timers not functioning properly when
system time runs backwards. By implementing the algorithm used by the
darwin syscall mach_absolute_time, timers will function as expected.
The algorithm is described at
https://opensource.apple.com/source/xnu/xnu-3248.60.10/libsyscall/wrappers/mach_absolute_time.sFixes#17610
Change-Id: I9c8d35240d48249a6837dca1111b1406e2686f67
Reviewed-on: https://go-review.googlesource.com/35292
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
For #18130.
f8b4123613 [dev.typealias] spec: use term 'embedded field' rather than 'anonymous field'
9ecc3ee252 [dev.typealias] cmd/compile: avoid false positive cycles from type aliases
49b7af8a30 [dev.typealias] reflect: add test for type aliases
9bbb07ddec [dev.typealias] cmd/compile, reflect: fix struct field names for embedded byte, rune
43c7094386 [dev.typealias] reflect: fix StructOf use of StructField to match StructField docs
9657e0b077 [dev.typealias] cmd/doc: update for type alias
de2e5459ae [dev.typealias] cmd/compile: declare methods after resolving receiver type
9259f3073a [dev.typealias] test: match gccgo error messages on alias2.go
5d92916770 [dev.typealias] cmd/compile: change Func.Shortname to *Sym
a7c884efc1 [dev.typealias] go/internal/gccgoimporter: support for type aliases
5802cfd900 [dev.typealias] cmd/compile: export/import test cases for type aliases
d7cabd40dd [dev.typealias] go/types: clarified doc string
cc2dcce3d7 [dev.typealias] cmd/compile: a few better comments related to alias types
5c160b28ba [dev.typealias] cmd/compile: improved error message for cyles involving type aliases
b2386dffa1 [dev.typealias] cmd/compile: type-check type alias declarations
ac8421f9a5 [dev.typealias] cmd/compile: various minor cleanups
f011e0c6c3 [dev.typealias] cmd/compile, go/types, go/importer: various alias related fixes
49de5f0351 [dev.typealias] cmd/compile, go/importer: define export format and implement importing of type aliases
5ceec42dc0 [dev.typealias] go/types: export TypeName.IsAlias so clients can use it
aa1f0681bc [dev.typealias] go/types: improved Object printing
c80748e389 [dev.typealias] go/types: remove some more vestiges of prior alias implementation
80d8b69e95 [dev.typealias] go/types: implement type aliases
a917097b5e [dev.typealias] go/build: add go1.9 build tag
3e11940437 [dev.typealias] cmd/compile: recognize type aliases but complain for now (not yet supported)
e0a05c274a [dev.typealias] cmd/gofmt: added test cases for alias type declarations
2e5116bd99 [dev.typealias] go/ast, go/parser, go/printer, go/types: initial type alias support
Change-Id: Ia65f2e011fd7195f18e1dce67d4d49b80a261203
This avoids errors like
./traceback.go:80:2: call of non-function C.f1
I filed https://gcc.gnu.org/PR79289 for the GCC problem. I think this
is a bug in GCC, and it may be fixed before the final GCC 7 release.
This CL is correct either way.
Fixes#18855.
Change-Id: I0785a7b7c5b1d0ca87b454b5eca9079f390fcbd4
Reviewed-on: https://go-review.googlesource.com/35919
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Modules appear in the moduledata linked list in the order they are
loaded by the dynamic loader, with one exception: the
firstmoduledata itself the module that contains the runtime.
This is not always the first module (when using -buildmode=shared,
it is typically libstd.so, the second module).
The order matters for typelinksinit, so we swap the first module
with whatever module contains the main function.
Updates #18729
This fixes the test case extracted with -linkshared, and now
go test -linkshared encoding/...
passes. However the original issue about a plugin failure is not
yet fixed.
Change-Id: I9f399ecc3518e22e6b0a350358e90b0baa44ac96
Reviewed-on: https://go-review.googlesource.com/35644
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Will also fix type aliases.
Fixes#17766.
For #18130.
Change-Id: I9e1584d47128782152e06abd0a30ef423d5c30d2
Reviewed-on: https://go-review.googlesource.com/35732
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Otherwise we don't emit any required ELF relocations when doing an
external link, because elfrelocsect skips unreachable symbols.
Fixes#18745.
Change-Id: Ia3583c41bb6c5ebb7579abd26ed8689370311cd6
Reviewed-on: https://go-review.googlesource.com/35590
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
memmove used to use 2 2-byte load/store pairs to move 4 bytes.
When the result is loaded with a single 4-byte load, it caused
a store to load fowarding stall. To avoid the stall,
special case memmove to use 4 byte ops for the 4 byte copy case.
We already have a special case for 8-byte copies.
386 already specializes 4-byte copies.
I'll do 2-byte copies also, but not for 1.8.
benchmark old ns/op new ns/op delta
BenchmarkIssue18740-8 7567 4799 -36.58%
3-byte copies get a bit slower. Other copies are unchanged.
name old time/op new time/op delta
Memmove/3-8 4.76ns ± 5% 5.26ns ± 3% +10.50% (p=0.000 n=10+10)
Fixes#18740
Change-Id: Iec82cbac0ecfee80fa3c8fc83828f9a1819c3c74
Reviewed-on: https://go-review.googlesource.com/35567
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Currently we check that all roots are marked as soon as gcMarkDone
decides to transition from mark 1 to mark 2. However, issue #16083
indicates that there may be a race where we try to complete mark 1
while a worker is still scanning a stack, causing the root mark check
to fail.
We don't yet understand this race, but as a simple mitigation, move
the root check to after gcMarkDone performs a ragged barrier, which
will force any remaining workers to finish their current job.
Updates #16083. This may "fix" it, but it would be better to
understand and fix the underlying race.
Change-Id: I1af9ce67bd87ade7bc2a067295d79c28cd11abd2
Reviewed-on: https://go-review.googlesource.com/35353
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
We already do this for shared libraries. Do it for plugins also.
Suggestions on how to test this would be welcome.
I'd like to get this in for 1.8. It could lead to mysterious
hangs when using plugins.
Fixes#18676
Change-Id: I03209b096149090b9ba171c834c5e59087ed0f92
Reviewed-on: https://go-review.googlesource.com/35117
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Use R11 (a caller-saved temp register) instead of RBX (a callee-saved
register).
I believe this only affects linux/amd64, since it is the only platform
with a non-trivial cgoSigtramp implementation.
Updates #18328.
Change-Id: I3d35c4512624184d5a8ece653fa09ddf50e079a2
Reviewed-on: https://go-review.googlesource.com/35068
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Loop breaking with a counter. Benchmarked (see comments),
eyeball checked for sanity on popular loops. This code
ought to handle loops in general, and properly inserts phi
functions in cases where the earlier version might not have.
Includes test, plus modifications to test/run.go to deal with
timeout and killing looping test. Tests broken by the addition
of extra code (branch frequency and live vars) for added
checks turn the check insertion off.
If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule
checks on every backedge of every reducible loop. Alternately,
specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will
enable it for a single compilation, but because the core Go
libraries contain some loops that may run long, this is less
likely to have the desired effect.
This is intended as a tool to help in the study and diagnosis
of GC and other latency problems, now that goal STW GC latency
is on the order of 100 microseconds or less.
Updates #17831.
Updates #10958.
Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639
Reviewed-on: https://go-review.googlesource.com/33910
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Change-Id: I429637ca91f7db4144f17621de851a548dc1ce76
Reviewed-on: https://go-review.googlesource.com/34923
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
To implement the blocking of a select, a goroutine builds a list of
offers to communicate (pseudo-g's, aka sudog), one for each case,
queues them on the corresponding channels, and waits for another
goroutine to complete one of those cases and wake it up. Obviously it
is not OK for two other goroutines to complete multiple cases and both
wake the goroutine blocked in select. To make sure that only one
branch of the select is chosen, all the sudogs contain a pointer to a
shared (single) 'done uint32', which is atomically cas'ed by any
interested goroutines. The goroutine that wins the cas race gets to
wake up the select. A complication is that 'done uint32' is stored on
the stack of the goroutine running the select, and that stack can move
during the select due to stack growth or stack shrinking.
The relevant ordering to block and unblock in select is:
1. Lock all channels.
2. Create list of sudogs and queue sudogs on all channels.
3. Switch to system stack, mark goroutine as asleep,
unlock all channels.
4. Sleep until woken.
5. Wake up on goroutine stack.
6. Lock all channels.
7. Dequeue sudogs from all channels.
8. Free list of sudogs.
9. Unlock all channels.
There are two kinds of stack moves: stack growth and stack shrinking.
Stack growth happens while the original goroutine is running.
Stack shrinking happens asynchronously, during garbage collection.
While a channel listing a sudog is locked by select in this process,
no other goroutine can attempt to complete communication on that
channel, because that other goroutine doesn't hold the lock and can't
find the sudog. If the stack moves while all the channel locks are
held or when the sudogs are not yet or no longer queued in the
channels, no problem, because no goroutine can get to the sudogs and
therefore to selectdone. We only need to worry about the stack (and
'done uint32') moving with the sudogs queued in unlocked channels.
Stack shrinking can happen any time the goroutine is stopped.
That code already acquires all the channel locks before doing the
stack move, so it avoids this problem.
Stack growth can happen essentially any time the original goroutine is
running on its own stack (not the system stack). In the first half of
the select, all the channels are locked before any sudogs are queued,
and the channels are not unlocked until the goroutine has stopped
executing on its own stack and is asleep, so that part is OK. In the
second half of the select, the goroutine wakes up on its own goroutine
stack and immediately locks all channels. But the actual call to lock
might grow the stack, before acquiring any locks. In that case, the
stack is moving with the sudogs queued in unlocked channels. Not good.
One goroutine has already won a cas on the old stack (that goroutine
woke up the selecting goroutine, moving it out of step 4), and the
fact that done = 1 now should prevent any other goroutines from
completing any other select cases. During the stack move, however,
sudog.selectdone is moved from pointing to the old done variable on
the old stack to a new memory location on the new stack. Another
goroutine might observe the moved pointer before the new memory
location has been initialized. If the new memory word happens to be
zero, that goroutine might win a cas on the new location, thinking it
can now complete the select (again). It will then complete a second
communication (reading from or writing to the goroutine stack
incorrectly) and then attempt to wake up the selecting goroutine,
which is already awake.
The scribbling over the goroutine stack unexpectedly is already bad,
but likely to go unnoticed, at least immediately. As for the second
wakeup, there are a variety of ways it might play out.
* The goroutine might not be asleep.
That will produce a runtime crash (throw) like in #17007:
runtime: gp: gp=0xc0422dcb60, goid=2299, gp->atomicstatus=8
runtime: g: g=0xa5cfe0, goid=0, g->atomicstatus=0
fatal error: bad g->status in ready
Here, atomicstatus=8 is copystack; the second, incorrect wakeup is
observing that the selecting goroutine is in state "Gcopystack"
instead of "Gwaiting".
* The goroutine might be sleeping in a send on a nil chan.
If it wakes up, it will crash with 'fatal error: unreachable'.
* The goroutine might be sleeping in a send on a non-nil chan.
If it wakes up, it will crash with 'fatal error: chansend:
spurious wakeup'.
* The goroutine might be sleeping in a receive on a nil chan.
If it wakes up, it will crash with 'fatal error: unreachable'.
* The goroutine might be sleeping in a receive on a non-nil chan.
If it wakes up, it will silently (incorrectly!) continue as if it
received a zero value from a closed channel, leaving a sudog queued on
the channel pointing at that zero vaue on the goroutine's stack; that
space will be reused as the goroutine executes, and when some other
goroutine finally completes the receive, it will do a stray write into
the goroutine's stack memory, which may cause problems. Then it will
attempt the real wakeup of the goroutine, leading recursively to any
of the cases in this list.
* The goroutine might have been running a select in a finalizer
(I hope not!) and might now be sleeping waiting for more things to
finalize. If it wakes up, as long as it goes back to sleep quickly
(before the real GC code tries to wake it), the spurious wakeup does
no harm (but the stack was still scribbled on).
* The goroutine might be sleeping in gcParkAssist.
If it wakes up, that will let the goroutine continue executing a bit
earlier than we would have liked. Eventually the GC will attempt the
real wakeup of the goroutine, leading recursively to any of the cases
in this list.
* The goroutine cannot be sleeping in bgsweep, because the background
sweepers never use select.
* The goroutine might be sleeping in netpollblock.
If it wakes up, it will crash with 'fatal error: netpollblock:
corrupted state'.
* The goroutine might be sleeping in main as another thread crashes.
If it wakes up, it will exit(0) instead of letting the other thread
crash with a non-zero exit status.
* The goroutine cannot be sleeping in forcegchelper,
because forcegchelper never uses select.
* The goroutine might be sleeping in an empty select - select {}.
If it wakes up, it will return to the next line in the program!
* The goroutine might be sleeping in a non-empty select (again).
In this case, it will wake up spuriously, with gp.param == nil (no
reason for wakeup), but that was fortuitously overloaded for handling
wakeup due to a closing channel and the way it is handled is to rerun
the select, which (accidentally) handles the spurious wakeup
correctly:
if cas == nil {
// This can happen if we were woken up by a close().
// TODO: figure that out explicitly so we don't need this loop.
goto loop
}
Before looping, it will dequeue all the sudogs on all the channels
involved, so that no other goroutine will attempt to wake it.
Since the goroutine was blocked in select before, being blocked in
select again when the spurious wakeup arrives may be quite likely.
In this case, the spurious wakeup does no harm (but the stack was
still scribbled on).
* The goroutine might be sleeping in semacquire (mutex slow path).
If it wakes up, that is taken as a signal to try for the semaphore
again, not a signal that the semaphore is now held, but the next
iteration around the loop will queue the sudog a second time, causing
a cycle in the wakeup list for the given address. If that sudog is the
only one in the list, when it is eventually dequeued, it will
(due to the precise way the code is written) leave the sudog on the
queue inactive with the sudog broken. But the sudog will also be in
the free list, and that will eventually cause confusion.
* The goroutine might be sleeping in notifyListWait, for sync.Cond.
If it wakes up, (*Cond).Wait returns. The docs say "Unlike in other
systems, Wait cannot return unless awoken by Broadcast or Signal,"
so the spurious wakeup is incorrect behavior, but most callers do not
depend on that fact. Eventually the condition will happen, attempting
the real wakeup of the goroutine and leading recursively to any of the
cases in this list.
* The goroutine might be sleeping in timeSleep aka time.Sleep.
If it wakes up, it will continue running, leaving a timer ticking.
When that time bomb goes off, it will try to ready the goroutine
again, leading to any one of the cases in this list.
* The goroutine cannot be sleeping in timerproc,
because timerproc never uses select.
* The goroutine might be sleeping in ReadTrace.
If it wakes up, it will print 'runtime: spurious wakeup of trace
reader' and return nil. All future calls to ReadTrace will print
'runtime: ReadTrace called from multiple goroutines simultaneously'.
Eventually, when trace data is available, a true wakeup will be
attempted, leading to any one of the cases in this list.
None of these fatal errors appear in any of the trybot or dashboard
logs. The 'bad g->status in ready' that happens if the goroutine is
running (the most likely scenario anyway) has happened once on the
dashboard and eight times in trybot logs. Of the eight, five were
atomicstatus=8 during net/http tests, so almost certainly this bug.
The other three were atomicstatus=2, all near code in select,
but in a draft CL by Dmitry that was rewriting select and may or may
not have had its own bugs.
This bug has existed since Go 1.4. Until then the select code was
implemented in C, 'done uint32' was a C stack variable 'uint32 done',
and C stacks never moved. I believe it has become more common recently
because of Brad's work to run more and more tests in net/http in
parallel, which lengthens race windows.
The fix is to run step 6 on the system stack,
avoiding possibility of stack growth.
Fixes#17007 and possibly other mysterious failures.
Change-Id: I9d6575a51ac96ae9d67ec24da670426a4a45a317
Reviewed-on: https://go-review.googlesource.com/34835
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This adds high-level descriptions of the scheduler structures, the
user and system stacks, error handling, and synchronization.
Change-Id: I1eed97c6dd4a6e3d351279e967b11c6e64898356
Reviewed-on: https://go-review.googlesource.com/34290
Reviewed-by: Rick Hudson <rlh@golang.org>
The comment describing the overall GC algorithm at the top of mgc.go
has gotten woefully out-of-date (and was possibly never
correct/complete). Update it to reflect the current workings of the
GC and the set of phases that we now divide it into.
Change-Id: I02143c0ebefe9d4cd7753349dab8045f0973bf95
Reviewed-on: https://go-review.googlesource.com/34711
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, the check for legal pointers in stack copying uses
_PageSize (8K) as the minimum legal pointer. By default, Linux won't
let you map under 64K, but
1) it's less clear what other OSes allow or will allow in the future;
2) while mapping the first page is a terrible idea, mapping anywhere
above that is arguably more justifiable;
3) the compiler only assumes the first physical page (4K) is never
mapped.
Make the runtime consistent with the compiler and more robust by
changing the bad pointer check to use 4K as the minimum legal pointer.
This came out of discussions on CLs 34663 and 34719.
Change-Id: Idf721a788bd9699fb348f47bdd083cf8fa8bd3e5
Reviewed-on: https://go-review.googlesource.com/34890
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
The existing implementations on AMD64 only detects AVX2 usability,
when they also contains BMI (bit-manipulation instructions).
These instructions crash the running program as 'unknown instructions'
on the architecture, e.g. i3-4000M, which supports AVX2 but not
support BMI.
This change added the detections for BMI1 and BMI2 to AMD64 runtime with
two flags as the result, `support_bmi1` and `support_bmi2`,
in runtime/runtime2.go. It also completed the condition to run AVX2 version
in packages crypto/sha1 and crypto/sha256.
Fixes#18512
Change-Id: I917bf0de365237740999de3e049d2e8f2a4385ad
Reviewed-on: https://go-review.googlesource.com/34850
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Android on ChromeOS uses a restrictive seccomp filter that blocks
sched_getaffinity, leading this code to index a slice by -errno.
Change-Id: Iec09a4f79dfbc17884e24f39bcfdad305de75b37
Reviewed-on: https://go-review.googlesource.com/34794
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Fixes misc/cgo/testsigfwd, enabled for mips{,le} with the next commit
(https://golang.org/cl/34646).
Change-Id: I2bec894b0492fd4d84dd73a4faa19eafca760107
Reviewed-on: https://go-review.googlesource.com/34645
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
CL 33652 removed the fake auxv for Android, and replaced it with
a /proc/self/auxv fallback. When /proc/self/auxv is unreadable,
however, hardware capabilities detection won't work and the runtime
will mistakenly think that floating point hardware is unavailable.
Fix this by always assuming floating point hardware on Android.
Manually tested on a Nexus 5 running Android 6.0.1. I suspect the
android/arm builder has a readable /proc/self/auxv and therefore
does not trigger the failure mode.
Change-Id: I95c3873803f9e17333c6cb8b9ff2016723104085
Reviewed-on: https://go-review.googlesource.com/34641
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
On Windows, CreateThread occasionally fails with ERROR_ACCESS_DENIED.
We're not sure why this is, but the Wine source code suggests that
this can happen when there's a concurrent CreateThread and ExitProcess
in the same process.
Fix this by setting a flag right before calling ExitProcess and
halting if CreateThread fails and this flag is set.
Updates #18253 (might fix it, but we're not sure this is the issue and
can't reproduce it on demand).
Change-Id: I1945b989e73a16cf28a35bf2613ffab07577ed4e
Reviewed-on: https://go-review.googlesource.com/34616
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Stop-the-world and freeze-the-world (used for unhandled panics) are
currently not safe to do at the same time. While a regular unhandled
panic can't happen concurrently with STW (if the P hasn't been
stopped, then the panic blocks the STW), a panic from a _SigThrow
signal can happen on an already-stopped P, racing with STW. When this
happens, freezetheworld sets sched.stopwait to 0x7fffffff and
stopTheWorldWithSema panics because sched.stopwait != 0.
Fix this by detecting when freeze-the-world happens before
stop-the-world has completely stopped the world and freeze the STW
operation rather than panicking.
Fixes#17442.
Change-Id: I646a7341221dd6d33ea21d818c2f7218e2cb7e20
Reviewed-on: https://go-review.googlesource.com/34611
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The runtime no longer hard-codes the offset of
reflect.methodValue.stack, so remove these obsolete comments. Also,
reflect.methodValue and runtime.reflectMethodValue must also agree
with reflect.makeFuncImpl, so update the comments on all three to
mention this.
This was pointed out by Minux on CL 31138.
Change-Id: Ic5ed1beffb65db76aca2977958da35de902e8e58
Reviewed-on: https://go-review.googlesource.com/34590
Reviewed-by: Keith Randall <khr@golang.org>
golang.org/issue/17594 was caused by additab being called more than once for
an itab. golang.org/cl/32131 fixed that by making the itabs local symbols,
but that in turn causes golang.org/issue/18252 because now there are now
multiple itab symbols in a process for a given (type,interface) pair and
different code paths can end up referring to different itabs which breaks
lots of reflection stuff. So this makes itabs global again and just takes
care to only call additab once for each itab.
Fixes#18252
Change-Id: I781a193e2f8dd80af145a3a971f6a25537f633ea
Reviewed-on: https://go-review.googlesource.com/34173
Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
It takes me several minutes every time I want to find where the linker
writes out the _func structures. Add some comments to make this
easier.
Change-Id: Ic75ce2786ca4b25726babe3c4fe9cd30c85c34e2
Reviewed-on: https://go-review.googlesource.com/34390
Reviewed-by: Ian Lance Taylor <iant@golang.org>
In the sampling tests, let the test pass if we get at least 10 samples.
Fixes#18332.
Change-Id: I8aad083d1a0ba179ad6663ff43f6b6b3ce1e18cd
Reviewed-on: https://go-review.googlesource.com/34507
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This fixes Linux and the *BSD platforms on 386/amd64.
A few OS/arch combinations were already saving registers and/or doing
something that doesn't clearly resemble the SysV C ABI; those have
been left alone.
Fixes#18328.
Change-Id: I6398f6c71020de108fc8b26ca5946f0ba0258667
Reviewed-on: https://go-review.googlesource.com/34501
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
I meant to say ~7, instead of ^7, in the review.
Fix build.
Change-Id: I5060bbcd98b4ab6f00251fdb68b6b35767e5acf1
Reviewed-on: https://go-review.googlesource.com/34411
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Explicitly filter any C-only cgo functions out of pclntable,
which allows them to be duplicated with the host binary.
Updates #18190.
Change-Id: I50d8706777a6133b3e95f696bc0bc586b84faa9e
Reviewed-on: https://go-review.googlesource.com/34199
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Also, if we changed the gsignal stack to match the stack we are
executing on, restore it when returning from the signal handler, for
safety.
Fixes#18255.
Change-Id: Ic289b36e4e38a56f8a6d4b5d74f68121c242e81a
Reviewed-on: https://go-review.googlesource.com/34239
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Change the openbsd runtime to use the current sys_kill and sys_thrkill
system calls.
Prior to OpenBSD 5.9 the sys_kill system call could be used with both
processes and threads. In OpenBSD 5.9 this functionality was split into
a sys_kill system call for processes (with a new syscall number) and a
sys_thrkill system call for threads. The original/legacy system call was
retained in OpenBSD 5.9 and OpenBSD 6.0, however has been removed and
will not exist in the upcoming OpenBSD 6.1 release.
Note: This change is needed to make Go work on OpenBSD 6.1 (to be
released in May 2017) and should be included in the Go 1.8 release.
This change also drops support for OpenBSD 5.8, which is already an
unsupported OpenBSD release.
Change-Id: I525ed9b57c66c0c6f438dfa32feb29c7eefc72b0
Reviewed-on: https://go-review.googlesource.com/34093
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Must add locations to the profile when generating a profile.proto.
This fixes#18229
Change-Id: I49cd63a30759d3fe8960d7b7c8bd5a554907f8d1
Reviewed-on: https://go-review.googlesource.com/34028
Reviewed-by: Michael Matloob <matloob@golang.org>
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This adds a counter for the number of times the application forced a
GC by, e.g., calling runtime.GC(). This is useful for detecting
applications that are overusing/abusing runtime.GC() or
debug.FreeOSMemory().
Fixes#18217.
Change-Id: I990ab7a313c1b3b7a50a3d44535c460d7c54f47d
Reviewed-on: https://go-review.googlesource.com/34067
Reviewed-by: Russ Cox <rsc@golang.org>
When we copy the stack, we need to adjust all BPs.
We correctly adjust the ones on the stack, but we also
need to adjust the one that is in g.sched.bp.
Like CL 33754, no test as only kernel-gathered profiles will notice.
Tests will come (in 1.9) with the implementation of #16638.
The invariant should hold that every frame pointer points to
somewhere within its stack. After this CL, it is mostly true, but
something about cgo breaks it. The runtime checks are disabled
until I figure that out.
Update #16638Fixes#18174
Change-Id: I6023ee64adc80574ee3e76491d4f0fa5ede3dbdb
Reviewed-on: https://go-review.googlesource.com/33895
Reviewed-by: Austin Clements <austin@google.com>
For reasons that I do not know, OpenBSD does not call pthread_create
directly, but instead looks it up in libpthread.so. That means that we
can't use the code used on other systems to retry pthread_create on
EAGAIN, since that code simply calls pthread_create.
This patch copies that code to an OpenBSD-specific version.
Also, check for an EAGAIN failure in the test, as that seems to be the
underlying cause of the test failure on several systems including OpenBSD.
Fixes#18146.
Change-Id: I3bceaa1e03a7eaebc2da19c9cc146b25b59243ef
Reviewed-on: https://go-review.googlesource.com/33905
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Seems to be fixed according to discussion on issue 16396.
Fixes#16396.
Change-Id: Ibac7037a24280204e48cb4d3000af524f65afd36
Reviewed-on: https://go-review.googlesource.com/33903
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Commit 303b69fe packed bitvectors more tightly, but missed a comment
describing their old layout. Update that comment.
Change-Id: I095ccb01f245197054252545f37b40605a550dec
Reviewed-on: https://go-review.googlesource.com/33718
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This reverts commit d24b57a6a1.
Reason for revert: Further complications arised (issue 18100). We'll try again in Go 1.9.
Change-Id: I5ca93d2643a4be877dd9c2d8df3359718440f02f
Reviewed-on: https://go-review.googlesource.com/33770
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
From the garbage collector's perspective, time can move backwards in
cgocall. However, in the midst of this time warp, the pointer
arguments to cgocall can go from dead back to live. If a stack growth
happens while they're dead and then a GC happens when they become live
again, GC can crash with a bad heap pointer.
Specifically, the sequence that leads to a panic is:
1. cgocall calls entersyscall, which saves the PC and SP of its call
site in cgocall. Call this PC/SP "X". At "X" both pointer arguments
are live.
2. cgocall calls asmcgocall. Call the PC/SP of this call "Y". At "Y"
neither pointer argument is live.
3. asmcgocall calls the C code, which eventually calls back into the
Go code.
4. cgocallbackg remembers the saved PC/SP "X" in some local variables,
calls exitsyscall, and then calls cgocallbackg1.
5. The Go code causes a stack growth. This stack unwind sees PC/SP "Y"
in the cgocall frame. Since the arguments are dead at "Y", they are
not adjusted.
6. The Go code returns to cgocallbackg1, which calls reentersyscall
with the recorded saved PC/SP "X", so "X" gets stashed back into
gp.syscallpc/sp.
7. GC scans the stack. It sees there's a saved syscall PC/SP, so it
starts the traceback at PC/SP "X". At "X" the arguments are considered
live, so it scans them, but since they weren't adjusted, the pointers
are bad, so it panics.
This issue started as of commit ca4089ad, when the compiler stopped
marking arguments as live for the whole function.
Since this is a variable liveness issue, fix it by adding KeepAlive
calls that keep the arguments live across this whole time warp.
The existing issue7978 test has all of the infrastructure for testing
this except that it's currently up to chance whether a stack growth
happens in the callback (it currently only happens on the
linux-amd64-noopt builder, for example). Update this test to force a
stack growth, which causes it to fail reliably without this fix.
Fixes#17785.
Change-Id: If706963819ee7814e6705693247bcb97a6f7adb8
Reviewed-on: https://go-review.googlesource.com/33710
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Android's libc doesn't provide access to auxv, so currently the Go
runtime synthesizes a fake, minimal auxv when loaded as a library on
Android. This used to be sufficient, but now we depend on auxv to
retrieve the system physical page size and panic if we can't retrieve
it.
Fix this by falling back to reading auxv from /proc/self/auxv if the
loader-provided auxv is empty and removing the synthetic auxv vectors.
Fixes#18041.
Change-Id: Ia2ec2c764a6609331494a5d359032c56cbb83482
Reviewed-on: https://go-review.googlesource.com/33652
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
The pprof code discards all heap allocations made by runtime
routines. This caused it to discard heap allocations made by functions
called by reflect.Call, as the calls are made via the functions
`runtime.call32`, `runtime.call64`, etc. Fix the profiler to retain
these heap allocations.
Fixes#18077.
Change-Id: I8962d552f1d0b70fc7e6f7b2dbae8d5bdefb0735
Reviewed-on: https://go-review.googlesource.com/33635
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
When transitioning from C code to Go code we must respect the C
calling convention. On s390x this means that r6-r13, r15 and f8-f15
must be saved and restored by functions that use them.
On s390x we were saving the wrong set of floating point registers
(f0, f2, f4 and f6) rather than f8-f15 which means that Go code
could clobber registers that C code expects to be restored. This
CL modifies the crosscall functions on s390x to save/restore the
correct floating point registers.
Fixes#18035.
Change-Id: I5cc6f552c893a4e677669c8891521bf735492e97
Reviewed-on: https://go-review.googlesource.com/33571
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Applies the fix from CL 32920 to the new test TestSampledHeapAllocProfile
introduced in CL 33422. The test should be skipped rather than fail if
there is only one executable region of memory.
Updates #17852.
Change-Id: Id8c47b1f17ead14f02a58a024c9a04ebb8ec0429
Reviewed-on: https://go-review.googlesource.com/33453
Run-TryBot: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The expected default behavior (no explicit GOTRACEBACK setting)
is for the stack trace to start in user code, eliding unnecessary runtime
frames that led up to the actual trace printing code. The idea was that
the first line number printed was the one that crashed.
For #5832 we added code to show 'panic' frames so that if code panics
and then starts running defers and then we trace from there, the panic
frame can help explain why the code seems to have made a call not
present in the code. But that's only needed for panics between two different
call frames, not the panic at the very top of the stack trace.
Fix the fix to again elide the runtime code at the very top of the stack trace.
Simple panic:
package main
func main() {
var x []int
println(x[1])
}
Before this CL:
panic: runtime error: index out of range
goroutine 1 [running]:
panic(0x1056980, 0x1091bf0)
/Users/rsc/go/src/runtime/panic.go:531 +0x1cf
main.main()
/tmp/x.go:5 +0x5
After this CL:
panic: runtime error: index out of range
goroutine 1 [running]:
main.main()
/tmp/x.go:5 +0x5
Panic inside defer triggered by panic:
package main
func main() {
var x []int
defer func() {
println(x[1])
}()
println(x[2])
}
Before this CL:
panic: runtime error: index out of range
panic: runtime error: index out of range
goroutine 1 [running]:
panic(0x1056aa0, 0x1091bf0)
/Users/rsc/go/src/runtime/panic.go:531 +0x1cf
main.main.func1(0x0, 0x0, 0x0)
/tmp/y.go:6 +0x62
panic(0x1056aa0, 0x1091bf0)
/Users/rsc/go/src/runtime/panic.go:489 +0x2cf
main.main()
/tmp/y.go:8 +0x59
The middle panic is important: it explains why main.main ended up calling main.main.func1 on a line that looks like a call to println. The top panic is noise.
After this CL:
panic: runtime error: index out of range
panic: runtime error: index out of range
goroutine 1 [running]:
main.main.func1(0x0, 0x0, 0x0)
/tmp/y.go:6 +0x62
panic(0x1056ac0, 0x1091bf0)
/Users/rsc/go/src/runtime/panic.go:489 +0x2cf
main.main()
/tmp/y.go:8 +0x59
Fixes#17901.
Change-Id: Id6d7c76373f7a658a537a39ca32b7dc23e1e76aa
Reviewed-on: https://go-review.googlesource.com/33165
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
When debug is 0, emit the compressed proto format.
The debug>0 format stays the same.
Updates #16093
Change-Id: I45aa1874a22d34cf44dd4aa78bbff9302381cb34
Reviewed-on: https://go-review.googlesource.com/33422
Run-TryBot: Michael Matloob <matloob@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
When we raise a signal that was delivered to C code, it's possible that
the kernel will not deliver it immediately. This is especially possible
on Darwin where we use send the signal to the entire process rather than
just the current thread. Sleep for a millisecond after sending the
signal to give it a chance to be delivered before we restore the Go
signal handler. In most real cases the program is going to crash at this
point, so sleeping is kind of irrelevant anyhow.
Fixes#14809.
Change-Id: Ib2c0d2c4e240977fb4535dc1dd2bdc50d430eb85
Reviewed-on: https://go-review.googlesource.com/33300
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Updates #17786. Will fix mips(32) when the port is fully landed.
Change-Id: I00d4ff666ec14a38cadbcd52569b347bb5bc8b75
Reviewed-on: https://go-review.googlesource.com/33236
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
count profiles with debug=1 retain their previous format.
Also add a test check for the proto profiles since all runtime/pprof
tests only look at the debug=1 profiles.
Change-Id: Ibe805585b597e5d3570807115940a1dc4535c03f
Reviewed-on: https://go-review.googlesource.com/33148
Run-TryBot: Michael Matloob <matloob@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
If the scheduler has no user work and there's no GC work visible, it
puts the P to sleep (or blocks on the network). However, if we later
enqueue more GC work, there's currently nothing that specifically
wakes up the scheduler to let it start an idle GC worker. As a result,
we can underutilize the CPU during GC if Ps have been put to sleep.
Fix this by making GC wake idle Ps when work buffers are put on the
full list. We already have a hook to do this, since we use this to
preempt a random P if we need more dedicated workers. We expand this
hook to instead wake an idle P if there is one. The logic we use for
this is identical to the logic used to wake an idle P when we ready a
goroutine.
To make this really sound, we also fix the scheduler to re-check the
idle GC worker condition after releasing its P. This closes a race
where 1) the scheduler checks for idle work and finds none, 2) new
work is enqueued but there are no idle Ps so none are woken, and 3)
the scheduler releases its P.
There is one subtlety here. Currently we call enlistWorker directly
from putfull, but the gcWork is in an inconsistent state in the places
that call putfull. This isn't a problem right now because nothing that
enlistWorker does touches the gcWork, but with the added call to
wakep, it's possible to get a recursive call into the gcWork
(specifically, while write barriers are disallowed, this can do an
allocation, which can dispose a gcWork, which can put a workbuf). To
handle this, we lift the enlistWorker calls up a layer and delay them
until the gcWork is in a consistent state.
Fixes#14179.
Change-Id: Ia2467a52e54c9688c3c1752e1fc00f5b37bbfeeb
Reviewed-on: https://go-review.googlesource.com/32434
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Idle GC workers trigger whenever there's a GC running and the
scheduler doesn't find any other work. However, they currently run for
a full scheduler quantum (~10ms) once started.
This is really bad for event-driven applications, where work may come
in on the network hundreds of times during that window. In the
go-gcbench rpc benchmark, this is bad enough to often cause effective
STWs where all Ps are in the idle worker. When this happens, we don't
even poll the network any more (except for the background 10ms poll in
sysmon), so we don't even know there's more work to do.
Fix this by making idle workers check with the scheduler roughly every
100 µs to see if there's any higher-priority work the P should be
doing. This check includes polling the network for incoming work.
Fixes#16528.
Change-Id: I6f62ebf6d36a92368da9891bafbbfd609b9bd003
Reviewed-on: https://go-review.googlesource.com/32433
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Before this CL, Go programs in c-archive or c-shared buildmodes
would not handle SIGPIPE. That leads to surprising behaviour where
writes on a closed pipe or socket would raise SIGPIPE and terminate
the program. This CL changes the Go runtime to handle
SIGPIPE regardless of buildmode. In addition, SIGPIPE from non-Go
code is forwarded.
Fixes#17393
Updates #16760
Change-Id: I155e82020a03a5cdc627a147c27da395662c3fe8
Reviewed-on: https://go-review.googlesource.com/32796
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently there are no diagnostics for mark root check during marking.
Fix this by printing out the same diagnostics we print during mark
termination.
Also, drop the allglock before throwing. Holding that across a throw
causes a self-deadlock with tracebackothers.
For #16083.
Change-Id: Ib605f3ae0c17e70704b31d8378274cfaa2307dc2
Reviewed-on: https://go-review.googlesource.com/33339
Reviewed-by: Rick Hudson <rlh@golang.org>
Not sure what I was thinking.
Change-Id: I143cdf7c5ef8e7b2394afeca6b30c46bb2c19a55
Reviewed-on: https://go-review.googlesource.com/33340
Reviewed-by: Ian Lance Taylor <iant@golang.org>
If a program has had its text section split into multiple
sections then the ftab that is built is based on addresses
prior to splitting. That means all the function addresses
are there and correct because of relocation but the
but the computed idx won't always match up quite right and
in some cases go beyond the end of the table, causing a panic.
To resolve this, determine if the idx is too large and if it is,
set it to the last index in ftab. Then search backward to find the
matching function address.
Fixes#17854
Change-Id: I6940e76a5238727b0a9ac23dc80000996db2579a
Reviewed-on: https://go-review.googlesource.com/32972
Reviewed-by: David Chase <drchase@google.com>
Zero out the sigaction structs, in case the sa_restorer field is set.
Clear the SA_RESTORER flag; it is part of the kernel interface, not the
libc interface.
Fixes#17947.
Change-Id: I610348ce3c196d3761cf2170f06c24ecc3507cf7
Reviewed-on: https://go-review.googlesource.com/33331
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Autotmp filtering was too aggressive and excluded types
necessary to make debuggers work properly. Restore the
"late filter" in dwarf.go based on names to exclude autotmps,
and remove the "early filter" in pgen.go based on how the
name was introduced. However, the updated naming scheme
with a dot prefix is retained to prevent accidental clashes
with legal Go identifier names.
Includes test (grouped with runtime gdb tests),
verified to fail without the fix.
Updates #17644.
Fixes#17830.
Change-Id: I7ec3f7230083889660236e5f6bc77ba5fe434e93
Reviewed-on: https://go-review.googlesource.com/33233
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This ensures that runtime's signal handlers pass through the TSAN and
MSAN libc interceptors and subsequent calls to the intercepted
sigaction function from C will correctly see them.
Fixes#17753.
Change-Id: I9798bb50291a4b8fa20caa39c02a4465ec40bb8d
Reviewed-on: https://go-review.googlesource.com/33142
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In plugins and every program that opens a plugin, include a hash of
every imported package.
There are two versions of each hash: one local and one exported.
As the program starts and plugins are loaded, the first exported
symbol for each package becomes the canonical version.
Any subsequent plugin's local package hash symbol has to match the
canonical version.
Fixes#17832
Change-Id: I4e62c8e1729d322e14b1673bada40fa7a74ea8bc
Reviewed-on: https://go-review.googlesource.com/33161
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Add a variant of sync/atomic's TestUnaligned64 to
runtime/internal/atomic.
Skips the test on arm for now where it's currently failing.
Updates #17786
Change-Id: If63f9c1243e9db7b243a95205b2d27f7d1dc1e6e
Reviewed-on: https://go-review.googlesource.com/33159
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This change is an experimental implementation of asynchronous
cancelable I/O operations on Plan 9, which are required to
implement deadlines.
There are no asynchronous syscalls on Plan 9. I/O operations
are performed with blocking pread and pwrite syscalls.
Implementing deadlines in Go requires a way to interrupt
I/O operations.
It is possible to interrupt reads and writes on a TCP connection
by forcing the closure of the TCP connection. This approach
has been used successfully in CL 31390.
However, we can't implement deadlines with this method, since
we require to be able to reuse the connection after the timeout.
On Plan 9, I/O operations are interrupted when the process
receives a note. We can rely on this behavior to implement
a more generic approach.
When doing an I/O operation (read or write), we start the I/O in
its own process, then wait for the result asynchronously. The
process is able to handle the "hangup" note. When receiving the
"hangup" note, the currently running I/O operation is canceled
and the process returns.
This way, deadlines can be implemented by sending an "hangup"
note to the process running the blocking I/O operation, after
the expiration of a timer.
Fixes#11932.
Fixes#17498.
Change-Id: I414f72c7a9a4f9b8f9c09ed3b6c269f899d9b430
Reviewed-on: https://go-review.googlesource.com/31521
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
When a Go program crashes with GOTRACEBACK=crash, the OS creates a
core dump. Include the text-formatted output of some of the cause of
that crash in the core dump.
Output printed by the runtime before crashing is maintained in a
circular buffer to allow access to messages that may be printed
immediately before calling runtime.throw.
The stack traces printed by the runtime as it crashes are not stored.
The information required to recreate them should be included in the
core file.
Updates #16893
There are no tests covering the generation of core dumps; this change
has not added any.
This adds (reentrant) locking to runtime.gwrite, which may have an
undesired performance impact.
Change-Id: Ia2463be3c12429354d290bdec5f3c8d565d1a2c3
Reviewed-on: https://go-review.googlesource.com/32013
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
I don't have any way to test or reproduce this problem,
but the current code is clearly wrong for Windows.
Make it better.
As I said on #17165:
But the borrowing of M's and the profiling of M's by the CPU profiler
seem not synchronized enough. This code implements the CPU profiler
on Windows:
func profileloop1(param uintptr) uint32 {
stdcall2(_SetThreadPriority, currentThread, _THREAD_PRIORITY_HIGHEST)
for {
stdcall2(_WaitForSingleObject, profiletimer, _INFINITE)
first := (*m)(atomic.Loadp(unsafe.Pointer(&allm)))
for mp := first; mp != nil; mp = mp.alllink {
thread := atomic.Loaduintptr(&mp.thread)
// Do not profile threads blocked on Notes,
// this includes idle worker threads,
// idle timer thread, idle heap scavenger, etc.
if thread == 0 || mp.profilehz == 0 || mp.blocked {
continue
}
stdcall1(_SuspendThread, thread)
if mp.profilehz != 0 && !mp.blocked {
profilem(mp)
}
stdcall1(_ResumeThread, thread)
}
}
}
func profilem(mp *m) {
var r *context
rbuf := make([]byte, unsafe.Sizeof(*r)+15)
tls := &mp.tls[0]
gp := *((**g)(unsafe.Pointer(tls)))
// align Context to 16 bytes
r = (*context)(unsafe.Pointer((uintptr(unsafe.Pointer(&rbuf[15]))) &^ 15))
r.contextflags = _CONTEXT_CONTROL
stdcall2(_GetThreadContext, mp.thread, uintptr(unsafe.Pointer(r)))
sigprof(r.ip(), r.sp(), 0, gp, mp)
}
func sigprof(pc, sp, lr uintptr, gp *g, mp *m) {
if prof.hz == 0 {
return
}
// Profiling runs concurrently with GC, so it must not allocate.
mp.mallocing++
... lots of code ...
mp.mallocing--
}
A borrowed M may migrate between threads. Between the
atomic.Loaduintptr(&mp.thread) and the SuspendThread, mp may have
moved to a new thread, so that it's in active use. In particular
it might be calling malloc, as in the crash stack trace. If so, the
mp.mallocing++ in sigprof would provoke the crash.
Those lines are trying to guard against allocation during sigprof.
But on Windows, mp is the thread being traced, not the current
thread. Those lines should really be using getg().m.mallocing, which
is the same on Unix but not on Windows. With that change, it's
possible the race on the actual thread is not a problem: the traceback
would get confused and eventually return an error, but that's fine.
The code expects that possibility.
Fixes#17165.
Change-Id: If6619731910d65ca4b1a6e7de761fa2518ef339e
Reviewed-on: https://go-review.googlesource.com/33132
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
All the existing CPU profiler tests already parse the profile.
That should be sufficient indication that profiles can be parsed.
Fixes#17853.
Change-Id: Ie8a190e2ae4eef125c8eb0d4e8b7adac420abbdb
Reviewed-on: https://go-review.googlesource.com/33136
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
rsc's change golang.org/cl/32455 added a mechanism
that allows pprof to depend on gzip without introducing
an import cycle. This obsoletes the need for the gzip0
package, which was created solely to remove the need
for that dependency.
Change-Id: Ifa3b98faac9b251f909b84b4da54742046c4e3ad
Reviewed-on: https://go-review.googlesource.com/33137
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This change buffers the entire profile and converts in one shot
in the profile writer, and could use more memory than necessary
to output protocol buffer formatted profiles. It should be
possible to convert each chunk in a stream (maybe maintaining
some minimal state to output in the end) which could save on
memory usage.
Fixes#16093
Change-Id: I946c6a2b044ae644c72c8bb2d3bd82c415b1a847
Reviewed-on: https://go-review.googlesource.com/33071
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
A Go binary may only have 1 executable memory region if it has been
linked using internal linking. This change means that the test will
be skipped if this is the case, rather than fail.
Fixes#17852.
Change-Id: I59459a0f90ae8963aeb9908e5cb9fb64d7d0e0f4
Reviewed-on: https://go-review.googlesource.com/32920
Run-TryBot: Michael Munday <munday@ca.ibm.com>
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
This change adds code, originally written by Russ Cox <rsc@golang.org>
and open-sourced by Google, that converts from the "legacy"
binary pprof profile format to a struct representation of the
new protocol buffer pprof profile format.
This code reads the entire binary format for conversion to the
protobuf format. In a future change, we will update the code
to incrementally read and convert segments of the binary format,
so that the entire profile does not need to be stored in memory.
This change also contains contributions by Daria Kolistratova
<daria.kolistratova@intel.com> from the rolled-back change
golang.org/cl/30556 adapting the code to be used by the package
runtime/pprof.
This code also appeared in the change golang.org/cl/32257, which was based
on Daria Kolistratova's change, but was also rolled back.
Updates #16093
Change-Id: I5c768b1134bc15408d80a3ccc7ed867db9a1c63d
Reviewed-on: https://go-review.googlesource.com/32811
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Fixes#17811
Change-Id: I7bf9cbc5245417047ad28a14d9b9ad6592607d3d
Reviewed-on: https://go-review.googlesource.com/32774
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
I used the slowtests.go tool as described in
https://golang.org/cl/32684 on packages that stood out.
go test -short std drops from ~56 to ~52 seconds.
This isn't a huge win, but it was mostly an exercise.
Updates #17751
Change-Id: I9f3402e36a038d71e662d06ce2c1d52f6c4b674d
Reviewed-on: https://go-review.googlesource.com/32751
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
mheap_.heap_live is an atomically accessed uint64. It is currently not 8-byte
aligned on 32-bit platforms, which has been okay because it's only accessed via
Xadd64, which doesn't require alignment on 386 or ARM32. However, Xadd64 on
MIPS32 does require 8-byte alignment.
Add a padding field to force 8-byte alignment of heap_live and prevent an
alignment check crash on MIPS32.
Change-Id: I7eddf7883aec7a0a7e0525af5d58ed4338a401d0
Reviewed-on: https://go-review.googlesource.com/31635
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Newer versions of gcc notice a type mismatch and complain.
Fix code to match documented signature in MSDN.
Trybots say this still compiles with the older (5.1) version
of gcc.
Fixes#17771.
Change-Id: Ib3fe6f71b40751e1146249e31232da5ac69b9e00
Reviewed-on: https://go-review.googlesource.com/32646
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
OpenBSD's scheduler causes preemption to take 20+ms, so 30ms is not
enough time for 3 goroutines to run. This change continues to sleep for
30ms, but if it finds that the 3 goroutines have not run, it sleeps for
an additional 1s before declaring failure.
Updates #17712
Change-Id: I3e886e40d05192b7cb71b4f242af195836ef62a8
Reviewed-on: https://go-review.googlesource.com/32634
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Quentin Smith <quentin@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The code to do the conversion is smaller than the
call to the runtime.
The 1-result asserts need to call panic if they fail, but that
code is out of line.
The only conversions left in the runtime are those which
might allocate and those which might need to generate an itab.
Given the following types:
type E interface{}
type I interface { foo() }
type I2 iterface { foo(); bar() }
type Big [10]int
func (b Big) foo() { ... }
This CL inlines the following conversions:
was assertE2T
var e E = ...
b := i.(Big)
was assertE2T2
var e E = ...
b, ok := i.(Big)
was assertI2T
var i I = ...
b := i.(Big)
was assertI2T2
var i I = ...
b, ok := i.(Big)
was assertI2E
var i I = ...
e := i.(E)
was assertI2E2
var i I = ...
e, ok := i.(E)
These are the remaining runtime calls:
convT2E:
var b Big = ...
var e E = b
convT2I:
var b Big = ...
var i I = b
convI2I:
var i2 I2 = ...
var i I = i2
assertE2I:
var e E = ...
i := e.(I)
assertE2I2:
var e E = ...
i, ok := e.(I)
assertI2I:
var i I = ...
i2 := i.(I2)
assertI2I2:
var i I = ...
i2, ok := i.(I2)
Fixes#17405Fixes#8422
Change-Id: Ida2367bf8ce3cd2c6bb599a1814f1d275afabe21
Reviewed-on: https://go-review.googlesource.com/32313
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
The runtime.typeEquals function is used during typelinksinit to
determine the canonical set of *_type values to use throughout the
runtime. As such, it is run against non-canonical *_type values, that
is, types from modules that are duplicates of a type from another
module that was loaded earlier in the program life.
These non-canonical *_type values sometimes contain pointers. These
pointers are pointing to position-independent data, and so they are set
by ld.so using dynamic relocations when the module is loaded. As such,
the pointer can point to the equivalent memory from a previous module.
This means if typesEqual follows a pointer inside a *_type, it can end
up at a piece of memory from another module. If it reads a typeOff or
nameOff from that memory and attempts to resolve it against the
non-canonical *_type from the later module, it will end up with a
reference to junk memory.
Instead, resolve against the pointer the offset was read from, so the
data is valid.
Fixes#17709.
Should no longer matter after #17724 is resolved in a later Go.
Change-Id: Ie88b151a3407d82ac030a97b5b6a19fc781901cb
Reviewed-on: https://go-review.googlesource.com/32513
Run-TryBot: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This makes no practical difference, as SIGSTOP can not be caught, but
may as well be consistent.
Change-Id: I3efbbf092388bb3f6dccc94cf703c5d94d35f6a1
Reviewed-on: https://go-review.googlesource.com/32533
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
sigfwd calls an arbitrary C signal handler function. The System V ABI
for x86_64 (and the most recent revision of the ABI for i386) requires
the stack to be 16-byte aligned.
Fixes: #17641
Change-Id: I77f53d4a8c29c1b0fe8cfbcc8d5381c4e6f75a6b
Reviewed-on: https://go-review.googlesource.com/32107
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The introduction of -buildmode=plugin means modules can be added to a
Go program while it is running. This means there exists some time
while the program is running with the module is on the moduledata
linked list, but it has not been initialized to the satisfaction of
other parts of the runtime. Notably, the GC.
This CL adds a new way of access modules, an activeModules function.
It returns a slice of modules that is built in the background and
atomically swapped in. The parts of the runtime that need to wait on
module initialization can use this slice instead of the linked list.
Fixes#17455
Change-Id: I04790fd07e40c7295beb47cea202eb439206d33d
Reviewed-on: https://go-review.googlesource.com/32357
Reviewed-by: Ian Lance Taylor <iant@golang.org>
- Adds overflow checks
- Adds parsing of negative integers
- Adds boolean return value to signal parsing errors
- Adds atoi32 for parsing of integers that fit in an int32
- Adds tests
Handling of errors to provide error messages
at the call sites is left to future CLs.
Updates #17718
Change-Id: I3cacd0ab1230b9efc5404c68edae7304d39bcbc0
Reviewed-on: https://go-review.googlesource.com/32390
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This implements a check that can be done at runtime for the ISA level and
hardware capability. It follows the same implementation as in s390x.
These checks will be important as we enable new instructions and write go
asm implementations using those.
Updates #15403Fixes#16643
Change-Id: Idfee374a3ffd7cf13a7d8cf0a6c83d247d3bee16
Reviewed-on: https://go-review.googlesource.com/32330
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently we have write barriers for direct channel sends, where the
receiver is blocked and the sender is writing directly to the
receiver's stack; but not for direct channel receives, where the
sender is blocked and the receiver is reading directly from the
sender's stack.
This was okay with the old write barrier because either 1) the
receiver would write the received pointer into the heap (causing it to
be shaded), 2) the pointer would still be on the receiver's stack at
mark termination and we would rescan it, or 3) the receiver dropped
the pointer so it wasn't necessarily reachable anyway.
This is not okay with the write barrier because it lets a grey stack
send a white pointer to a black stack and then remove it from its own
stack. If the grey stack was the sole grey-protector of this pointer,
this hides the object from the garbage collector.
Fix this by making direct receives perform a stack-to-stack write
barrier just like direct sends do.
Fixes#17694.
Change-Id: I1a4cb904e4138d2ac22f96a3e986635534a5ae41
Reviewed-on: https://go-review.googlesource.com/32450
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, assists can only perform heap marking jobs. However, at the
beginning of GC, there are only root jobs and no heap marking jobs. As
a result, there's often a period at the beginning of a GC cycle where
no goroutine has accumulated assist credit, but at the same time it
can't get any credit because there are no heap marking jobs for it to
do yet. As a result, many goroutines often block on the assist queue
at the very beginning of the GC cycle.
This commit fixes this by allowing assists to perform root marking
jobs. The tricky part of this (and the reason we haven't done this
before) is that stack scanning jobs can lead to deadlocks if the
goroutines performing the stack scanning are themselves
non-preemptible, since two non-preemptible goroutines may try to scan
each other. To address this, we use the same insight d6625ca used to
simplify the mark worker stack scanning: as long as we're careful with
the stacks and only drain jobs while on the system stack, we can put
the goroutine into a preemptible state while we drain jobs. This means
an assist's user stack can be scanned while it continues to do work.
This reduces the rate of assist blocking in the x/benchmarks HTTP
benchmark by a factor of 3 and all remaining blocking happens towards
the *end* of the GC cycle, when there may genuinely not be enough work
to go around.
Ideally, assists would get credit for working on root jobs. Currently
they do not; however, this change prioritizes heap work over root jobs
in assists, so they're likely to mostly perform heap work. In contrast
with mark workers, for assists, the root jobs act only as a backstop
to create heap work when there isn't enough heap work.
Fixes#15361.
Change-Id: If6e169863e4ad75710b0c8dc00f6125b41e9a595
Reviewed-on: https://go-review.googlesource.com/32432
Reviewed-by: Rick Hudson <rlh@golang.org>
This lifts the part of gcAssistAlloc that runs on the system stack to
its own function in preparation for letting assists perform root jobs
(notably stack scanning). This makes it easy to see that there are no
references to the user stack once we've entered gcAssistAlloc1, which
means it's safe to shrink the stack while in gcAssistAlloc1.
This does not yet make assists perform root jobs, so it's not actually
possible for the stack to shrink yet. That will happen in the next
commit.
The code in gcAssistAlloc1 is identical to the code that's currently
passed in a closure to systemstack with one exception. Currently, we
set the "completed" variable in the enclosing scope to indicate that
the assist completed the mark phase. This is exactly the sort of
cross-stack reference lifting this function is meant to prevent. We
replace this variable with setting gp.param to nil or non-nil to
indicate the completion status.
Updates #15361.
Change-Id: Iba7cfb758c781070a441aea86c0117b399a24dbd
Reviewed-on: https://go-review.googlesource.com/32431
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
This reverts commit b33030a727.
Reason for revert: We're going to try to get the code in this change
submitted in smaller, more carefully reviewed changes.
Change-Id: I4175f4b297f0e69fb78b11f9dc0bd82f27865be7
Reviewed-on: https://go-review.googlesource.com/32441
Reviewed-by: Russ Cox <rsc@golang.org>
The map[typeOff]*_type object is created at run time and stored in
the moduledata. The moduledata object is marked by the linker as
SNOPTRDATA, so the reference is ignored by the GC. Running
misc/cgo/testplugin/test.bash with GOGC=1 will eventually collect
the typemap and crash.
This bug probably comes up in -linkshared binaries in Go 1.7.
I don't know why we haven't seen a report about this yet.
Fixes#17680
Change-Id: I0e9b5c006010e8edd51d9471651620ba665248d3
Reviewed-on: https://go-review.googlesource.com/32430
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Plumb the import path of a plugin package through to the linker, and
use it as the prefix on the exported symbol names.
Before this we used the basename of the plugin file as the prefix,
which could conflict and result in multiple loaded plugins sharing
symbols that are distinct.
Fixes#17155Fixes#17579
Change-Id: I7ce966ca82d04e8507c0bcb8ea4ad946809b1ef5
Reviewed-on: https://go-review.googlesource.com/32355
Reviewed-by: Ian Lance Taylor <iant@golang.org>
No point in computing this info on startup.
Compute it at build time.
This lets us spend more time computing & checking the size classes.
Improve the div magic for rounding to the start of an object.
We can now use 32-bit multiplies & shifts, which should help
32-bit platforms.
The static data is <1KB.
The actual size classes are not changed by this CL.
Change-Id: I6450cec7d1b2b4ad31fd3f945f504ed2ec6570e7
Reviewed-on: https://go-review.googlesource.com/32219
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
I did 'export GORACE=atexit_sleep_ms=0' in a console
and then was puzzled as to why race tests fail.
Existing GORACE env var may (or may not) override
the one that we setup.
Filter out GORACE as we do for other important env vars.
Change-Id: I29be86b0cbb9b5dc7f9efb15729ade86fc79b0e0
Reviewed-on: https://go-review.googlesource.com/32163
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
On solaris/amd64 sometimes the reported cycle count is negative. Replace
with 0.
Change-Id: I364eea5ca072281245c7ab3afb0bf69adc3a8eae
Reviewed-on: https://go-review.googlesource.com/32258
Reviewed-by: Ian Lance Taylor <iant@golang.org>
On amd64p32, rt0_go attempts to reserve 128 bytes of scratch space on
the stack, but due to a register mixup this ends up being a no-op. Fix
this so we actually reserve the stack space.
Change-Id: I04dbfbeb44f3109528c8ec74e1136bc00d7e1faa
Reviewed-on: https://go-review.googlesource.com/32331
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
With the hybrid barrier in place, we can now disable stack rescanning
by default. This commit adds a "gcrescanstacks" GODEBUG variable that
is off by default but can be set to re-enable STW stack rescanning.
The plan is to leave this off but available in Go 1.8 for debugging
and as a fallback.
With this change, worst-case mark termination time at GOMAXPROCS=12
*not* including time spent stopping the world (which is still
unbounded) is reliably under 100 µs, with a 95%ile around 50 µs in
every benchmark I tried (the go1 benchmarks, the x/benchmarks garbage
benchmark, and the gcbench activegs and rpc benchmarks). Including
time spent stopping the world usually adds about 20 µs to total STW
time at GOMAXPROCS=12, but I've seen it add around 150 µs in these
benchmarks when a goroutine takes time to reach a safe point (see
issue #10958) or when stopping the world races with goroutine
switches. At GOMAXPROCS=1, where this isn't an issue, worst case STW
is typically 30 µs.
The go-gcbench activegs benchmark is designed to stress large numbers
of dirty stacks. This commit reduces 95%ile STW time for 500k dirty
stacks by nearly three orders of magnitude, from 150ms to 195µs.
This has little effect on the throughput of the go1 benchmarks or the
x/benchmarks benchmarks.
name old time/op new time/op delta
XGarbage-12 2.31ms ± 0% 2.32ms ± 1% +0.28% (p=0.001 n=17+16)
XJSON-12 12.4ms ± 0% 12.4ms ± 0% +0.41% (p=0.000 n=18+18)
XHTTP-12 11.8µs ± 0% 11.8µs ± 1% ~ (p=0.492 n=20+18)
It reduces the tail latency of the x/benchmarks HTTP benchmark:
name old p50-time new p50-time delta
XHTTP-12 489µs ± 0% 491µs ± 1% +0.54% (p=0.000 n=20+18)
name old p95-time new p95-time delta
XHTTP-12 957µs ± 1% 960µs ± 1% +0.28% (p=0.002 n=20+17)
name old p99-time new p99-time delta
XHTTP-12 1.76ms ± 1% 1.64ms ± 1% -7.20% (p=0.000 n=20+18)
Comparing to the beginning of the hybrid barrier implementation
("runtime: parallelize STW mcache flushing") shows that the hybrid
barrier trades a small performance impact for much better STW latency,
as expected. The magnitude of the performance impact is generally
small:
name old time/op new time/op delta
BinaryTree17-12 2.37s ± 1% 2.42s ± 1% +2.04% (p=0.000 n=19+18)
Fannkuch11-12 2.84s ± 0% 2.72s ± 0% -4.00% (p=0.000 n=19+19)
FmtFprintfEmpty-12 44.2ns ± 1% 45.2ns ± 1% +2.20% (p=0.000 n=17+19)
FmtFprintfString-12 130ns ± 1% 134ns ± 0% +2.94% (p=0.000 n=18+16)
FmtFprintfInt-12 114ns ± 1% 117ns ± 0% +3.01% (p=0.000 n=19+15)
FmtFprintfIntInt-12 176ns ± 1% 182ns ± 0% +3.17% (p=0.000 n=20+15)
FmtFprintfPrefixedInt-12 186ns ± 1% 187ns ± 1% +1.04% (p=0.000 n=20+19)
FmtFprintfFloat-12 251ns ± 1% 250ns ± 1% -0.74% (p=0.000 n=17+18)
FmtManyArgs-12 746ns ± 1% 761ns ± 0% +2.08% (p=0.000 n=19+20)
GobDecode-12 6.57ms ± 1% 6.65ms ± 1% +1.11% (p=0.000 n=19+20)
GobEncode-12 5.59ms ± 1% 5.65ms ± 0% +1.08% (p=0.000 n=17+17)
Gzip-12 223ms ± 1% 223ms ± 1% -0.31% (p=0.006 n=20+20)
Gunzip-12 38.0ms ± 0% 37.9ms ± 1% -0.25% (p=0.009 n=19+20)
HTTPClientServer-12 77.5µs ± 1% 78.9µs ± 2% +1.89% (p=0.000 n=20+20)
JSONEncode-12 14.7ms ± 1% 14.9ms ± 0% +0.75% (p=0.000 n=20+20)
JSONDecode-12 53.0ms ± 1% 55.9ms ± 1% +5.54% (p=0.000 n=19+19)
Mandelbrot200-12 3.81ms ± 0% 3.81ms ± 1% +0.20% (p=0.023 n=17+19)
GoParse-12 3.17ms ± 1% 3.18ms ± 1% ~ (p=0.057 n=20+19)
RegexpMatchEasy0_32-12 71.7ns ± 1% 70.4ns ± 1% -1.77% (p=0.000 n=19+20)
RegexpMatchEasy0_1K-12 946ns ± 0% 946ns ± 0% ~ (p=0.405 n=18+18)
RegexpMatchEasy1_32-12 67.2ns ± 2% 67.3ns ± 2% ~ (p=0.732 n=20+20)
RegexpMatchEasy1_1K-12 374ns ± 1% 378ns ± 1% +1.14% (p=0.000 n=18+19)
RegexpMatchMedium_32-12 107ns ± 1% 107ns ± 1% ~ (p=0.259 n=18+20)
RegexpMatchMedium_1K-12 34.2µs ± 1% 34.5µs ± 1% +1.03% (p=0.000 n=18+18)
RegexpMatchHard_32-12 1.77µs ± 1% 1.79µs ± 1% +0.73% (p=0.000 n=19+18)
RegexpMatchHard_1K-12 53.6µs ± 1% 54.2µs ± 1% +1.10% (p=0.000 n=19+19)
Template-12 61.5ms ± 1% 63.9ms ± 0% +3.96% (p=0.000 n=18+18)
TimeParse-12 303ns ± 1% 300ns ± 1% -1.08% (p=0.000 n=19+20)
TimeFormat-12 318ns ± 1% 320ns ± 0% +0.79% (p=0.000 n=19+19)
Revcomp-12 (*) 509ms ± 3% 504ms ± 0% ~ (p=0.967 n=7+12)
[Geo mean] 54.3µs 54.8µs +0.88%
(*) Revcomp is highly non-linear, so I only took samples with 2
iterations.
name old time/op new time/op delta
XGarbage-12 2.25ms ± 0% 2.32ms ± 1% +2.74% (p=0.000 n=16+16)
XJSON-12 11.6ms ± 0% 12.4ms ± 0% +6.81% (p=0.000 n=18+18)
XHTTP-12 11.6µs ± 1% 11.8µs ± 1% +1.62% (p=0.000 n=17+18)
Updates #17503.
Updates #17099, since you can't have a rescan list bug if there's no
rescan list. I'm not marking it as fixed, since gcrescanstacks can
still be set to re-enable the rescan lists.
Change-Id: I6e926b4c2dbd4cd56721869d4f817bdbb330b851
Reviewed-on: https://go-review.googlesource.com/31766
Reviewed-by: Rick Hudson <rlh@golang.org>
This implements the unconditional version of the hybrid deletion write
barrier, which always shades both the old and new pointer. It's
unconditional for now because barriers on channel operations require
checking both the source and destination stacks and we don't have a
way to funnel this information into the write barrier at the moment.
As part of this change, we modify the typed memclr operations
introduced earlier to invoke the write barrier.
This has basically no overall effect on benchmark performance. This is
good, since it indicates that neither the extra shade nor the new bulk
clear barriers have much effect. It also has little effect on latency.
This is expected, since we haven't yet modified mark termination to
take advantage of the hybrid barrier.
Updates #17503.
Change-Id: Iebedf84af2f0e857bd5d3a2d525f760b5cf7224b
Reviewed-on: https://go-review.googlesource.com/31765
Reviewed-by: Rick Hudson <rlh@golang.org>
With the hybrid barrier, unless we're doing a STW GC or hit a very
rare race (~once per all.bash) that can start mark termination before
all of the work is drained, we don't need to drain the work queue at
all. Even draining an empty work queue is rather expensive since we
have to enter the getfull() barrier, so it's worth avoiding this.
Conveniently, it's quite easy to detect whether or not we actually
need the getufull() barrier: since the world is stopped when we enter
mark termination, everything must have flushed its work to the work
queue, so we can just check the queue. If the queue is empty and we
haven't queued up any jobs that may create more work (which should
always be the case with the hybrid barrier), we can simply have all GC
workers perform non-blocking drains.
Also conveniently, this solution is quite safe. If we do somehow screw
something up and there's work on the work queue, some worker will
still process it, it just may not happen in parallel.
This is not the "right" solution, but it's simple, expedient,
low-risk, and maintains compatibility with debug.gcrescanstacks. When
we remove the gcrescanstacks fallback in Go 1.9, we should also fix
the race that starts mark termination early, and then we can eliminate
work draining from mark termination.
Updates #17503.
Change-Id: I7b3cd5de6a248ab29d78c2b42aed8b7443641361
Reviewed-on: https://go-review.googlesource.com/32186
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently bulkBarrierPreWrite calls writebarrierptr_prewrite, but this
means that we check writeBarrier.needed twice and perform cgo checks
twice.
Change bulkBarrierPreWrite to call writebarrierptr_prewrite1 to skip
over these duplicate checks.
This may speed up bulkBarrierPreWrite slightly, but mostly this will
save us from running out of nosplit stack space on ppc64x in the near
future.
Updates #17503.
Change-Id: I1cea1a2207e884ab1a279c6a5e378dcdc048b63e
Reviewed-on: https://go-review.googlesource.com/31890
Reviewed-by: Rick Hudson <rlh@golang.org>
gobuf.ctxt is set to nil from many places in assembly code and these
assignments require write barriers with the hybrid barrier.
Conveniently, in most of these places ctxt should already be nil, in
which case we don't need the barrier. This commit changes these places
to assert that ctxt is already nil.
gogo is more complicated, since ctxt may not already be nil. For gogo,
we manually perform the write barrier if ctxt is not nil.
Updates #17503.
Change-Id: I9d75e27c75a1b7f8b715ad112fc5d45ffa856d30
Reviewed-on: https://go-review.googlesource.com/31764
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Currently, we perform write barriers after performing pointer writes.
At the moment, it simply doesn't matter what order this happens in, as
long as they appear atomic to GC. But both the hybrid barrier and ROC
are going to require a pre-write write barrier.
For the hybrid barrier, this is important because the barrier needs to
observe both the current value of the slot and the value that will be
written to it. (Alternatively, the caller could do the write and pass
in the old value, but it seems easier and more useful to just swap the
order of the barrier and the write.)
For ROC, this is necessary because, if the pointer write is going to
make the pointer reachable to some goroutine that it currently is not
visible to, the garbage collector must take some special action before
that pointer becomes more broadly visible.
This commits swaps pointer writes around so the write barrier occurs
before the pointer write.
The main subtlety here is bulk memory writes. Currently, these copy to
the destination first and then use the pointer bitmap of the
destination to find the copied pointers and invoke the write barrier.
This is necessary because the source may not have a pointer bitmap. To
handle these, we pass both the source and the destination to the bulk
memory barrier, which uses the pointer bitmap of the destination, but
reads the pointer values from the source.
Updates #17503.
Change-Id: I78ecc0c5c94ee81c29019c305b3d232069294a55
Reviewed-on: https://go-review.googlesource.com/31763
Reviewed-by: Rick Hudson <rlh@golang.org>
Apparently on macOS Sierra LLDB thinks /usr/lib/dyld is mapped
at address 0, even if Go code starts at 0x1000, and it looks up
addresses from dyld which shadows Go symbols. Move Go binary at
a higher address to avoid clash.
Fixes#17463. Re-enable TestLldbPython.
Change-Id: I89ca6f3ee48aa6da9862bfa0c2da91477cc93255
Reviewed-on: https://go-review.googlesource.com/32185
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Quentin Smith <quentin@golang.org>
As for dropg, save is writing a nil pointer that will generate a write
barrier with the hybrid barrier. However, in this case, ctxt always
should already be nil, so replace the write with an assertion that
this is the case.
At this point, we're ready to disable the write barrier elision
optimizations that interfere with the hybrid barrier.
Updates #17503.
Change-Id: I83208e65aa33403d442401f355b2e013ab9a50e9
Reviewed-on: https://go-review.googlesource.com/31571
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently this contains no write barriers because it's writing nil
pointers, but with the hybrid barrier, even these will produce write
barriers. However, since these are *gs and *ms, they don't need write
barriers, so we can simply eliminate them.
Updates #17503.
Change-Id: Ib188a60492c5cfb352814bf9b2bcb2941fb7d6c0
Reviewed-on: https://go-review.googlesource.com/31570
Reviewed-by: Rick Hudson <rlh@golang.org>
The hybrid barrier requires allocate-black, but there's one case where
we don't currently allocate black: the tiny allocator. If we allocate
a *new* tiny alloc block during GC, it will be allocated black, but if
we allocated the current block before GC, it won't be black, and the
further allocations from it won't mark it, which means we may free a
reachable tiny block during sweeping.
Fix this by passing over all mcaches at the beginning of mark, while
the world is still stopped, and greying their tiny blocks.
Updates #17503.
Change-Id: I04d4df7cc2f553f8f7b1e4cb0b52e2946588111a
Reviewed-on: https://go-review.googlesource.com/31456
Reviewed-by: Rick Hudson <rlh@golang.org>
The hybrid barrier requires barriers on stack-to-stack copies if
either stack is grey. There are only two instances of this in the
runtime: channel sends and starting a goroutine. Channel sends already
use typedmemmove and hence have the necessary barriers. This commits
adds barriers for the stack-to-stack copy when starting a goroutine.
Updates #17503.
Change-Id: Ibb55e08127ca4d021ac54be61cb96732efa5df5b
Reviewed-on: https://go-review.googlesource.com/31455
Reviewed-by: Rick Hudson <rlh@golang.org>
Original Change by Daria Kolistratova <daria.kolistratova@intel.com>
Added functions with suffix proto and stuff from pprof tool to translate
to protobuf. Done as the profile proto is more extensible than the legacy
pprof format and is pprof's preferred profile format. Large part was taken
from https://github.com/google/pprof tool. Tested by hand and compared the
result with translated by pprof tool, profiles are identical.
Fixes#16093
Change-Id: I2751345b09a66ee2b6aa64be76cba4cd1c326aa6
Reviewed-on: https://go-review.googlesource.com/32257
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
Waiting 2ms for all the kicked-off goroutines to run and block
seems a little optimistic. No harm done by waiting for 200ms instead.
Fixes#17238.
Change-Id: I827532ea2f5f1f3ed04179f8957dd2c563946ed0
Reviewed-on: https://go-review.googlesource.com/32103
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently we initialize LR on a new stack by writing nil to it. But
this is an initializing write since the newly allocated stack is not
zeroed, so this is unsafe with the hybrid barrier. Change this is a
uintptr write to avoid a bad write barrier.
Updates #17503.
Change-Id: I062ac352e35df7da4644c1f2a5aaab87049d1f60
Reviewed-on: https://go-review.googlesource.com/32093
Reviewed-by: Rick Hudson <rlh@golang.org>
We reuse finalizers in finblocks, which are allocated off-heap. This
means they have to be zero-initialized before becoming visible to the
garbage collector. We actually already do this by clearing the
finalizer before returning it to the pool, but we're not careful to
enforce correct memory ordering. Fix this by manipulating the
finalizer count atomically so these writes synchronize properly with
the garbage collector.
Updates #17503.
Change-Id: I7797d31df3c656c9fe654bc6da287f66a9e2037d
Reviewed-on: https://go-review.googlesource.com/31454
Reviewed-by: Rick Hudson <rlh@golang.org>
runfinq allocates a stack frame on the heap for constructing the
finalizer function calls and reuses it for each call. However, because
the type of this frame is constantly shifting, it tells mallocgc there
are no pointers in it and it acts essentially like uninitialized
memory between uses. But runfinq uses pointer writes with write
barriers to "initialize" this memory, which is not going to be safe
with the hybrid barrier, since the hybrid barrier may see a stale
pointer left in the "uninitialized" frame.
Fix this by zero-initializing the argument values in the frame before
writing the argument pointers.
Updates #17503.
Change-Id: I951c0a2be427eb9082a32d65c4410e6fdef041be
Reviewed-on: https://go-review.googlesource.com/31453
Reviewed-by: Rick Hudson <rlh@golang.org>
Since barrier-less memclr is only safe in very narrow circumstances,
this commit renames memclr to avoid accidentally calling memclr on
typed memory. This can cause subtle, non-deterministic bugs, so it's
worth some effort to prevent. In the near term, this will also prevent
bugs creeping in from any concurrent CLs that add calls to memclr; if
this happens, whichever patch hits master second will fail to compile.
This also adds the other new memclr variants to the compiler's
builtin.go to minimize the churn on that binary blob. We'll use these
in future commits.
Updates #17503.
Change-Id: I00eead049f5bd35ca107ea525966831f3d1ed9ca
Reviewed-on: https://go-review.googlesource.com/31369
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently fixalloc does not zero memory it reuses. This is dangerous
with the hybrid barrier if the type may contain heap pointers, since
it may cause us to observe a dead heap pointer on reuse. It's also
error-prone since it's the only allocator that doesn't zero on
allocation (mallocgc of course zeroes, but so do persistentalloc and
sysAlloc). It's also largely pointless: for mcache, the caller
immediately memclrs the allocation; and the two specials types are
tiny so there's no real cost to zeroing them.
Change fixalloc to zero allocations by default.
The only type we don't zero by default is mspan. This actually
requires that the spsn's sweepgen survive across freeing and
reallocating a span. If we were to zero it, the following race would
be possible:
1. The current sweepgen is 2. Span s is on the unswept list.
2. Direct sweeping sweeps span s, finds it's all free, and releases s
to the fixalloc.
3. Thread 1 allocates s from fixalloc. Suppose this zeros s, including
s.sweepgen.
4. Thread 1 calls s.init, which sets s.state to _MSpanDead.
5. On thread 2, background sweeping comes across span s in allspans
and cas's s.sweepgen from 0 (sg-2) to 1 (sg-1). Now it thinks it
owns it for sweeping. 6. Thread 1 continues initializing s.
Everything breaks.
I would like to fix this because it's obviously confusing, but it's a
subtle enough problem that I'm leaving it alone for now. The solution
may be to skip sweepgen 0, but then we have to think about wrap-around
much more carefully.
Updates #17503.
Change-Id: Ie08691feed3abbb06a31381b94beb0a2e36a0613
Reviewed-on: https://go-review.googlesource.com/31368
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently the zero value of an mspan is in the "in use" state. This
seems like a bad idea in general. But it's going to wreak havoc when
we make fixalloc zero allocations: even "freed" mspan objects are
still on the allspans list and still get looked at by the garbage
collector. Hence, if we leave the mspan states the way they are,
allocating a span that reuses old memory will temporarily pass that
span (which is visible to GC!) through the "in use" state, which can
cause "unswept span" panics.
Fix all of this by making the zero state "dead".
Updates #17503.
Change-Id: I77c7ac06e297af4b9e6258bc091c37abe102acc3
Reviewed-on: https://go-review.googlesource.com/31367
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The hybrid barrier requires distinguishing typed and untyped memory
even when zeroing because the *current* contents of the memory matters
even when overwriting.
This commit introduces runtime.typedmemclr and runtime.memclrHasPointers
as a typed memory clearing functions parallel to runtime.typedmemmove.
Currently these simply call memclr, but with the hybrid barrier we'll
need to shade any pointers we're overwriting. These will provide us
with the necessary hooks to do so.
Updates #17503.
Change-Id: I74478619f8907825898092aaa204d6e4690f27e6
Reviewed-on: https://go-review.googlesource.com/31366
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently all mcaches are flushed in a single STW root job. This takes
about 5 µs per P, but since it's done sequentially it adds about
5*GOMAXPROCS µs to the STW.
Fix this by parallelizing the job. Since there are exactly GOMAXPROCS
mcaches to flush, this parallelizes quite nicely and brings the STW
latency cost down to a constant 5 µs (assuming GOMAXPROCS actually
reflects the number of CPUs).
Updates #17503.
Change-Id: Ibefeb1c2229975d5137c6e67fac3b6c92103742d
Reviewed-on: https://go-review.googlesource.com/32033
Reviewed-by: Rick Hudson <rlh@golang.org>
Added functions with suffix proto and stuff from pprof tool to translate
to protobuf. Done as the profile proto is more extensible than the legacy
pprof format and is pprof's preferred profile format. Large part was taken
from https://github.com/google/pprof tool. Tested by hand and compared the
result with translated by pprof tool, profiles are identical.
Fixes#16093
Change-Id: I5acdb2809cab0d16ed4694fdaa7b8ddfd68df11e
Reviewed-on: https://go-review.googlesource.com/30556
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
The current logic in gcDrain conflates non-blocking with preemptible
draining for root jobs. As a result, if you do a non-blocking (but
*not* preemptible) drain, like dedicated workers do, the root job
drain will stop if preempted and fall through to heap marking jobs,
which won't stop until it fails to get a heap marking job.
This commit fixes the condition on root marking jobs so they only stop
when preempted if the drain is preemptible.
Coincidentally, this also fixes a nil pointer dereference if we call
gcDrain with gcDrainNoBlock and without a user G, since it tries to
get the preempt flag from the nil user G. This combination never
happens right now, but will in the future.
Change-Id: Ia910ec20a9b46237f7926969144a33b1b4a7b2f9
Reviewed-on: https://go-review.googlesource.com/32291
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
This adds support to the runtime/trace test for saving traces
collected by its tests to disk and a script in internal/trace that
uses this to collect canned traces for the trace test suite. This can
be used to add to the test suite when we introduce a new trace format
version.
Change-Id: Id9ac1ff312235bf02f982fdfff8a827f54035758
Reviewed-on: https://go-review.googlesource.com/32290
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Currently when a goroutine blocks on a GC assist, it emits a generic
EvGoBlock event. Since assist blocking events and, in particular, the
length of the blocked assist queue, are important for diagnosing GC
behavior, this commit adds a new EvGoBlockGC event for blocking on a
GC assist. The trace viewer uses this event to report a "waiting on
GC" count in the "Goroutines" row. This makes sense because, unlike
other blocked goroutines, these goroutines do have work to do, so
being blocked on a GC assist is quite similar to being in the
"runnable" state, which we also report in the trace viewer.
Change-Id: Ic21a326992606b121ea3d3d00110d8d1fdc7a5ef
Reviewed-on: https://go-review.googlesource.com/30704
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Currently mark workers are shown in the trace as regular goroutines
labeled "runtime.gcBgMarkWorker". That's somewhat unhelpful to an end
user because of the opaque label and particularly unhelpful to runtime
developers because it doesn't distinguish the different types of mark
workers.
Fix this by introducing a variant of the GoStart event called
GoStartLabel that lets the runtime indicate a label for a goroutine
execution span and using this to label mark worker executions as "GC
(<mode>)" in the trace viewer.
Since this bumps the trace version to 1.8, we also add test data for
1.7 traces.
Change-Id: Id7b9c0536508430c661ffb9e40e436f3901ca121
Reviewed-on: https://go-review.googlesource.com/30702
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Currently, gcDrain looks for the preemption flag at getg().preempt.
However, commit d6625ca moved mark worker draining to the system
stack, which means getg() returns the g0, which never has the preempt
flag set, so idle and fractional workers don't get preempted after
10ms and just run until they run out of work. As a result, if there's
enough idle time, GC becomes effectively STW.
Fix this by looking for the preemption flag on getg().m.curg, which
will always be the user G (where the preempt flag is set), regardless
of whether gcDrain is running on the user or the g0 stack.
Change-Id: Ib554cf49a705b86ccc3d08940bc869f868c50dd2
Reviewed-on: https://go-review.googlesource.com/32251
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
runtime.SetMutexProfileFraction(n int) will capture 1/n-th of stack
traces of goroutines holding contended mutexes if n > 0. From runtime/pprof,
pprot.Lookup("mutex").WriteTo writes the accumulated
stack traces to w (in essentially the same format that blocking
profiling uses).
Change-Id: Ie0b54fa4226853d99aa42c14cb529ae586a8335a
Reviewed-on: https://go-review.googlesource.com/29650
Reviewed-by: Austin Clements <austin@google.com>
- removes the runtime function stringtoslicebytetmp
- removes the generation of calls to stringtoslicebytetmp from the frontend
- adds handling of OSTRARRAYBYTETMP in the backend
This reduces binary sizes and avoids function call overhead.
Change-Id: Ib9988d48549cee663b685b4897a483f94727b940
Reviewed-on: https://go-review.googlesource.com/32158
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
I do not know why it is included. All tests pass without it.
Change-Id: I839076ee131816dfd177570a902c69fe8fba5022
Reviewed-on: https://go-review.googlesource.com/32144
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Otherwise, the way the ELF dynamic linker works means that you can end up with
the same itab being passed to additab twice, leading to the itab linked list
having a cycle in it. Add a test to additab in runtime to catch this when it
happens, not some arbitrary and surprsing time later.
Fixes#17594
Change-Id: I6c82edcc9ac88ac188d1185370242dc92f46b1ad
Reviewed-on: https://go-review.googlesource.com/32131
Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Assembly copied from the clock_gettime(CLOCK_MONOTONIC)
call in runtime.nanotime in these files and then modified to use
CLOCK_REALTIME.
Also comment system call numbers in a few other files.
Fixes#11222.
Change-Id: Ie132086de7386f865908183aac2713f90fc73e0d
Reviewed-on: https://go-review.googlesource.com/32177
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Don't include package path when creating LSyms for auto and param
variables during Prog generation, and update the DWARF emit routine
accordingly (remove the code that chops off package path from names in
DWARF var location expressions). Implementation suggested by mdempsky@.
The intent of this change is to have saner location expressions in cases
where the variable corresponds to a structure field. For example, the
SSA compiler's "decompose" phase can take a slice value and break it
apart into three scalar variables corresponding to the fields (slice "X"
gets split into "X.len", "X.cap", "X.ptr"). In such cases we want the
name in the location expression to omit the package path but preserve
the original variable name (e.g. "X").
Fixes#16338
Change-Id: Ibc444e7f3454b70fc500a33f0397e669d127daa1
Reviewed-on: https://go-review.googlesource.com/31819
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Currently, markroot delays scanning mark worker stacks until mark
termination by putting the mark worker G directly on the rescan list
when it encounters one during the mark phase. Without this, since mark
workers are non-preemptible, two mark workers that attempt to scan
each other's stacks can deadlock.
However, this is annoyingly asymmetric and causes some real problems.
First, markroot does not own the G at that point, so it's not
technically safe to add it to the rescan list. I haven't been able to
find a specific problem this could cause, but I suspect it's the root
cause of issue #17099. Second, this will interfere with the hybrid
barrier, since there is no stack rescanning during mark termination
with the hybrid barrier.
This commit switches to a different approach. We move the mark
worker's call to gcDrain to the system stack and set the mark worker's
status to _Gwaiting for the duration of the drain to indicate that
it's preemptible. This lets another mark worker scan its G stack while
the drain is running on the system stack. We don't return to the G
stack until we can switch back to _Grunning, which ensures we don't
race with a stack scan. This lets us eliminate the special case for
mark worker stack scans and scan them just like any other goroutine.
The only subtlety to this approach is that we have to disable stack
shrinking for mark workers; they could be referring to captured
variables from the G stack, so it's not safe to move their stacks.
Updates #17099 and #17503.
Change-Id: Ia5213949ec470af63e24dfce01df357c12adbbea
Reviewed-on: https://go-review.googlesource.com/31820
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, if the number of stack barriers for a stack is 0, we'll
create a zero-length slice that points just past the end of the stack
allocation. This bad pointer causes GC panics.
Fix this by creating a nil slice if the stack barrier count is 0.
In practice, the only way this can happen is if
GODEBUG=gcstackbarrieroff=1 is set because even the minimum size stack
reserves space for two stack barriers.
Change-Id: I3527c9a504c445b64b81170ee285a28594e7983d
Reviewed-on: https://go-review.googlesource.com/31762
Reviewed-by: Rick Hudson <rlh@golang.org>
This adds debug code enabled in gccheckmark mode that panics if we
attempt to mark an unallocated object. This is a common issue with the
hybrid barrier when we're manipulating uninitialized memory that
contains stale pointers. This also tends to catch bugs that will lead
to "sweep increased allocation count" crashes closer to the source of
the bug.
Change-Id: I443ead3eac6f316a46f50b106078b524cac317f4
Reviewed-on: https://go-review.googlesource.com/31761
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently reflectcall has a subtle dance with write barriers where the
assembly code copies the result values from the stack to the in-heap
argument frame without write barriers and then calls into the runtime
after the fact to invoke the necessary write barriers.
For the hybrid barrier (and for ROC), we need to switch to a
*pre*-write write barrier, which is very difficult to do with the
current setup. We could tie ourselves in knots of subtle reasoning
about why it's okay in this particular case to have a post-write write
barrier, but this commit instead takes a different approach. Rather
than making things more complex, this simplifies reflection calls so
that the argument copy is done in Go using normal bulk write barriers.
The one difficulty with this approach is that calling into Go requires
putting arguments on the stack, but the call* functions "donate" their
entire stack frame to the called function. We can get away with this
now because the copy avoids using the stack and has copied the results
out before we clobber the stack frame to call into the write barrier.
The solution in this CL is to call another function, passing arguments
in registers instead of on the stack, and let that other function
reserve more stack space and setup the arguments for the runtime.
This approach seemed to work out the best. I also tried making the
call* functions reserve 32 extra bytes of frame for the write barrier
arguments and adjust SP up by 32 bytes around the call. However, even
with the necessary changes to the assembler to correct the spdelta
table, the runtime was still having trouble with the frame layout (and
the changes to the assembler caused many other things that do strange
things with the SP to fail to assemble). The approach I took doesn't
require any funny business with the SP.
Updates #17503.
Change-Id: Ie2bb0084b24d6cff38b5afb218b9e0534ad2119e
Reviewed-on: https://go-review.googlesource.com/31655
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Now that sweeping and span marking use the sweep list, there's no need
for the work.spans snapshot of the allspans list. This change
eliminates the few remaining uses of it, which are either dead code or
can use allspans directly, and removes work.spans and its support
functions.
Change-Id: Id5388b42b1e68e8baee853d8eafb8bb4ff95bb43
Reviewed-on: https://go-review.googlesource.com/30537
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently markrootSpans iterates over all spans ever allocated to find
the in-use spans. Since we now have a list of in-use spans, change it
to iterate over that instead.
This, combined with the previous change, fixes#9265. Before these two
changes, blowing up the heap to 8GB and then shrinking it to a 0MB
live set caused the small-heap portion of the test to run 60x slower
than without the initial blowup. With these two changes, the time is
indistinguishable.
No significant effect on other benchmarks.
Change-Id: I4a27e533efecfb5d18cba3a87c0181a81d0ddc1e
Reviewed-on: https://go-review.googlesource.com/30536
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently sweeping walks the list of all spans, which means the work
in sweeping is proportional to the maximum number of spans ever used.
If the heap was once large but is now small, this causes an
amortization failure: on a small heap, GCs happen frequently, but a
full sweep still has to happen in each GC cycle, which means we spent
a lot of time in sweeping.
Fix this by creating a separate list consisting of just the in-use
spans to be swept, so sweeping is proportional to the number of in-use
spans (which is proportional to the live heap). Specifically, we
create two lists: a list of unswept in-use spans and a list of swept
in-use spans. At the start of the sweep cycle, the swept list becomes
the unswept list and the new swept list is empty. Allocating a new
in-use span adds it to the swept list. Sweeping moves spans from the
unswept list to the swept list.
This fixes the amortization problem because a shrinking heap moves
spans off the unswept list without adding them to the swept list,
reducing the time required by the next sweep cycle.
Updates #9265. This fix eliminates almost all of the time spent in
sweepone; however, markrootSpans has essentially the same bug, so now
the test program from this issue spends all of its time in
markrootSpans.
No significant effect on other benchmarks.
Change-Id: Ib382e82790aad907da1c127e62b3ab45d7a4ac1e
Reviewed-on: https://go-review.googlesource.com/30535
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently we set the len and cap of h.spans to the full reserved
region of the address space and track the actual mapped region
separately in h.spans_mapped. Since we have both the len and cap at
our disposal, change things so len(h.spans) tracks how much of the
spans array is mapped and eliminate h.spans_mapped. This simplifies
mheap and means we'll get nice "index out of bounds" exceptions if we
do try to go off the end of the spans rather than a SIGSEGV.
Change-Id: I8ed9a1a9a844d90e9fd2e269add4704623dbdfe6
Reviewed-on: https://go-review.googlesource.com/30533
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Like h_allspans and mheap_.allspans, these were two ways of referring
to the spans array from when the runtime was split between C and Go.
Clean this up by making mheap_.spans a slice and eliminating h_spans.
Change-Id: I3aa7038d53c3a4252050aa33e468c48dfed0b70e
Reviewed-on: https://go-review.googlesource.com/30532
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This was necessary in the C days when allspans was an mspan**, but now
that allspans is a Go slice, this is redundant with len(allspans) and
we can use range loops over allspans.
Change-Id: Ie1dc39611e574e29a896e01690582933f4c5be7e
Reviewed-on: https://go-review.googlesource.com/30531
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
These are two ways to refer to the allspans array that hark back to
when the runtime was split between C and Go. Clean this up by making
mheap_.allspans a slice and eliminating h_allspans.
Change-Id: Ic9360d040cf3eb590b5dfbab0b82e8ace8525610
Reviewed-on: https://go-review.googlesource.com/30530
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
These are emulated by the assembler and we don't need them.
Change-Id: I2b07c5315a5b642fdb5e50b468453260ae121164
Reviewed-on: https://go-review.googlesource.com/31758
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Looking at the kernel sources, I don't see how this is possible.
But obviously it is. Just try again.
Fixes#17161.
Change-Id: Iea7d53f7cf75944792d2f75a0d07129831c7bcdb
Reviewed-on: https://go-review.googlesource.com/31823
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Makes windows same as others.
Change-Id: Ib4651e06d0bd37473ac345d36c91f39aa8f5e662
Reviewed-on: https://go-review.googlesource.com/31791
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently mspan.isFree technically returns whether the object was not
allocated *during this cycle*. Fix it so it actually returns whether
or not the object is allocated so the method is more generally useful
(especially for debugging).
It has one caller, which is carefully written to be insensitive to
this distinction, but this lets us simplify this caller.
Change-Id: I9d79cf784a56015e434961733093c1d8d03fc091
Reviewed-on: https://go-review.googlesource.com/30145
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
morestack writes the context pointer to gobuf.ctxt, but since
morestack is written in assembly (and has to be very careful with
state), it does *not* invoke the requisite write barrier for this
write. Instead, we patch this up later, in newstack, where we invoke
an explicit write barrier for ctxt.
This already requires some subtle reasoning, and it's going to get a
lot hairier with the hybrid barrier.
Fix this by simplifying the whole mechanism. Instead of writing
gobuf.ctxt in morestack, just pass the value of the context register
to newstack and let it write it to gobuf.ctxt. This is a normal Go
pointer write, so it gets the normal Go write barrier. No subtle
reasoning required.
Updates #17503.
Change-Id: Ia6bf8459bfefc6828f53682ade32c02412e4db63
Reviewed-on: https://go-review.googlesource.com/31550
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This commit fixes two bizarrely related bugs:
1. The signatures for the call* functions were wrong, indicating that
they had only two pointer arguments instead of three. We didn't notice
because the call* functions are defined by a macro expansion, which go
vet doesn't see.
2. deferArgs on a defer object with a zero-sized frame returned a
pointer just past the end of the allocated object, which is illegal in
Go (and can cause the "sweep increased allocation count" crashes).
In a fascinating twist, these two bugs canceled each other out, which
is why I'm fixing them together. The pointer returned by deferArgs is
used in only two ways: as an argument to memmove and as an argument to
reflectcall. memmove is NOSPLIT, so the argument was unobservable.
reflectcall immediately tail calls one of the call* functions, which
are not NOSPLIT, but the deferArgs pointer just happened to be the
third argument that was accidentally marked as a scalar. Hence, when
the garbage collector scanned the stack, it didn't see the bad
pointer as a pointer.
I believe this was all ultimately benign. In principle, stack growth
during the reflectcall could fail to update the args pointer, but it
never points to the stack, so it never needs to be updated. Also in
principle, the garbage collector could fail to mark the args object
because of the incorrect call* signatures, but in all calls to
reflectcall (including the ones spelled "call" in the reflect package)
the args object is kept live by the calling stack.
Change-Id: Ic932c79d5f4382be23118fdd9dba9688e9169e28
Reviewed-on: https://go-review.googlesource.com/31654
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
trace's reader *g is going to cause write barriers in unfortunate
places, so replace it with a guintptr.
Change-Id: Ie8fb13bb89a78238f9d2a77ec77da703e96df8af
Reviewed-on: https://go-review.googlesource.com/31469
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Fixes#16076
Change-Id: I91fa87b642592ee4604537dd8c3197cd61ec8b31
Reviewed-on: https://go-review.googlesource.com/31516
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This is a more robust method for obtaining the availability of vx.
Since this variable may be checked frequently I've also now
padded it so that it will be in its own cache line.
I've kept the other check (in hash/crc32) the same for now until
I can figure out the best way to update it.
Updates #15403.
Change-Id: I74eed651afc6f6a9c5fa3b88fa6a2b0c9ecf5875
Reviewed-on: https://go-review.googlesource.com/31149
Reviewed-by: Austin Clements <austin@google.com>
oneNewExtraM creates a spare M and G for use with cgo callbacks. The G
doesn't run right away, but goes directly into syscall status. For the
garbage collector, it's marked as "scan valid" and not on the rescan
list, but I forgot to also mark it as "scan done". As a result,
gcMarkRootCheck thinks that the goroutine hasn't been scanned and
panics.
This only affects GODEBUG=gccheckmark=1 mode, since we otherwise skip
the gcMarkRootCheck.
Fixes#17473.
Change-Id: I94f5671c42eb44bd5ea7dc68fbf85f0c19e2e52c
Reviewed-on: https://go-review.googlesource.com/31139
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Updating the heap profile stats is one of the most expensive parts of
mark termination other than stack rescanning, but there's really no
need to do this with the world stopped. Move it to right after we've
started the world back up. This creates a *very* small window where
allocations from the next cycle can slip into the profile, but the
exact point where mark termination happens is so non-deterministic
already that a slight reordering here is unimportant.
Change-Id: I2f76f22c70329923ad6a594a2c26869f0736d34e
Reviewed-on: https://go-review.googlesource.com/31363
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The only reason these flushes are still necessary at all is that
gcmarknewobject doesn't flush its gcWork stats like it's supposed to.
By changing gcmarknewobject to follow the standard protocol, the
flushes become completely unnecessary because mark 2 ensures caches
are flushed (and stay flushed) before we ever enter mark termination.
In the garbage benchmark, this takes roughly 50 µs, which is
surprisingly long for doing nothing. We still double-check after
draining that they are in fact empty.
Change-Id: Ia1c7cf98a53f72baa513792eb33eca6a0b4a7128
Reviewed-on: https://go-review.googlesource.com/31134
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The pointer checking code needs to know the exact type of the parameter
expected by the C function, so that it can use a type assertion to
convert the empty interface returned by cgoCheckPointer to the correct
type. Previously this was done by using a type conversion, but that
meant that the code accepted arguments that were convertible to the
parameter type, rather than arguments that were assignable as in a
normal function call. In other words, some code that should not have
passed type checking was accepted.
This CL changes cgo to always use a function literal for pointer
checking. Now the argument is passed to the function literal, which has
the correct argument type, so type checking is performed just as for a
function call as it should be.
Since we now always use a function literal, simplify the checking code
to run as a statement by itself. It now no longer needs to return a
value, and we no longer need a type assertion.
This does have the cost of introducing another function call into any
call to a C function that requires pointer checking, but the cost of the
additional call should be minimal compared to the cost of pointer
checking.
Fixes#16591.
Change-Id: I220165564cf69db9fd5f746532d7f977a5b2c989
Reviewed-on: https://go-review.googlesource.com/31233
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
The panic leaves the lock in an unusable state.
Trying to panic with a usable state makes the lock significantly
less efficient and scalable (see early CL patch sets and discussion).
Instead, use runtime.throw, which will crash the program directly.
In general throw is reserved for when the runtime detects truly
serious, unrecoverable problems. This problem is certainly serious,
and, without a significant performance hit, is unrecoverable.
Fixes#13879.
Change-Id: I41920d9e2317270c6f909957d195bd8b68177f8d
Reviewed-on: https://go-review.googlesource.com/31359
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Correcting a line in asm_ppc64x.s in the cmpbodyLE function
that originally was R14 but accidentally changed to R4.
Fixes#17488
Change-Id: Id4ca6fb2e0cd81251557a0627e17b5e734c39e01
Reviewed-on: https://go-review.googlesource.com/31266
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Run-TryBot: Michael Munday <munday@ca.ibm.com>
GC assists retry if preempted or if they fail to park. However, on the
retry path they currently use stale statistics. In particular, the
retry can use "debtBytes", but debtBytes isn't updated when the debt
changes (since other than retries it is only used once). Also, though
less of a problem, the if the assist ratio has changed while the
assist was blocked, the retry will still use the old assist ratio.
Fix all of this by simply making the retry jump back to where we
compute these statistics, rather than just after.
Change-Id: I2ed8b4f0fc9f008ff060aa926f4334b662ac7d3f
Reviewed-on: https://go-review.googlesource.com/30701
Reviewed-by: Rick Hudson <rlh@golang.org>
This puts all of the assist queue-related code together and makes it
easier to modify how the assist queue works.
Change-Id: Id54e06702bdd5a5dd3fef2ce2c14cd7ca215303c
Reviewed-on: https://go-review.googlesource.com/30700
Reviewed-by: Rick Hudson <rlh@golang.org>
It's pretty simple. For:
e = (interface{})(i)
Do:
tmp = i.itab
if tmp != nil {
tmp = tmp.typ_ // load type from itab
}
e = eface{tmp, i.data}
It is smaller and faster than calling the runtime.
Change-Id: I0ad27f62f4ec0b6cd53bc8530e4da0eae3e67a6c
Reviewed-on: https://go-review.googlesource.com/31260
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
getArgInfo for reflect.makeFuncStub and reflect.methodValueCall is
necessarily special. These have dynamically determined argument maps
that are stored in their context (that is, their *funcval). These
functions are written to store this context at 0(SP) when called, and
getArgInfo retrieves it from there.
This technique works if getArgInfo is passed an active call frame for
one of these functions. However, getArgInfo is also used in
tracebackdefers, where the "call" is not a true call with an active
stack frame, but a deferred call. In this situation, getArgInfo
currently crashes because tracebackdefers passes a frame with sp set
to 0. However, the entire approach used by getArgInfo is flawed in
this situation because the wrapper has not actually executed, and
hence hasn't saved this metadata to any stack frame.
In the defer case, we know the *funcval from the _defer itself, so we
can fix this by teaching getArgInfo to use the *funcval context
directly when its available, and otherwise get it from the active call
frame.
While we're here, this commit simplifies getArgInfo a bit by making it
play more nicely with the type system. Rather than decoding the
*reflect.methodValue that is the wrapper's context as a *[2]uintptr,
just write out a copy of the reflect.methodValue type in the runtime.
Fixes#16331. Fixes#17471.
Change-Id: I81db4d985179b4a81c68c490cceeccbfc675456a
Reviewed-on: https://go-review.googlesource.com/31138
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
If morestack runs on the g0 or gsignal stack, it currently performs
some abort operation that typically produces a signal (e.g., it does
an INT $3 on x86). This is useful if you're running in a debugger, but
if you're not, the runtime tries to trap this signal, which is likely
to send the program into a deeper spiral of collapse and lead to very
confusing diagnostic output.
Help out people trying to debug without a debugger by making morestack
print an informative message before blowing up.
Change-Id: I2814c64509b137bfe20a00091d8551d18c2c4749
Reviewed-on: https://go-review.googlesource.com/31133
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This improves the performance for byte.Compare by rewriting
the cmpbody function in runtime/asm_ppc64x.s. The previous code
had a simple loop which loaded a pair of bytes and compared them,
which is inefficient for long buffers. The updated function checks
for 8 or 32 byte chunks and then loads and compares double words where
possible.
Because the byte.Compare result indicates greater or less than,
the doubleword loads must take endianness into account, using a
byte reversed load in the little endian case.
Fixes#17433
benchmark old ns/op new ns/op delta
BenchmarkBytesCompare/8-16 13.6 7.16 -47.35%
BenchmarkBytesCompare/16-16 25.7 7.83 -69.53%
BenchmarkBytesCompare/32-16 38.1 7.78 -79.58%
BenchmarkBytesCompare/64-16 63.0 10.6 -83.17%
BenchmarkBytesCompare/128-16 112 13.0 -88.39%
BenchmarkBytesCompare/256-16 211 28.1 -86.68%
BenchmarkBytesCompare/512-16 410 38.6 -90.59%
BenchmarkBytesCompare/1024-16 807 60.2 -92.54%
BenchmarkBytesCompare/2048-16 1601 103 -93.57%
Change-Id: I121acc74fcd27c430797647b8d682eb0607c63eb
Reviewed-on: https://go-review.googlesource.com/30949
Reviewed-by: David Chase <drchase@google.com>
Copies utf8 constants and EncodeRune implementation from unicode/utf8.
Adds a new decoderune implementation that is used by the compiler
in code generated for ranging over strings. It does not handle
ASCII runes since these are handled directly before calls to decoderune.
The DecodeRuneInString implementation from unicode/utf8 is not used
since it uses a lookup table that would increase the use of cpu caches.
Adds more tests that check decoding of valid and invalid utf8 sequences.
name old time/op new time/op delta
RuneIterate/range2/ASCII-4 7.45ns ± 2% 7.45ns ± 1% ~ (p=0.634 n=16+16)
RuneIterate/range2/Japanese-4 53.5ns ± 1% 49.2ns ± 2% -8.03% (p=0.000 n=20+20)
RuneIterate/range2/MixedLength-4 46.3ns ± 1% 41.0ns ± 2% -11.57% (p=0.000 n=20+20)
new:
"".decoderune t=1 size=423 args=0x28 locals=0x0
old:
"".charntorune t=1 size=666 args=0x28 locals=0x0
Change-Id: I1df1fdb385bb9ea5e5e71b8818ea2bf5ce62de52
Reviewed-on: https://go-review.googlesource.com/28490
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Currently we use go:nowritebarrier in many places in proc.go.
go:notinheap and go:yeswritebarrierrec now let us use
go:nowritebarrierrec (the recursive form of the go:nowritebarrier
pragma) more liberally. Do so in proc.go
Change-Id: Ia7fcbc12ce6c51cb24730bf835fb7634ad53462f
Reviewed-on: https://go-review.googlesource.com/30942
Reviewed-by: Rick Hudson <rlh@golang.org>
This covers basically all sysAlloc'd, persistentalloc'd, and
fixalloc'd types.
Change-Id: I0487c887c2a0ade5e33d4c4c12d837e97468e66b
Reviewed-on: https://go-review.googlesource.com/30941
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently mspan links to its previous mspan using a **mspan field that
points to the previous span's next field. This simplifies some of the
list manipulation code, but is going to make it very hard to convince
the compiler that mspan list manipulations don't need write barriers.
Fix this by using a more traditional ("boring") linked list that uses
a simple *mspan pointer to the previous mspan. This complicates some
of the list manipulation slightly, but it will let us eliminate all
write barriers from the mspan list manipulation code by marking mspan
go:notinheap.
Change-Id: I0d0b212db5f20002435d2a0ed2efc8aa0364b905
Reviewed-on: https://go-review.googlesource.com/30940
Reviewed-by: Rick Hudson <rlh@golang.org>
This adds a //go:notinheap pragma for declarations of types that must
not be heap allocated. We ensure these rules by disallowing new(T),
make([]T), append([]T), or implicit allocation of T, by disallowing
conversions to notinheap types, and by propagating notinheap to any
struct or array that contains notinheap elements.
The utility of this pragma is that we can eliminate write barriers for
writes to pointers to go:notinheap types, since the write barrier is
guaranteed to be a no-op. This will let us mark several scheduler and
memory allocator structures as go:notinheap, which will let us
disallow write barriers in the scheduler and memory allocator much
more thoroughly and also eliminate some problematic hybrid write
barriers.
This also makes go:nowritebarrierrec and go:yeswritebarrierrec much
more powerful. Currently we use go:nowritebarrier all over the place,
but it's almost never what you actually want: when write barriers are
illegal, they're typically illegal for a whole dynamic scope. Partly
this is because go:nowritebarrier has been around longer, but it's
also because go:nowritebarrierrec couldn't be used in situations that
had no-op write barriers or where some nested scope did allow write
barriers. go:notinheap eliminates many no-op write barriers and
go:yeswritebarrierrec makes it possible to opt back in to write
barriers, so these two changes will let us use go:nowritebarrierrec
far more liberally.
This updates #13386, which is about controlling pointers from non-GC'd
memory to GC'd memory. That would require some additional pragma (or
pragmas), but could build on this pragma.
Change-Id: I6314f8f4181535dd166887c9ec239977b54940bd
Reviewed-on: https://go-review.googlesource.com/30939
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This pragma cancels the effect of go:nowritebarrierrec. This is useful
in the scheduler because there are places where we enter a function
without a valid P (and hence cannot have write barriers), but then
obtain a P. This allows us to annotate the function with
go:nowritebarrierrec and split out the part after we've obtained a P
into a go:yeswritebarrierrec function.
Change-Id: Ic8ce4b6d3c074a1ecd8280ad90eaf39f0ffbcc2a
Reviewed-on: https://go-review.googlesource.com/30938
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
To compile:
m[k] = v
instead of:
mapassign(maptype, m, &k, &v), do
do:
*mapassign(maptype, m, &k) = v
mapassign returns a pointer to the value slot in the map. It is just
like mapaccess except that it will allocate a new slot if k is not
already present in the map.
This makes map accesses faster but potentially larger (codewise).
It is faster because the write into the map is done when the compiler
knows the concrete type, so it can be done with a few store
instructions instead of calling typedmemmove. We also potentially
avoid stack temporaries to hold v.
The code can be larger when the map has pointers in its value type,
since there is a write barrier call in addition to the mapassign call.
That makes the code at the callsite a bit bigger (go binary is 0.3%
bigger).
This CL is in preparation for doing operations like m[k] += v with
only a single runtime call. That will roughly double the speed of
such operations.
Update #17133
Update #5147
Change-Id: Ia435f032090a2ed905dac9234e693972fe8c2dc5
Reviewed-on: https://go-review.googlesource.com/30815
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Update comments for duffzero and duffcopy
which referred to legacy locations:
+ cmd/?g/cgen.go
+ cmd/?g/ggen.go
Remnants of the old days when we had 5g, 6g etc.
Those locations have since moved to:
+ cmd/compile/internal/<arch>/cgen.go
+ cmd/compile/internal/<arch>/ggen.go
Change-Id: Ie2ea668559d52d42b747260ea69a6d5b3d70e859
Reviewed-on: https://go-review.googlesource.com/29073
Reviewed-by: Russ Cox <rsc@golang.org>
Add checks for failure of CreateEvent, SetEvent or
WaitForSingleObject. Any failures are considered fatal and
will throw() after printing an informative message.
Updates #16646
Change-Id: I3bacf9001d2abfa8667cc3aff163ff2de1c99915
Reviewed-on: https://go-review.googlesource.com/26655
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Fix typo in word synchronization in comments.
Change-Id: I453b4e799301e758799c93df1e32f5244ca2fb84
Reviewed-on: https://go-review.googlesource.com/25174
Reviewed-by: Russ Cox <rsc@golang.org>
The canBackTrace variable is true for all of the architectures
Go supports and this is likely to remain the case as new
architectures are added.
Change-Id: I73900c018eb4b2e5c02fccd8d3e89853b2ba9d90
Reviewed-on: https://go-review.googlesource.com/22423
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
ARM direct CALL/JMP instruction has 24 bit offset, which can only
encodes jumps within +/-32M. When the target is too far, the top
bits get truncated and the program jumps wild.
This CL detects too-far jumps and automatically insert trampolines,
currently only internal linking on ARM.
It is necessary to make the following changes to the linker:
- Resolve direct jump relocs when assigning addresses to functions.
this allows trampoline insertion without moving all code that
already laid down.
- Lay down packages in dependency order, so that when resolving a
inter-package direct jump reloc, the target address is already
known. Intra-package jumps are assumed never too far.
- a linker flag -debugtramp is added for debugging trampolines:
"-debugtramp=1 -v" prints trampoline debug message
"-debugtramp=2" forces all inter-package jump to use
trampolines (currently ARM only)
"-debugtramp=2 -v" does both
- Some data structures are changed for bookkeeping.
On ARM, pseudo DIV/DIVU/MOD/MODU instructions now clobber R8
(unfortunate). In the standard library there is no ARM assembly
code that uses these instructions, and the compiler no longer emits
them (CL 29390).
all.bash passes with -debugtramp=2, except a disassembly test (this
is unavoidable as we changed the instruction).
TBD: debug info of trampolines?
Fixes#17028.
Change-Id: Idcce347ea7e0af77c4079041a160b2f6e114b474
Reviewed-on: https://go-review.googlesource.com/29397
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
If we get a SIGPROF on a non-Go thread, and the program has not called
runtime.SetCgoTraceback so we have no way to collect a stack trace, then
record a profile that is just the PC where the signal occurred. That
will at least point the user to the right area.
Retrieving the PC from the sigctxt in a signal handler on a non-G thread
required marking a number of trivial sigctxt methods as nosplit, and,
for extra safety, nowritebarrierrec.
The test shows that the existing test CgoPprofThread test does not test
the stack trace, just the profile signal. Leaving that for later.
Change-Id: I8f8f3ff09ac099fc9d9df94b5a9d210ffc20c4ab
Reviewed-on: https://go-review.googlesource.com/30252
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
CL 14472 solved issue #12030 by explicitly linking msvcrt.dll
to every cgo executable we build. This CL achieves the same
by manually loading ntdll.dll during startup.
Updates #12030
Change-Id: I5d9cd925ef65cc34c5d4031c750f0f97794529b2
Reviewed-on: https://go-review.googlesource.com/30737
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Currently these are labeled "MARK", which was accurate in the STW
collector, but these really indicate mark termination now, since
marking happens for the full duration of the concurrent GC. Re-label
them as "MARK TERMINATION" to clarify this.
Change-Id: Ie98bd961195acde49598b4fa3f9e7d90d757c0a6
Reviewed-on: https://go-review.googlesource.com/30018
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
When GC is disabled, we set gcpercent to -1. However, we still use
gcpercent to compute several values, such as next_gc and gc_trigger.
These calculations are meaningless when gcpercent is -1 and result in
meaningless values. This is okay in a sense because we also never use
these values if gcpercent is -1, but they're confusing when exposed to
the user, for example via MemStats or the execution trace. It's
particularly unfortunate in the execution trace because it attempts to
plot the underflowed value of next_gc, which scales all useful
information in the heap row into oblivion.
Fix this by making next_gc ^0 when gcpercent < 0. This has the
advantage of being true in a way: next_gc is effectively infinite when
gcpercent < 0. We can also detect this special value when updating the
execution trace and report next_gc as 0 so it doesn't blow up the
display of the heap line.
Change-Id: I4f366e4451f8892a4908da7b2b6086bdc67ca9a9
Reviewed-on: https://go-review.googlesource.com/30016
Reviewed-by: Rick Hudson <rlh@golang.org>
On 64-bit big-endian GNU/Linux machines we need to treat sigset as a
single uint64, not as a pair of uint32 values. This fix was already made
for s390x, but not for ppc64 (which is big-endian--the little endian
version is known as ppc64le). So copy os_linux_390.x to
os_linux_be64.go, and use build constraints as needed.
Fixes#17361
Change-Id: Ia0eb18221a8f5056bf17675fcfeb010407a13fb0
Reviewed-on: https://go-review.googlesource.com/30602
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The CgoExternalThreadSIGPROF test starts a thread at constructor time
that does a busy loop. That can throw off some other tests. So only
build that code if testprogcgo is built with the tag threadprof, and
adjust the tests that use that code to pass that build tag.
This revealed that the CgoPprofThread test was not testing what it
should have, as it never actually started the cpuHog thread. It was
passing because of the busy loop thread. Fix it to start the thread as
intended.
Change-Id: I087a9e4fc734a86be16a287456441afac5676beb
Reviewed-on: https://go-review.googlesource.com/30362
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
In the function prologue, we emit a jump to the beginning of
the function immediately after calling morestack. And in the
runtime stack growing code, it decodes and emulates that jump.
This emulation was necessary before we had per-PC SP deltas,
since the traceback code assumed that the frame size was fixed
for the whole function, except on the first instruction where
it was 0. Since we now have per-PC SP deltas and PCDATA, we
can correctly record that the frame size is 0. This makes the
emulation unnecessary.
This may be helpful for registerized calling convention, where
there may be unspills of arguments after calling morestack. It
also simplifies the runtime.
Change-Id: I7ebee31eaee81795445b33f521ab6a79624c4ceb
Reviewed-on: https://go-review.googlesource.com/30138
Reviewed-by: David Chase <drchase@google.com>
Calling cgocallback from a signal handler can fail when using the race
detector. Calling cgocallback will lead to a call to newextram which
will call oneNewExtraM which will call racegostart. The racegostart
function will set up some race detector data structures, and doing that
will sometimes call the C memory allocator. If we are running the signal
handler from a signal that interrupted the C memory allocator, we will
crash or hang.
Instead, change the signal handler code to call needm and dropm. The
needm function will grab allocated m and g structures and initialize the
g to use the current stack--the signal stack. That is all we need to
safely call code that allocates memory and checks whether it needs to
split the stack. This may temporarily leave us with no m available to
run a cgo callback, but that is OK in this case since the code we call
will quickly either crash or call dropm to return the m.
Implementing this required changing some of the setSignalstackSP
functions to avoid a write barrier. These functions never need a write
barrier but in some cases generated one anyhow because on some systems
the ss_sp field is a pointer.
Change-Id: I3893f47c3a66278f85eab7f94c1ab11d4f3be133
Reviewed-on: https://go-review.googlesource.com/30218
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Change-Id: I8fd271066925734c3f7196f64db04f27c4ce27cb
Reviewed-on: https://go-review.googlesource.com/30274
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
I avoided anywhere in the compiler or things which might be used by
the compiler in the future, since they need to build with Go 1.4.
I also avoided anywhere where there was no benefit to changing it.
I probably missed some.
Updates #16721
Change-Id: Ib3c895ff475c6dec2d4322393faaf8cb6a6d4956
Reviewed-on: https://go-review.googlesource.com/30250
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
In particular, it wasn't obvious that some values are special (unless
you also found those special values), so document that it isn't
necessarily a hash value.
Change-Id: Iff292822b44408239e26cd882dc07be6df2c1d38
Reviewed-on: https://go-review.googlesource.com/30143
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
gcDumpObject is often used on a stack pointer (for example, when
checkmark finds an unmarked object on the stack), but since stack
spans don't have an elemsize, it doesn't print any of the memory from
the frame. Make it at least slightly more useful by printing
everything between obj and obj+off (inclusive). While we're here, also
print out the span state.
Change-Id: I51be064ea8791b4a365865bfdc7afa7b5aaecfbd
Reviewed-on: https://go-review.googlesource.com/30142
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently span states are untyped constants and the field is just a
uint8. Make this more type-safe by introducing a type for the span
state.
Change-Id: I369bf59fe6e8234475f4921611424fceb7d0a6de
Reviewed-on: https://go-review.googlesource.com/30141
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently the SetFinalizer documentation makes a strong claim that
SetFinalizer will panic if the pointer is not to an object allocated
by calling new, to a composite literal, or to a local variable. This
is not true. For example, it doesn't panic when passed the address of
a package-level variable. Nor can we practically make it true. For
example, we can't distinguish between passing a pointer to a composite
literal and passing a pointer to its first field.
Hence, weaken the guarantee to say that it "may" panic.
Updates #17311. (Might fix it, depending on what we want to do with
package-level variables.)
Change-Id: I1c68ea9d0a5bbd3dd1b7ce329d92b0f05e2e0877
Reviewed-on: https://go-review.googlesource.com/30137
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
In FreeBSD 10.0, the _umtx_op syscall was changed to allow sleeping on
any supported clock, but the default clock was switched from a monotonic
clock to CLOCK_REALTIME.
Prior to 10.0, the __umtx_op_wait* functions ignored the fourth argument
to _umtx_op (uaddr1), expected the fifth argument (uaddr2) to be a
struct timespec pointer, and used a monotonic clock (nanouptime(9)) for
timeout calculations.
Since 10.0, if callers want a clock other than CLOCK_REALTIME, they must
call _umtx_op with uaddr1 set to a value greater than sizeof(struct
timespec), and with uaddr2 as pointer to a struct _umtx_time, rather
than a timespec. Callers can set the _clockid field of the struct
_umtx_time to request the clock they want.
The relevant FreeBSD commit:
https://svnweb.freebsd.org/base?view=revision&revision=232144Fixes#17168
Change-Id: I3dd7b32b683622b8d7b4a6a8f9eb56401bed6bdf
Reviewed-on: https://go-review.googlesource.com/30154
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The endcgo function call is currently deferred in case a cgo
callback into Go panics and unwinds through cgocall. Typical cgo
calls do not have callbacks into Go, and even fewer panic, so we
pay the cost of this defer for no typical benefit.
Amazingly, there is another defer on the cgocallback path also used
to cleanup in case the Go code called by cgo panics. This CL folds
the first defer into the second, to reduce the cost of typical cgo
calls.
This reduces the overhead for a no-op cgo call significantly:
name old time/op new time/op delta
CgoNoop-8 93.5ns ± 0% 51.1ns ± 1% -45.34% (p=0.016 n=4+5)
The total effect between Go 1.7 and 1.8 is even greater, as CL 29656
reduced the cost of defer recently. Hopefully a future Go release
will drop the cost of defer to nothing, making this optimization
unnecessary. But until then, this is nice.
Change-Id: Id1a5648f687a87001d95bec6842e4054bd20ee4f
Reviewed-on: https://go-review.googlesource.com/30080
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>