
qbit/go

mirror of https://github.com/golang/go synced 2024-11-08 03:36:12 -07:00

Author	SHA1	Message	Date
Josh Bleecher Snyder	031f71efdf	runtime: add TestSizeof Borrowed from cmd/compile, TestSizeof ensures that the size of important types doesn't change unexpectedly. It also helps reviewers see the impact of intended changes. Change-Id: If57955f0c3e66054de3f40c6bba585b88694c7be Reviewed-on: https://go-review.googlesource.com/99837 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-09 17:03:25 +00:00
Tobias Klauser	91f74069ef	runtime: fix comment for hwcap on linux/arm hwcap is set in archauxv, setup_auxv no longer exists. Change-Id: I0fc9393e0c1c45192e0eff4715e9bdd69fab2653 Reviewed-on: https://go-review.googlesource.com/99779 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-09 16:02:38 +00:00
Austin Clements	5d22cebb12	runtime: explain and enforce that _panic values live on the stack It's a bit mysterious that _defer.sp is a uintptr that gets stack-adjusted explicitly while _panic.argp is an unsafe.Pointer that doesn't, but turns out to be critically important when a deferred function grows the stack before doing a recover. Add a comment explaining that this works because _panic values live on the stack. Enforce this by marking _panic go:notinheap. Change-Id: I9ca49e84ee1f86d881552c55dccd0662b530836b Reviewed-on: https://go-review.googlesource.com/99735 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-03-08 23:35:46 +00:00
Austin Clements	60a9e5d613	runtime: ensure abort actually crashes the process On all non-x86 arches, runtime.abort simply reads from nil. Unfortunately, if this happens on a user stack, the signal handler will dutifully turn this into a panicmem, which lets user defers run and which user code can even recover from. To fix this, add an explicit check to the signal handler that turns faults in abort into hard crashes directly in the signal handler. This has the added benefit of giving a register dump at the abort point. Change-Id: If26a7f13790745ee3867db7f53b72d8281176d70 Reviewed-on: https://go-review.googlesource.com/93661 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:55:55 +00:00
Austin Clements	c950a90d72	runtime: call abort instead of raw INT $3 or bad MOV Everything except for amd64, amd64p32, and 386 currently defines and uses an abort function. This CL makes these match. The next CL will recognize the abort function to make this more useful. Change-Id: I7c155871ea48919a9220417df0630005b444f488 Reviewed-on: https://go-review.googlesource.com/93660 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:55:54 +00:00
Austin Clements	7f1b2738bb	runtime: make throw safer to call Currently, throw may grow the stack, which means whenever we call it from a context where it's not safe to grow the stack, we first have to switch to the system stack. This is pretty easy to get wrong. Fix this by making throw switch to the system stack so it doesn't grow the stack and is hence safe to call without a system stack switch at the call site. The only thing this complicates is badsystemstack itself, which would now go into an infinite loop before printing anything (previously it would also go into an infinite loop, but would at least print the error first). Fix this by making badsystemstack do a direct write and then crash hard. Change-Id: Ic5b4a610df265e47962dcfa341cabac03c31c049 Reviewed-on: https://go-review.googlesource.com/93659 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:55:52 +00:00
Austin Clements	9d59234cbe	runtime: move unrecoverable panic handling to the system stack Currently parts of unrecoverable panic handling (notably, printing panic messages) can happen on the user stack. This may grow the stack, which is generally fine, but if we're handling a runtime panic, it's better to do as little as possible in case the runtime is in an inconsistent state. Hence, this commit rearranges the handling of unrecoverable panics so that it's done entirely on the system stack. This is mostly a matter of shuffling code a bit so everything can move into a systemstack block. The one slight subtlety is in the "panic during panic" case, where we now depend on startpanic_m's caller to print the stack rather than startpanic_m itself. To make this work, startpanic_m now returns a boolean indicating that the caller should avoid trying to print any panic messages and get right to the stack trace. Since the caller is already in a position to do this, this actually simplifies things a little. Change-Id: Id72febe8c0a9fb31d9369b600a1816d65a49bfed Reviewed-on: https://go-review.googlesource.com/93658 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:55:51 +00:00
Ian Lance Taylor	3d69ef37b8	runtime: use systemstack around throw in sysSigaction Try to fix the build on ppc64-linux and ppc64le-linux, avoiding: --- FAIL: TestInlinedRoutineRecords (2.12s) dwarf_test.go:97: build: # command-line-arguments runtime.systemstack: nosplit stack overflow 752 assumed on entry to runtime.sigtrampgo (nosplit) 480 after runtime.sigtrampgo (nosplit) uses 272 400 after runtime.sigfwdgo (nosplit) uses 80 264 after runtime.setsig (nosplit) uses 136 208 after runtime.sigaction (nosplit) uses 56 136 after runtime.sysSigaction (nosplit) uses 72 88 after runtime.throw (nosplit) uses 48 16 after runtime.dopanic (nosplit) uses 72 -16 after runtime.systemstack (nosplit) uses 32 dwarf_test.go:98: build error: exit status 2 --- FAIL: TestAbstractOriginSanity (10.22s) dwarf_test.go:97: build: # command-line-arguments runtime.systemstack: nosplit stack overflow 752 assumed on entry to runtime.sigtrampgo (nosplit) 480 after runtime.sigtrampgo (nosplit) uses 272 400 after runtime.sigfwdgo (nosplit) uses 80 264 after runtime.setsig (nosplit) uses 136 208 after runtime.sigaction (nosplit) uses 56 136 after runtime.sysSigaction (nosplit) uses 72 88 after runtime.throw (nosplit) uses 48 16 after runtime.dopanic (nosplit) uses 72 -16 after runtime.systemstack (nosplit) uses 32 dwarf_test.go:98: build error: exit status 2 FAIL FAIL cmd/link/internal/ld 13.404s Change-Id: I4840604adb0e9f68a8d8e24f2f2a1a17d1634a58 Reviewed-on: https://go-review.googlesource.com/99415 Reviewed-by: Austin Clements <austin@google.com>	2018-03-08 16:35:53 +00:00
Ian Lance Taylor	419c06455a	runtime: get traceback from VDSO code Currently if a profiling signal arrives while executing within a VDSO the profiler will report _ExternalCode, which is needlessly confusing for a pure Go program. Change the VDSO calling code to record the caller's PC/SP, so that we can do a traceback from that point. If that fails for some reason, report _VDSO rather than _ExternalCode, which should at least point in the right direction. This adds some instructions to the code that calls the VDSO, but the slowdown is reasonably negligible: name old time/op new time/op delta ClockVDSOAndFallbackPaths/vDSO-8 40.5ns ± 2% 41.3ns ± 1% +1.85% (p=0.002 n=10+10) ClockVDSOAndFallbackPaths/Fallback-8 41.9ns ± 1% 43.5ns ± 1% +3.84% (p=0.000 n=9+9) TimeNow-8 41.5ns ± 3% 41.5ns ± 2% ~ (p=0.723 n=10+10) Fixes #24142 Change-Id: Iacd935db3c4c782150b3809aaa675a71799b1c9c Reviewed-on: https://go-review.googlesource.com/97315 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-03-07 23:35:25 +00:00
Ian Lance Taylor	c2f28de732	runtime: change from rt_sigaction to sigaction This normalizes the Linux code to act like other targets. The size argument to the rt_sigaction system call is pushed to a single function, sysSigaction. This is intended as a simplification step for CL 93875 for #14327. Change-Id: I594788e235f0da20e16e8a028e27ac8c883907c4 Reviewed-on: https://go-review.googlesource.com/99077 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-03-07 23:30:02 +00:00
Elias Naur	7a2a96d6ad	runtime/cgo: make sure nil is undefined before defining it While working on standalone builds of gomobile bindings, I ran into errors of the form: gcc_darwin_arm.c:30:31: error: ambiguous expansion of macro 'nil' [-Werror,-Wambiguous-macro] /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS11.2.sdk/usr/include/MacTypes.h:94:15: note: expanding this definition of 'nil' Fix it by undefining nil before defining it in libcgo.h. Change-Id: I8e9660a68c6c351e592684d03d529f0d182c0493 Reviewed-on: https://go-review.googlesource.com/99215 Run-TryBot: Elias Naur <elias.naur@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-07 21:08:19 +00:00
Yuval Pavel Zholkover	083f3957b8	runtime: add missing build constraints to os_linux_{be64,noauxv,novdso,ppc64x}.go files They do not match the file name patterns of _GOOS _GOARCH *_GOOS_GOARCH therefore the implicit linux constraint was not being added. Change-Id: Ie506c51cee6818db445516f96fffaa351df62cf5 Reviewed-on: https://go-review.googlesource.com/99116 Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-07 14:26:19 +00:00
Matthew Dempsky	2c0c68d621	cmd/compile: fix miscompilation of "defer delete(m, k)" Previously, for slow map key types (i.e., any type other than a 32-bit or 64-bit plain memory type), we would rewrite defer delete(m, k) into ktmp := k defer delete(m, &ktmp) However, if the defer statement was inside a loop, we would end up reusing the same ktmp value for all of the deferred deletes. We already rewrite defer print(x, y, z) into defer func(a1, a2, a3) { print(a1, a2, a3) }(x, y, z) This CL generalizes this rewrite to also apply for slow map deletes. This could be extended to apply even more generally to other builtins, but as discussed on #24259, there are cases where we must not do this (e.g., "defer recover()"). However, if we elect to do this more generally, this CL should still make that easier. Lastly, while here, fix a few issues in wrapCall (nee walkprintfunc): 1) lookupN appends the generation number to the symbol anyway, so "%d" was being literally included in the generated function names. 2) walkstmt will be called when the function is compiled later anyway, so no need to do it now. Fixes #24259. Change-Id: I70286867c64c69c18e9552f69e3f4154a0fc8b04 Reviewed-on: https://go-review.googlesource.com/99017 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-06 23:33:28 +00:00
Josh Bleecher Snyder	f7739c07c8	runtime: skip pointless writes in freedefer Change-Id: I501a0e5c87ec88616c7dcdf1b723758b6df6c088 Reviewed-on: https://go-review.googlesource.com/98758 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-06 18:58:57 +00:00
Tobias Klauser	9745397e1d	runtime: fix stack switch check in walltime/nanotime on linux/arm CL 98095 got the check wrong. We should be testing 'getg() == getg().m.curg', not 'getg().m == getg().m.curg'. Change-Id: I32f6238b00409b67afa8efe732513d542aec5bc7 Reviewed-on: https://go-review.googlesource.com/98855 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-06 14:24:19 +00:00
Meng Zhuo	8916773a3d	runtime, cmd/compile: use ldp for DUFFCOPY on ARM64 name old time/op new time/op delta CopyFat8 2.15ns ± 1% 2.19ns ± 6% ~ (p=0.171 n=8+9) CopyFat12 2.15ns ± 0% 2.17ns ± 2% ~ (p=0.137 n=8+10) CopyFat16 2.17ns ± 3% 2.15ns ± 0% ~ (p=0.211 n=10+10) CopyFat24 2.16ns ± 1% 2.15ns ± 0% ~ (p=0.087 n=10+10) CopyFat32 11.5ns ± 0% 12.8ns ± 2% +10.87% (p=0.000 n=8+10) CopyFat64 20.2ns ± 2% 12.9ns ± 0% -36.11% (p=0.000 n=10+10) CopyFat128 37.2ns ± 0% 21.5ns ± 0% -42.20% (p=0.000 n=10+10) CopyFat256 71.6ns ± 0% 38.7ns ± 0% -45.95% (p=0.000 n=10+10) CopyFat512 140ns ± 0% 73ns ± 0% -47.86% (p=0.000 n=10+9) CopyFat520 142ns ± 0% 74ns ± 0% -47.54% (p=0.000 n=10+10) CopyFat1024 277ns ± 0% 141ns ± 0% -49.10% (p=0.000 n=10+10) Change-Id: If54bc571add5db674d5e081579c87e80153d0a5a Reviewed-on: https://go-review.googlesource.com/97395 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-06 04:14:59 +00:00
Hana Kim	d3946f75d3	internal/trace: remove backlinks from span/task end to start This is an updated version of golang.org/cl/96395, with the fix to TestUserSpan. This reverts commit 7b6f6267e90a8e4eab37a3f2164ba882e6222adb. Change-Id: I31eec8ba0997f9178dffef8dac608e731ab70872 Reviewed-on: https://go-review.googlesource.com/98236 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-05 20:10:22 +00:00
Ian Lance Taylor	7178267b59	runtime: rename vdso symbols to use camel case This was originally C code using names with underscores, which were retained when the code was rewritten into Go. Change the code to use Go-like camel case names. The names that come from the ELF ABI are left unchanged. Change-Id: I181bc5dd81284c07bc67b7df4635f4734b41d646 Reviewed-on: https://go-review.googlesource.com/98520 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-05 19:12:32 +00:00
Tobias Klauser	5f80e70912	runtime: remove unused SYS_* definitions on Linux Also fix the indentation of the SYS_* definitions in sys_linux_mipsx.s and order them numerically. Change-Id: I0c454301c329a163e7db09dcb25d4e825149858c Reviewed-on: https://go-review.googlesource.com/98448 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-05 18:32:08 +00:00
Keith Randall	ee58eccc56	internal/bytealg: move short string Index implementations into bytealg Also move the arm64 CountByte implementation while we're here. Fixes #19792 Change-Id: I1e0fdf1e03e3135af84150a2703b58dad1b0d57e Reviewed-on: https://go-review.googlesource.com/98518 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-04 19:49:44 +00:00
Keith Randall	f6332bb84a	internal/bytealg: move compare functions to bytealg Move bytes.Compare and runtime·cmpstring to bytealg. Update #19792 Change-Id: I139e6d7c59686bef7a3017e3dec99eba5fd10447 Reviewed-on: https://go-review.googlesource.com/98515 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-04 17:49:39 +00:00
Keith Randall	45964e4f9c	internal/bytealg: move Count to bytealg Move bytes.Count and strings.Count to bytealg. Update #19792 Change-Id: I3e4e14b504a0b71758885bb131e5656e342cf8cb Reviewed-on: https://go-review.googlesource.com/98495 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-04 17:49:25 +00:00
Tobias Klauser	51b027116c	runtime: use vDSO for clock_gettime on linux/arm Use the __vdso_clock_gettime fast path via the vDSO on linux/arm to speed up nanotime and walltime. This results in the following performance improvement for time.Now on a RaspberryPi 3 (running 32bit Raspbian, i.e. GOOS=linux/GOARCH=arm): name old time/op new time/op delta TimeNow 0.99µs ± 0% 0.39µs ± 1% -60.74% (p=0.000 n=12+20) Change-Id: I3598278a6c88d7f6a6ce66c56b9d25f9dd2f4c9a Reviewed-on: https://go-review.googlesource.com/98095 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-03 12:12:58 +00:00
Tobias Klauser	c69f60d071	runtime: remove unused __vdso_time_sym It's unused since https://golang.org/cl/99320043 Change-Id: I74d69ff894aa2fb556f1c2083406c118c559d91b Reviewed-on: https://go-review.googlesource.com/98195 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-03 12:11:38 +00:00
Keith Randall	1dfa380e3d	internal/bytealg: move equal functions to bytealg Move bytes.Equal, runtime.memequal, and runtime.memequal_varlen to the bytealg package. Update #19792 Change-Id: Ic4175e952936016ea0bda6c7c3dbb33afdc8e4ac Reviewed-on: https://go-review.googlesource.com/98355 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-03 04:18:27 +00:00
Keith Randall	403ab0f221	internal/bytealg: move IndexByte assembly to the new bytealg package Move the IndexByte function from the runtime to a new bytealg package. The new package will eventually hold all the optimized assembly for groveling through byte slices and strings. It seems a better home for this code than randomly keeping it in runtime. Once this is in, the next step is to move the other functions (Compare, Equal, ...). Update #19792 This change seems complicated enough that we might just declare "not worth it" and abandon. Opinions welcome. The core assembly is all unchanged, except minor modifications where the code reads cpu feature bits. The wrapper functions have been cleaned up as they are now actually checked by vet. Change-Id: I9fa75bee5d85db3a65b3fd3b7997e60367523796 Reviewed-on: https://go-review.googlesource.com/98016 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-02 22:46:15 +00:00
Zhou Peng	b77aad0891	runtime: fix typo, func comments should start with function name Change-Id: I289af4884583537639800e37928c22814d38cba9 Reviewed-on: https://go-review.googlesource.com/98115 Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>	2018-03-02 12:03:30 +00:00
Brad Fitzpatrick	1fadbc1a76	Revert "runtime: use bytes.IndexByte in findnull" This reverts commit `7365fac2db`. Reason for revert: breaks the build on some architectures, reading unmapped pages? Change-Id: I3a8c02dc0b649269faacea79ecd8213defa97c54 Reviewed-on: https://go-review.googlesource.com/97995 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-01 22:22:51 +00:00
Balaram Makam	213a75171d	runtime: improve arm64 memmove implementation Improve runtime memmove_arm64.s specializing for small copies and processing 32 bytes per iteration for 32 bytes or more. Benchmark results of runtime/Memmove on Amberwing: name old time/op new time/op delta Memmove/0 7.61ns ± 0% 7.20ns ± 0% ~ (p=0.053 n=5+7) Memmove/1 9.28ns ± 0% 8.80ns ± 0% -5.17% (p=0.000 n=4+8) Memmove/2 9.65ns ± 0% 9.20ns ± 0% -4.68% (p=0.000 n=5+8) Memmove/3 10.0ns ± 0% 9.2ns ± 0% -7.83% (p=0.000 n=5+8) Memmove/4 10.6ns ± 0% 9.2ns ± 0% -13.21% (p=0.000 n=5+8) Memmove/5 11.0ns ± 0% 9.2ns ± 0% -16.36% (p=0.000 n=5+8) Memmove/6 12.4ns ± 0% 9.2ns ± 0% -25.81% (p=0.000 n=5+8) Memmove/7 13.1ns ± 0% 9.2ns ± 0% -29.56% (p=0.000 n=5+8) Memmove/8 9.10ns ± 1% 9.20ns ± 0% +1.08% (p=0.002 n=5+8) Memmove/9 9.67ns ± 0% 9.20ns ± 0% -4.88% (p=0.000 n=5+8) Memmove/10 10.4ns ± 0% 9.2ns ± 0% -11.54% (p=0.000 n=5+8) Memmove/11 10.9ns ± 0% 9.2ns ± 0% -15.60% (p=0.000 n=5+8) Memmove/12 11.5ns ± 0% 9.2ns ± 0% -20.00% (p=0.000 n=5+8) Memmove/13 12.4ns ± 0% 9.2ns ± 0% -25.81% (p=0.000 n=5+8) Memmove/14 13.1ns ± 0% 9.2ns ± 0% -29.77% (p=0.000 n=5+8) Memmove/15 13.8ns ± 0% 9.2ns ± 0% -33.33% (p=0.000 n=5+8) Memmove/16 9.70ns ± 0% 9.20ns ± 0% -5.19% (p=0.000 n=5+8) Memmove/32 10.6ns ± 0% 9.2ns ± 0% -13.21% (p=0.000 n=4+8) Memmove/64 13.4ns ± 0% 10.2ns ± 0% -23.88% (p=0.000 n=4+8) Memmove/128 18.1ns ± 1% 13.2ns ± 0% -26.99% (p=0.000 n=5+8) Memmove/256 25.2ns ± 0% 16.4ns ± 0% -34.92% (p=0.000 n=5+8) Memmove/512 36.4ns ± 0% 22.8ns ± 0% -37.36% (p=0.000 n=5+8) Memmove/1024 70.1ns ± 0% 36.8ns ±11% -47.49% (p=0.002 n=5+8) Memmove/2048 121ns ± 0% 61ns ± 0% ~ (p=0.053 n=5+7) Memmove/4096 224ns ± 0% 120ns ± 0% -46.43% (p=0.000 n=5+8) MemmoveUnalignedDst/0 8.40ns ± 0% 8.00ns ± 0% -4.76% (p=0.000 n=5+8) MemmoveUnalignedDst/1 9.87ns ± 1% 10.00ns ± 0% ~ (p=0.070 n=5+8) MemmoveUnalignedDst/2 10.6ns ± 0% 10.4ns ± 0% -1.89% (p=0.000 n=5+8) MemmoveUnalignedDst/3 10.8ns ± 0% 10.4ns ± 0% -3.70% (p=0.000 n=5+8) 
MemmoveUnalignedDst/4 10.9ns ± 0% 10.3ns ± 0% ~ (p=0.053 n=5+7) MemmoveUnalignedDst/5 11.5ns ± 0% 10.3ns ± 1% -10.22% (p=0.000 n=4+8) MemmoveUnalignedDst/6 13.2ns ± 0% 10.4ns ± 1% -21.50% (p=0.000 n=5+8) MemmoveUnalignedDst/7 13.7ns ± 0% 10.3ns ± 1% -24.64% (p=0.000 n=4+8) MemmoveUnalignedDst/8 10.1ns ± 0% 10.4ns ± 0% +2.97% (p=0.002 n=5+8) MemmoveUnalignedDst/9 10.7ns ± 0% 10.4ns ± 0% -2.80% (p=0.000 n=5+8) MemmoveUnalignedDst/10 11.2ns ± 1% 10.4ns ± 0% -6.81% (p=0.000 n=5+8) MemmoveUnalignedDst/11 11.6ns ± 0% 10.4ns ± 0% -10.34% (p=0.000 n=5+8) MemmoveUnalignedDst/12 12.5ns ± 2% 10.4ns ± 0% -16.53% (p=0.000 n=5+8) MemmoveUnalignedDst/13 13.7ns ± 0% 10.4ns ± 0% -24.09% (p=0.000 n=5+8) MemmoveUnalignedDst/14 14.0ns ± 0% 10.4ns ± 0% -25.71% (p=0.000 n=5+8) MemmoveUnalignedDst/15 14.6ns ± 0% 10.4ns ± 0% -28.77% (p=0.000 n=5+8) MemmoveUnalignedDst/16 10.5ns ± 0% 10.4ns ± 0% -0.95% (p=0.000 n=5+8) MemmoveUnalignedDst/32 12.4ns ± 0% 11.6ns ± 0% -6.05% (p=0.000 n=5+8) MemmoveUnalignedDst/64 15.2ns ± 0% 12.3ns ± 0% -19.08% (p=0.000 n=5+8) MemmoveUnalignedDst/128 18.7ns ± 0% 15.2ns ± 0% -18.72% (p=0.000 n=5+8) MemmoveUnalignedDst/256 25.1ns ± 0% 18.6ns ± 0% -25.90% (p=0.000 n=5+8) MemmoveUnalignedDst/512 37.8ns ± 0% 24.4ns ± 0% -35.45% (p=0.000 n=5+8) MemmoveUnalignedDst/1024 74.6ns ± 0% 40.4ns ± 0% ~ (p=0.053 n=5+7) MemmoveUnalignedDst/2048 133ns ± 0% 75ns ± 0% -43.91% (p=0.000 n=5+8) MemmoveUnalignedDst/4096 247ns ± 0% 141ns ± 0% -42.91% (p=0.000 n=5+8) MemmoveUnalignedSrc/0 8.40ns ± 0% 8.00ns ± 0% -4.76% (p=0.000 n=5+8) MemmoveUnalignedSrc/1 9.81ns ± 0% 10.00ns ± 0% +1.98% (p=0.002 n=5+8) MemmoveUnalignedSrc/2 10.5ns ± 0% 10.0ns ± 0% -4.76% (p=0.000 n=5+8) MemmoveUnalignedSrc/3 10.7ns ± 1% 10.0ns ± 0% -6.89% (p=0.000 n=5+8) MemmoveUnalignedSrc/4 11.3ns ± 0% 10.0ns ± 0% -11.50% (p=0.000 n=5+8) MemmoveUnalignedSrc/5 11.6ns ± 0% 10.0ns ± 0% -13.79% (p=0.000 n=5+8) MemmoveUnalignedSrc/6 13.6ns ± 0% 10.0ns ± 0% -26.47% (p=0.000 n=5+8) MemmoveUnalignedSrc/7 14.4ns ± 0% 10.0ns 
± 0% -30.75% (p=0.000 n=5+8) MemmoveUnalignedSrc/8 9.87ns ± 1% 10.00ns ± 0% ~ (p=0.070 n=5+8) MemmoveUnalignedSrc/9 10.4ns ± 0% 10.0ns ± 0% -3.85% (p=0.000 n=5+8) MemmoveUnalignedSrc/10 11.2ns ± 0% 10.0ns ± 0% -10.71% (p=0.000 n=5+8) MemmoveUnalignedSrc/11 11.8ns ± 0% 10.0ns ± 0% -15.25% (p=0.000 n=5+8) MemmoveUnalignedSrc/12 12.1ns ± 0% 10.0ns ± 0% -17.36% (p=0.000 n=5+8) MemmoveUnalignedSrc/13 13.6ns ± 0% 10.0ns ± 0% -26.47% (p=0.000 n=5+8) MemmoveUnalignedSrc/14 14.7ns ± 0% 10.0ns ± 0% -31.79% (p=0.000 n=5+8) MemmoveUnalignedSrc/15 14.4ns ± 0% 10.0ns ± 0% -30.56% (p=0.000 n=5+8) MemmoveUnalignedSrc/16 11.0ns ± 0% 10.0ns ± 0% -9.09% (p=0.000 n=5+8) MemmoveUnalignedSrc/32 11.5ns ± 0% 10.0ns ± 0% -13.04% (p=0.000 n=5+8) MemmoveUnalignedSrc/64 14.9ns ± 0% 11.2ns ± 0% -24.83% (p=0.000 n=4+8) MemmoveUnalignedSrc/128 19.5ns ± 0% 15.2ns ± 0% -22.05% (p=0.000 n=5+8) MemmoveUnalignedSrc/256 27.3ns ± 2% 19.2ns ± 0% -29.62% (p=0.000 n=5+8) MemmoveUnalignedSrc/512 40.4ns ± 0% 27.2ns ± 0% -32.67% (p=0.000 n=5+8) MemmoveUnalignedSrc/1024 75.4ns ± 0% 44.4ns ± 0% -41.15% (p=0.000 n=5+8) MemmoveUnalignedSrc/2048 131ns ± 0% 77ns ± 3% -41.56% (p=0.002 n=5+8) MemmoveUnalignedSrc/4096 248ns ± 0% 145ns ± 0% -41.53% (p=0.000 n=5+8) name old speed new speed delta Memmove/1 108MB/s ± 0% 114MB/s ± 0% +5.37% (p=0.004 n=4+8) Memmove/2 207MB/s ± 0% 217MB/s ± 0% +4.85% (p=0.002 n=5+8) Memmove/3 301MB/s ± 0% 326MB/s ± 0% +8.45% (p=0.002 n=5+8) Memmove/4 377MB/s ± 0% 435MB/s ± 0% +15.31% (p=0.004 n=4+8) Memmove/5 455MB/s ± 0% 543MB/s ± 0% +19.46% (p=0.002 n=5+8) Memmove/6 483MB/s ± 0% 652MB/s ± 0% +34.88% (p=0.003 n=5+7) Memmove/7 537MB/s ± 0% 761MB/s ± 0% +41.71% (p=0.002 n=5+8) Memmove/8 879MB/s ± 1% 869MB/s ± 0% -1.15% (p=0.000 n=5+7) Memmove/9 931MB/s ± 0% 978MB/s ± 0% +5.05% (p=0.002 n=5+8) Memmove/10 960MB/s ± 0% 1086MB/s ± 0% +13.13% (p=0.002 n=5+8) Memmove/11 1.00GB/s ± 0% 1.20GB/s ± 0% +18.92% (p=0.003 n=5+7) Memmove/12 1.04GB/s ± 0% 1.30GB/s ± 0% +25.40% (p=0.002 n=5+8) Memmove/13 
1.05GB/s ± 0% 1.41GB/s ± 0% +34.87% (p=0.002 n=5+8) Memmove/14 1.07GB/s ± 0% 1.52GB/s ± 0% +42.14% (p=0.002 n=5+8) Memmove/15 1.09GB/s ± 0% 1.63GB/s ± 0% +49.91% (p=0.002 n=5+8) Memmove/16 1.65GB/s ± 0% 1.74GB/s ± 0% +5.40% (p=0.003 n=5+7) Memmove/32 3.01GB/s ± 0% 3.48GB/s ± 0% +15.58% (p=0.003 n=5+7) Memmove/64 4.76GB/s ± 0% 6.27GB/s ± 0% +31.75% (p=0.003 n=5+7) Memmove/128 7.08GB/s ± 1% 9.69GB/s ± 0% +36.96% (p=0.002 n=5+8) Memmove/256 10.2GB/s ± 0% 15.6GB/s ± 0% +53.58% (p=0.002 n=5+8) Memmove/512 14.1GB/s ± 0% 22.4GB/s ± 0% +59.57% (p=0.003 n=5+7) Memmove/1024 14.6GB/s ± 0% 27.9GB/s ±10% +91.00% (p=0.002 n=5+8) Memmove/2048 16.9GB/s ± 0% 33.4GB/s ± 0% +98.32% (p=0.003 n=5+7) Memmove/4096 18.3GB/s ± 0% 33.9GB/s ± 0% +85.80% (p=0.002 n=5+8) MemmoveUnalignedDst/1 101MB/s ± 1% 100MB/s ± 0% ~ (p=0.586 n=5+8) MemmoveUnalignedDst/2 189MB/s ± 0% 192MB/s ± 0% +1.82% (p=0.002 n=5+8) MemmoveUnalignedDst/3 278MB/s ± 0% 288MB/s ± 0% +3.88% (p=0.003 n=5+7) MemmoveUnalignedDst/4 368MB/s ± 0% 387MB/s ± 0% +5.41% (p=0.003 n=5+7) MemmoveUnalignedDst/5 434MB/s ± 0% 484MB/s ± 0% +11.52% (p=0.002 n=5+8) MemmoveUnalignedDst/6 454MB/s ± 0% 580MB/s ± 0% +27.62% (p=0.002 n=5+8) MemmoveUnalignedDst/7 509MB/s ± 0% 677MB/s ± 0% +33.01% (p=0.002 n=5+8) MemmoveUnalignedDst/8 792MB/s ± 0% 770MB/s ± 0% -2.77% (p=0.002 n=5+8) MemmoveUnalignedDst/9 841MB/s ± 0% 866MB/s ± 0% +2.92% (p=0.002 n=5+8) MemmoveUnalignedDst/10 896MB/s ± 0% 962MB/s ± 0% +7.35% (p=0.003 n=5+7) MemmoveUnalignedDst/11 947MB/s ± 0% 1058MB/s ± 0% +11.80% (p=0.002 n=5+8) MemmoveUnalignedDst/12 962MB/s ± 2% 1154MB/s ± 0% +19.97% (p=0.002 n=5+8) MemmoveUnalignedDst/13 947MB/s ± 0% 1251MB/s ± 0% +32.08% (p=0.002 n=5+8) MemmoveUnalignedDst/14 1.00GB/s ± 0% 1.35GB/s ± 0% +34.55% (p=0.002 n=5+8) MemmoveUnalignedDst/15 1.03GB/s ± 0% 1.44GB/s ± 0% +40.50% (p=0.002 n=5+8) MemmoveUnalignedDst/16 1.53GB/s ± 0% 1.54GB/s ± 0% +0.77% (p=0.002 n=5+8) MemmoveUnalignedDst/32 2.58GB/s ± 0% 2.75GB/s ± 0% +6.52% (p=0.003 n=5+7) 
MemmoveUnalignedDst/64 4.21GB/s ± 0% 5.19GB/s ± 0% +23.40% (p=0.004 n=5+6) MemmoveUnalignedDst/128 6.86GB/s ± 0% 8.42GB/s ± 0% +22.78% (p=0.003 n=5+7) MemmoveUnalignedDst/256 10.2GB/s ± 0% 13.8GB/s ± 0% +35.15% (p=0.002 n=5+8) MemmoveUnalignedDst/512 13.5GB/s ± 0% 21.0GB/s ± 0% +54.90% (p=0.002 n=5+8) MemmoveUnalignedDst/1024 13.7GB/s ± 0% 25.3GB/s ± 0% +84.61% (p=0.003 n=5+7) MemmoveUnalignedDst/2048 15.3GB/s ± 0% 27.5GB/s ± 0% +79.52% (p=0.002 n=5+8) MemmoveUnalignedDst/4096 16.5GB/s ± 0% 28.9GB/s ± 0% +74.74% (p=0.002 n=5+8) MemmoveUnalignedSrc/1 102MB/s ± 0% 100MB/s ± 0% -2.02% (p=0.000 n=5+7) MemmoveUnalignedSrc/2 191MB/s ± 0% 200MB/s ± 0% +4.78% (p=0.002 n=5+8) MemmoveUnalignedSrc/3 279MB/s ± 0% 300MB/s ± 0% +7.45% (p=0.002 n=5+8) MemmoveUnalignedSrc/4 354MB/s ± 0% 400MB/s ± 0% +13.10% (p=0.002 n=5+8) MemmoveUnalignedSrc/5 431MB/s ± 0% 500MB/s ± 0% +16.02% (p=0.002 n=5+8) MemmoveUnalignedSrc/6 441MB/s ± 0% 600MB/s ± 0% +36.03% (p=0.002 n=5+8) MemmoveUnalignedSrc/7 485MB/s ± 0% 700MB/s ± 0% +44.29% (p=0.002 n=5+8) MemmoveUnalignedSrc/8 811MB/s ± 1% 800MB/s ± 0% -1.36% (p=0.016 n=5+8) MemmoveUnalignedSrc/9 864MB/s ± 0% 900MB/s ± 0% +4.07% (p=0.002 n=5+8) MemmoveUnalignedSrc/10 893MB/s ± 0% 999MB/s ± 0% +11.97% (p=0.002 n=5+8) MemmoveUnalignedSrc/11 932MB/s ± 0% 1099MB/s ± 0% +18.01% (p=0.002 n=5+8) MemmoveUnalignedSrc/12 988MB/s ± 0% 1199MB/s ± 0% +21.35% (p=0.002 n=5+8) MemmoveUnalignedSrc/13 955MB/s ± 0% 1299MB/s ± 0% +36.02% (p=0.002 n=5+8) MemmoveUnalignedSrc/14 955MB/s ± 0% 1399MB/s ± 0% +46.52% (p=0.002 n=5+8) MemmoveUnalignedSrc/15 1.04GB/s ± 0% 1.50GB/s ± 0% +44.18% (p=0.002 n=5+8) MemmoveUnalignedSrc/16 1.45GB/s ± 0% 1.60GB/s ± 0% +10.14% (p=0.002 n=5+8) MemmoveUnalignedSrc/32 2.78GB/s ± 0% 3.20GB/s ± 0% +15.16% (p=0.003 n=5+7) MemmoveUnalignedSrc/64 4.30GB/s ± 0% 5.72GB/s ± 0% +32.90% (p=0.003 n=5+7) MemmoveUnalignedSrc/128 6.57GB/s ± 0% 8.42GB/s ± 0% +28.06% (p=0.002 n=5+8) MemmoveUnalignedSrc/256 9.39GB/s ± 1% 13.33GB/s ± 0% +41.96% (p=0.002 n=5+8) 
MemmoveUnalignedSrc/512 12.7GB/s ± 0% 18.8GB/s ± 0% +48.53% (p=0.003 n=5+7) MemmoveUnalignedSrc/1024 13.6GB/s ± 0% 23.0GB/s ± 0% +69.82% (p=0.002 n=5+8) MemmoveUnalignedSrc/2048 15.6GB/s ± 0% 26.8GB/s ± 3% +71.37% (p=0.002 n=5+8) MemmoveUnalignedSrc/4096 16.5GB/s ± 0% 28.2GB/s ± 0% +71.40% (p=0.002 n=5+8) Fixes #22925 Change-Id: I38c1a9ad5c6e3f4f95fc521c4b7e3140b58b4737 Reviewed-on: https://go-review.googlesource.com/83799 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-01 20:34:11 +00:00
Josh Bleecher Snyder	7365fac2db	runtime: use bytes.IndexByte in findnull bytes.IndexByte is heavily optimized. Use it in findnull. name old time/op new time/op delta GoString-8 65.5ns ± 1% 40.2ns ± 1% -38.62% (p=0.000 n=19+19) findnull is also used in gostringnocopy, which is used in many hot spots in the runtime. Fixes #23830 Change-Id: I2e6cb279c7d8078f8844065de684cc3567fe89d7 Reviewed-on: https://go-review.googlesource.com/97523 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-01 20:34:07 +00:00
Hana Kim	e75f805e6f	runtime/trace: skip TestUserTaskSpan upon timestamp error Change-Id: I030baaa0a0abf1e43449faaf676d389a28a868a3 Reviewed-on: https://go-review.googlesource.com/97857 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> Reviewed-by: Peter Weinberger <pjw@google.com>	2018-03-01 18:38:49 +00:00
Josh Bleecher Snyder	9372e3f5ef	runtime: don't allocate to build strings of length 1 Use staticbytes instead. Instrumenting make.bash shows approx 0.5% of all slicebytetostrings have a buffer of length 1. name old time/op new time/op delta SliceByteToString/1-8 14.1ns ± 1% 4.1ns ± 1% -71.13% (p=0.000 n=17+20) SliceByteToString/2-8 15.5ns ± 2% 15.5ns ± 1% ~ (p=0.061 n=20+18) SliceByteToString/4-8 14.9ns ± 1% 15.0ns ± 2% +1.25% (p=0.000 n=20+20) SliceByteToString/8-8 17.1ns ± 1% 17.5ns ± 1% +2.16% (p=0.000 n=19+19) SliceByteToString/16-8 23.6ns ± 1% 23.9ns ± 1% +1.41% (p=0.000 n=20+18) SliceByteToString/32-8 26.0ns ± 1% 25.8ns ± 0% -1.05% (p=0.000 n=19+16) SliceByteToString/64-8 30.0ns ± 0% 30.2ns ± 0% +0.56% (p=0.000 n=16+18) SliceByteToString/128-8 38.9ns ± 0% 39.0ns ± 0% +0.23% (p=0.019 n=19+15) Fixes #24172 Change-Id: I3dfa14eefbf9fb4387114e20c9cb40e186abe962 Reviewed-on: https://go-review.googlesource.com/97717 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-01 17:38:06 +00:00
Josh Bleecher Snyder	aa9c1a8f80	runtime: fix amd64p32 indexbytes in presence of overflow When the slice/string length is very large, probably artificially large as in CL 97523, adding BX (length) to R11 (pointer) overflows. As a result, checking DI < R11 yields the wrong result. Since they will be equal when the loop is done, just check DI != R11 instead. Yes, the pointer itself could overflow, but if that happens, something else has gone pretty wrong; not our concern here. Fixes #24187 Change-Id: I2f60fc6ccae739345d01bc80528560726ad4f8c6 Reviewed-on: https://go-review.googlesource.com/97802 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-01 16:53:33 +00:00
Tobias Klauser	c7c01efd96	runtime: clean up libc_* definitions on Solaris All functions defined in syscall2_solaris.go have the respective libc_* var in syscall_solaris.go, except for libc_close. Move it from os3_solaris.go. Remove unused libc_fstat. Order go:cgo_import_dynamic and go:linkname lists in syscall2_solaris.go alphabetically. Change-Id: I9f12fa473cf1ae351448ac45597c82a67d799c31 Reviewed-on: https://go-review.googlesource.com/97736 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-01 07:31:53 +00:00
Richard Miller	c2cdfbd1a7	runtime: don't try to shrink address space with brk in Plan 9 Plan 9 won't let brk shrink the data segment if it's shared with other processes (which it is in the Go runtime). So we keep track of the notional end of the segment as it moves up and down, and call brk only when it grows. Corrects CL 94776. Updates #23860. Fixes #24013. Change-Id: I754232decab81dfd71d690f77ee6097a17d9be11 Reviewed-on: https://go-review.googlesource.com/97595 Reviewed-by: David du Colombier <0intro@gmail.com> Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: David du Colombier <0intro@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-02-28 15:57:10 +00:00
Keith Randall	2413b54888	cmd/compile: mark the first word of an interface as a uintptr The first word of an interface is a pointer, but for the purposes of GC we don't need to treat it as such. 1. If it is a non-empty interface, the pointer points to an itab which is always in persistentalloc space. 2. If it is an empty interface, the pointer points to a _type. a. If it is a compile-time-allocated type, it points into the read-only data section. b. If it is a reflect-allocated type, it points into the Go heap. Reflect is responsible for keeping a reference to the underlying type so it won't be GCd. If we ever have a moving GC, we need to change this for 2b (as well as scan itabs to update their itab._type fields). Write barriers on the first word of interfaces have already been removed. Change-Id: I643e91d7ac4de980ac2717436eff94097c65d959 Reviewed-on: https://go-review.googlesource.com/97518 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-02-27 22:58:32 +00:00
Tobias Klauser	5b21bf6f81	runtime: simplify walltime/nanotime on linux/{386,amd64} Avoid an unnecessary MOVL/MOVQ. Follow CL 97377 Change-Id: Ic43976d6b0cece3ed455496d18aedd67e0337d3f Reviewed-on: https://go-review.googlesource.com/97358 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-02-27 18:42:41 +00:00
Josh Bleecher Snyder	c5d6c42d35	runtime: improve 386/amd64 systemstack Minor improvements, noticed while investigating other things. Shorten the prologue. Make branch direction better for static branch prediction; the most common case by far is switching stacks (g==curg). Change-Id: Ib2211d3efecb60446355cda56194221ccb78057d Reviewed-on: https://go-review.googlesource.com/97377 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-02-27 18:10:38 +00:00
Josh Bleecher Snyder	486caa26d7	runtime: short-circuit typedmemmove when dst==src Change-Id: I855268a4c0d07ad602ec90f5da66422d3d87c5f2 Reviewed-on: https://go-review.googlesource.com/94595 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>	2018-02-27 00:56:18 +00:00
Ian Lance Taylor	804e3e565e	runtime: don't check for String/Error methods in printany They have either already been called by preprintpanics, or they can not be called safely because of the various conditions checked at the start of gopanic. Fixes #24059 Change-Id: I4a6233d12c9f7aaaee72f343257ea108bae79241 Reviewed-on: https://go-review.googlesource.com/96755 Reviewed-by: Austin Clements <austin@google.com>	2018-02-23 22:39:46 +00:00
Austin Clements	788464724c	runtime: reduce arena size to 4MB on 64-bit Windows Currently, we use 64MB heap arenas on 64-bit platforms. This works well on UNIX-like OSes because they treat untouched pages as essentially free. However, on Windows, committed memory is charged against a process whether or not it has demand-faulted physical pages in. Hence, on Windows, even a process with a tiny heap will commit 64MB for one heap arena, plus another 32MB for the arena map. Things are much worse under the race detector, which increases the heap commitment by a factor of 5.5X, leading to 384MB of committed memory at runtime init. Fix this by reducing the heap arena size to 4MB on Windows. To counterbalance the effect of increasing the arena map size by a factor of 16, and to further reduce the impact of the commitment for the arena map, we switch from a single entry L1 arena map to a 64 entry L1 arena map. Compared to the original arena design, this slows down the x/benchmarks garbage benchmark by 0.49% (the slow down of this commit alone is 1.59%, but the previous commit bought us a 1% speed-up): name old time/op new time/op delta Garbage/benchmem-MB=64-12 2.28ms ± 1% 2.29ms ± 1% +0.49% (p=0.000 n=17+18) (https://perf.golang.org/search?q=upload:20180223.1) (This was measured on linux/amd64 by modifying its arena configuration as above.) Fixes #23900. Change-Id: I6b7fa5ecebee2947bf20cfeb78c248809469c6b1 Reviewed-on: https://go-review.googlesource.com/96780 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-23 21:59:51 +00:00
Austin Clements	ec25210564	runtime: support a two-level arena map Currently, the heap arena map is a single, large array that covers every possible arena frame in the entire address space. This is practical up to about 48 bits of address space with 64 MB arenas. However, there are two problems with this: 1. mips64, ppc64, and s390x support full 64-bit address spaces (though on Linux only s390x has kernel support for 64-bit address spaces). On these platforms, it would be good to support these larger address spaces. 2. On Windows, processes are charged for untouched memory, so for processes with small heaps, the mostly-untouched 32 MB arena map plus a 64 MB arena are significant overhead. Hence, it would be good to reduce both the arena map size and the arena size, but with a single-level arena, these are inversely proportional. This CL adds support for a two-level arena map. Arena frame numbers are now divided into arenaL1Bits of L1 index and arenaL2Bits of L2 index. At the moment, arenaL1Bits is always 0, so we effectively have a single level map. We do a few things so that this has no cost beyond the current single-level map: 1. We embed the L2 array directly in mheap, so if there's a single entry in the L2 array, the representation is identical to the current representation and there's no extra level of indirection. 2. Hot code that accesses the arena map is structured so that it optimizes to nearly the same machine code as it does currently. 3. We make some small tweaks to hot code paths and to the inliner itself to keep some important functions inlined despite their now-larger ASTs. In particular, this is necessary for heapBitsForAddr and heapBits.next. Possibly as a result of some of the tweaks, this actually slightly improves the performance of the x/benchmarks garbage benchmark: name old time/op new time/op delta Garbage/benchmem-MB=64-12 2.28ms ± 1% 2.26ms ± 1% -1.07% (p=0.000 n=17+19) (https://perf.golang.org/search?q=upload:20180223.2) For #23900. 
Change-Id: If5164e0961754f97eb9eca58f837f36d759505ff Reviewed-on: https://go-review.googlesource.com/96779 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-23 21:59:50 +00:00
Austin Clements	33b76920ec	runtime: rename "arena index" to "arena map" There are too many places where I want to talk about "indexing into the arena index". Make this less awkward and ambiguous by calling it the "arena map" instead. Change-Id: I726b0667bb2139dbc006175a0ec09a871cdf73f9 Reviewed-on: https://go-review.googlesource.com/96777 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-23 21:59:48 +00:00
Austin Clements	9680980efe	runtime: don't assume arena is in address order On amd64, the arena is no longer in address space order, but currently the heap dumper assumes that it is. Fix this assumption. Change-Id: Iab1953cd36b359d0fb78ed49e5eb813116a18855 Reviewed-on: https://go-review.googlesource.com/96776 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-23 21:59:47 +00:00
mingrammer	fceaa2e242	runtime: rename the TestGcHashmapIndirection to TestGcMapIndirection There was still the word 'Hashmap' in gc_test.go, so I renamed it to just 'Map'. Previous renaming commit: https://golang.org/cl/90336 Change-Id: I5b0e5c2229d1c30937c7216247f4533effb81ce7 Reviewed-on: https://go-review.googlesource.com/96675 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-02-23 16:48:01 +00:00
Jerrin Shaji George	5b3cd56038	runtime: fix a few typos in comments Change-Id: I07a1eb02ffc621c5696b49491181300bf411f822 Reviewed-on: https://go-review.googlesource.com/96475 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-02-23 00:17:20 +00:00
Austin Clements	ea8d7a370d	runtime: clarify address space limit constants and comments Now that we support the full non-contiguous virtual address space of amd64 hardware, some of the comments and constants related to this are out of date. This renames memLimitBits to heapAddrBits because 1<<memLimitBits is no longer the limit of the address space and rewrites the comment to focus first on hardware limits (which span OSes) and then discuss kernel limits. Second, this eliminates the memLimit constant because there's no longer a meaningful "highest possible heap pointer value" on amd64. Updates #23862. Change-Id: I44b32033d2deb6b69248fb8dda14fc0e65c47f11 Reviewed-on: https://go-review.googlesource.com/95498 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-21 20:32:36 +00:00
Austin Clements	ed1959c6e6	runtime: offset the heap arena index by 2^47 on amd64 On amd64, the virtual address space, when interpreted as signed values, is [-2^47, 2^47). Currently, we only support heap addresses in the "positive" half of this, [0, 2^47). This suffices for linux/amd64 and windows/amd64, but solaris/amd64 can map user addresses in the negative part of this range. Specifically, addresses 0xFFFF8000'00000000 to 0xFFFFFD80'00000000 are part of user space. This leads to "memory allocated by OS not in usable address space" panic, since we don't map heap arena index space for these addresses. Fix this by offsetting addresses when computing arena indexes so that arena entry 0 corresponds to address -2^47 on amd64. We already map enough arena space for 2^48 heap addresses on 64-bit (because arm64's virtual address space is [0, 2^48)), so we don't need to grow any structures to support this. A different approach would be to simply mask out the top 16 bits. However, there are two advantages to the offset approach: 1) invalid heap addresses continue to naturally map to invalid arena indexes so we don't need extra checks and 2) it perturbs the mapping of addresses to arena indexes more, which helps check that we don't accidentally compute incorrect arena indexes somewhere that happen to be right most of the time. Several comments and constant names are now somewhat misleading. We'll fix that in the next CL. This CL is the core change to the arena indexing. Fixes #23862. Change-Id: Idb8e299fded04593a286b01a9582da6ddbac2f9a Reviewed-on: https://go-review.googlesource.com/95497 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-21 20:32:35 +00:00
Austin Clements	e9db7b9dd1	runtime: abstract indexing of arena index Accessing the arena index is about to get slightly more complicated. Abstract this away into a set of functions for going back and forth between addresses and arena slice indexes. For #23862. Change-Id: I0b20e74ef47a07b78ed0cf0a6128afe6f6e40f4b Reviewed-on: https://go-review.googlesource.com/95496 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-21 20:32:34 +00:00
Austin Clements	3e214e5693	runtime: simplify bulkBarrierPreWrite Currently, bulkBarrierPreWrite uses inheap to decide whether the destination is in the heap or whether to check for stack or global data. However, this isn't the best question to ask. Instead, get the span directly and query its state. This lets us directly determine whether this might be a global, or is stack memory, or is heap memory. At this point, inheap is no longer used in the hot path, so drop it from the must-be-inlined list and substitute spanOf. This will help in a circuitous way with #23862, since fixing that is going to push inheap very slightly over the inline-able threshold on a few platforms. Change-Id: I5360fc1181183598502409f12979899e1e4d45f7 Reviewed-on: https://go-review.googlesource.com/95495 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-21 20:32:33 +00:00
Austin Clements	c823155828	runtime: ensure sysStat for mheap_.arenas is aligned We don't want to account the memory for mheap_.arenas because most of it is never touched, so currently we pass the address of a uint64 on the heap. However, at least on mips, it's possible for this uint64 to be unaligned, which causes the atomic add in mSysStatInc to crash. Fix this by instead passing a nil stat pointer. Fixes #23946. Change-Id: I091587df1b3066c330b6bb4d834e4596c407910f Reviewed-on: https://go-review.googlesource.com/95695 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-02-21 03:27:07 +00:00
Martin Möhrmann	8999b1d6c9	runtime: shorten reflect.unsafe_New call chain reflect.unsafe_New is an often called function according to profiling in a large production environment. Since newobject is not inlined currently there is call overhead that can be avoided by calling mallocgc directly. name old time/op new time/op delta New 32.4ns ± 2% 29.8ns ± 1% -8.03% (p=0.000 n=19+20) Change-Id: I572e4be830ed8e5c0da555dc3a8864c8363112be Reviewed-on: https://go-review.googlesource.com/95015 Reviewed-by: Austin Clements <austin@google.com>	2018-02-21 00:31:21 +00:00
Ryuma Yoshida	8fc25b531b	all: remove duplicate word "the" Change-Id: Ia5908e94a6bd362099ca3c63f6ffb7e94457131d GitHub-Last-Rev: `545a40571a` GitHub-Pull-Request: golang/go#23942 Reviewed-on: https://go-review.googlesource.com/95435 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-02-20 16:45:55 +00:00
Martin Möhrmann	dfb0e4f6c7	runtime: avoid clearing memory during byte slice allocation in gobytes Avoid using make in gobytes which clears the byte slice backing array unnecessarily since the content is overwritten immediately again. Check that the user provided length is positive and below the maximum allowed allocation size explicitly in gobytes as this was done in makeslice before this change. Fixes #23634 Change-Id: Id852619e932aabfc468871c42ad07d34da91f45c Reviewed-on: https://go-review.googlesource.com/94760 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-02-19 05:58:51 +00:00
Kunpei Sakai	f356e83e2e	all: remove "the" duplications Change-Id: I1f25b11fb9b7cd3c09968ed99913dc85db2025ef Reviewed-on: https://go-review.googlesource.com/94976 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-02-18 17:54:20 +00:00
Tobias Klauser	1b1c8b34d1	runtime: remove unused getrlimit function Follow CL 93655 which removed the (commented-out) usage of this function. Also remove unused constant _RLIMIT_AS and type rlimit. Change-Id: Ifb6e6b2104f4c2555269f8ced72bfcae24f5d5e9 Reviewed-on: https://go-review.googlesource.com/94775 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-02-17 18:35:41 +00:00
Martin Möhrmann	d58593d8aa	runtime: move map fast functions into type specific files Overall code is unchanged. The functions for different types (32, 64, str) of map fast routines are collected in map_fast.go that has grown to ~1300 lines. Moving the functions for each map fast type into a separate file allows for an easier overview and navigation within the map code. Change-Id: Ic09e4212f9025a66a10b11ef8dac23ad49d1d5ae Reviewed-on: https://go-review.googlesource.com/90335 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-02-17 15:32:26 +00:00
Martin Möhrmann	f4bb25c937	runtime: rename map implementation and test files to use a common prefix Rename all map implementation and test files to use "map" as a file name prefix instead of "hashmap" for the implementation and "map" for the test file names. Change-Id: I7b317c1f7a660b95c6d1f1a185866f2839e69446 Reviewed-on: https://go-review.googlesource.com/90336 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-02-17 14:57:32 +00:00
Richard Miller	b1dbce31d7	runtime: don't ignore address hint for sysReserve in Plan 9 On Plan 9, sysReserve was ignoring the address hint and allocating memory wherever it is available. This causes the new TestArenaCollision test to fail on 32-bit Plan 9. We now use the address hint in the specific case where sysReserve is extending the process address space at its end, and similarly we contract the address space in the case where sysFree is releasing memory at the end. Fixes #23860 Change-Id: Ia5254779ba8f1698c999832720a88de400b5f91a Reviewed-on: https://go-review.googlesource.com/94776 Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: David du Colombier <0intro@gmail.com>	2018-02-16 16:50:14 +00:00
Elias Naur	ba99433d33	runtime: only run TestArenaCollision if the target can exec Replace the test for nacl with testenv.MustHaveExec to also skip test on iOS. Change-Id: I6822714f6d71533d1b18bbb7894f6ad339d8aea1 Reviewed-on: https://go-review.googlesource.com/94755 Run-TryBot: Elias Naur <elias.naur@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-02-16 15:33:42 +00:00
Austin Clements	d7691d055a	runtime: replace _MaxMem with maxAlloc Now that we have memLimit, also having _MaxMem is a bit confusing. Replace it with maxAlloc, which better conveys what it limits. We also define maxAlloc slightly differently: since it's now clear that it limits allocation size, we can account for a subtle difference between 32-bit and 64-bit. Change-Id: Iac39048018cc0dae7f0919e25185fee4b3eed529 Reviewed-on: https://go-review.googlesource.com/85890 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:26 +00:00
Austin Clements	90666b8a3d	runtime: move comment about address space sizes to malloc.go Currently there's a detailed comment in lfstack_64bit.go about address space limitations on various architectures. Since that's now relevant to malloc, move it to a more prominent place in the documentation for memLimitBits. Updates #10460. Change-Id: If9708291cf3a288057b8b3ba0ba6a59e3602bbd6 Reviewed-on: https://go-review.googlesource.com/85889 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:25 +00:00
Austin Clements	51ae88ee2f	runtime: remove non-reserved heap logic Currently large sysReserve calls on some OSes don't actually reserve the memory, but just check that it can be reserved. This was important when we called sysReserve to "reserve" many gigabytes for the heap up front, but now that we map memory in small increments as we need it, this complication is no longer necessary. This has one curious side benefit: currently, on Linux, allocations that are large enough to be rejected by mmap wind up freezing the application for a long time before it panics. This happens because sysReserve doesn't reserve the memory, so sysMap calls mmap_fixed, which calls mmap, which fails because the mapping is too large. However, mmap_fixed doesn't inspect why mmap fails, so it falls back to probing every page in the desired region individually with mincore before performing an (otherwise dangerous) MAP_FIXED mapping, which will also fail. This takes a long time for a large region. Now this logic is gone, so the mmap failure leads to an immediate panic. Updates #10460. Change-Id: I8efe88c611871cdb14f99fadd09db83e0161ca2e Reviewed-on: https://go-review.googlesource.com/85888 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:24 +00:00
Austin Clements	2b415549b8	runtime: use sparse mappings for the heap This replaces the contiguous heap arena mapping with a potentially sparse mapping that can support heap mappings anywhere in the address space. This has several advantages over the current approach: * There is no longer any limit on the size of the Go heap. (Currently it's limited to 512GB.) Hence, this fixes #10460. * It eliminates many failure modes of heap initialization and growing. In particular it eliminates any possibility of panicking with an address space conflict. This can happen for many reasons and even causes a low but steady rate of TSAN test failures because of conflicts with the TSAN runtime. See #16936 and #11993. * It eliminates the notion of "non-reserved" heap, which was added because creating huge address space reservations (particularly on 64-bit) led to huge process VSIZE. This was at best confusing and at worst conflicted badly with ulimit -v. However, the non-reserved heap logic is complicated, can race with other mappings in non-pure Go binaries (e.g., #18976), and requires that the entire heap be either reserved or non-reserved. We currently maintain the latter property, but it's quite difficult to convince yourself of that, and hence difficult to keep correct. This logic is still present, but will be removed in the next CL. * It fixes problems on 32-bit where skipping over parts of the address space leads to mapping huge (and never-to-be-used) metadata structures. See #19831. This also completely rewrites and significantly simplifies mheap.sysAlloc, which has been a source of many bugs. E.g., #21044, #20259, #18651, and #13143 (and maybe #23222). This change also makes it possible to allocate individual objects larger than 512GB. As a result, a few tests that expected huge allocations to fail needed to be changed to make even larger allocations. 
However, at the moment attempting to allocate a humongous object may cause the program to freeze for several minutes on Linux as we fall back to probing every page with addrspace_free. That logic (and this failure mode) will be removed in the next CL. Fixes #10460. Fixes #22204 (since it rewrites the code involved). This slightly slows down compilebench and the x/benchmarks garbage benchmark. name old time/op new time/op delta Template 184ms ± 1% 185ms ± 1% ~ (p=0.065 n=10+9) Unicode 86.9ms ± 3% 86.3ms ± 1% ~ (p=0.631 n=10+10) GoTypes 599ms ± 0% 602ms ± 0% +0.56% (p=0.000 n=10+9) Compiler 2.87s ± 1% 2.89s ± 1% +0.51% (p=0.002 n=9+10) SSA 7.29s ± 1% 7.25s ± 1% ~ (p=0.182 n=10+9) Flate 118ms ± 2% 118ms ± 1% ~ (p=0.113 n=9+9) GoParser 147ms ± 1% 148ms ± 1% +1.07% (p=0.003 n=9+10) Reflect 401ms ± 1% 404ms ± 1% +0.71% (p=0.003 n=10+9) Tar 175ms ± 1% 175ms ± 1% ~ (p=0.604 n=9+10) XML 209ms ± 1% 210ms ± 1% ~ (p=0.052 n=10+10) (https://perf.golang.org/search?q=upload:20171231.4) name old time/op new time/op delta Garbage/benchmem-MB=64-12 2.23ms ± 1% 2.25ms ± 1% +0.84% (p=0.000 n=19+19) (https://perf.golang.org/search?q=upload:20171231.3) Relative to the start of the sparse heap changes (starting at and including "runtime: fix various contiguous bitmap assumptions"), overall slowdown is roughly 1% on GC-intensive benchmarks: name old time/op new time/op delta Template 183ms ± 1% 185ms ± 1% +1.32% (p=0.000 n=9+9) Unicode 84.9ms ± 2% 86.3ms ± 1% +1.65% (p=0.000 n=9+10) GoTypes 595ms ± 1% 602ms ± 0% +1.19% (p=0.000 n=9+9) Compiler 2.86s ± 0% 2.89s ± 1% +0.91% (p=0.000 n=9+10) SSA 7.19s ± 0% 7.25s ± 1% +0.75% (p=0.000 n=8+9) Flate 117ms ± 1% 118ms ± 1% +1.10% (p=0.000 n=10+9) GoParser 146ms ± 2% 148ms ± 1% +1.48% (p=0.002 n=10+10) Reflect 398ms ± 1% 404ms ± 1% +1.51% (p=0.000 n=10+9) Tar 173ms ± 1% 175ms ± 1% +1.17% (p=0.000 n=10+10) XML 208ms ± 1% 210ms ± 1% +0.62% (p=0.011 n=10+10) [Geo mean] 369ms 373ms +1.17% (https://perf.golang.org/search?q=upload:20180101.2) 
name old time/op new time/op delta Garbage/benchmem-MB=64-12 2.22ms ± 1% 2.25ms ± 1% +1.51% (p=0.000 n=20+19) (https://perf.golang.org/search?q=upload:20180101.3) Change-Id: I5daf4cfec24b252e5a57001f0a6c03f22479d0f0 Reviewed-on: https://go-review.googlesource.com/85887 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:23 +00:00
Austin Clements	45ffeab549	runtime: eliminate most uses of mheap_.arena_* This replaces all uses of the mheap_.arena_* fields outside of mallocinit and sysAlloc. These fields fundamentally assume a contiguous heap between two bounds, so eliminating these is necessary for a sparse heap. Many of these are replaced with checks for non-nil spans at the test address (which in turn checks for a non-nil entry in the heap arena array). Some of them are just for debugging and somewhat meaningless with a sparse heap, so those we just delete. Updates #10460. Change-Id: I8345b95ffc610aed694f08f74633b3c63506a41f Reviewed-on: https://go-review.googlesource.com/85886 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:22 +00:00
Austin Clements	d6e8218581	runtime: make span map sparse This splits the span map into separate chunks for every 64MB of the heap. The span map chunks now live in the same indirect structure as the bitmap. Updates #10460. This causes a slight improvement in compilebench and the x/benchmarks garbage benchmark. I'm not sure why it improves performance. name old time/op new time/op delta Template 185ms ± 1% 184ms ± 1% ~ (p=0.315 n=9+10) Unicode 86.9ms ± 1% 86.9ms ± 3% ~ (p=0.356 n=9+10) GoTypes 602ms ± 1% 599ms ± 0% -0.59% (p=0.002 n=9+10) Compiler 2.89s ± 0% 2.87s ± 1% -0.50% (p=0.003 n=9+9) SSA 7.25s ± 0% 7.29s ± 1% ~ (p=0.400 n=9+10) Flate 118ms ± 1% 118ms ± 2% ~ (p=0.065 n=10+9) GoParser 147ms ± 2% 147ms ± 1% ~ (p=0.549 n=10+9) Reflect 403ms ± 1% 401ms ± 1% -0.47% (p=0.035 n=9+10) Tar 176ms ± 1% 175ms ± 1% -0.59% (p=0.013 n=10+9) XML 211ms ± 1% 209ms ± 1% -0.83% (p=0.011 n=10+10) (https://perf.golang.org/search?q=upload:20171231.1) name old time/op new time/op delta Garbage/benchmem-MB=64-12 2.24ms ± 1% 2.23ms ± 1% -0.36% (p=0.001 n=20+19) (https://perf.golang.org/search?q=upload:20171231.2) Change-Id: I2563f8704ab9812434947faf293c5327f9b0d07a Reviewed-on: https://go-review.googlesource.com/85885 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:20 +00:00
Austin Clements	0de5324d61	runtime: abstract remaining mheap.spans access This abstracts the remaining direct accesses to mheap.spans into new mheap.setSpan and mheap.setSpans methods. For #10460. Change-Id: Id1db8bc5e34a77a9221032aa2e62d05322707364 Reviewed-on: https://go-review.googlesource.com/85884 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:19 +00:00
Austin Clements	c0392d2e7f	runtime: make the heap bitmap sparse This splits the heap bitmap into separate chunks for every 64MB of the heap and introduces an index mapping from virtual address to metadata. It modifies the heapBits abstraction to use this two-level structure. Finally, it modifies heapBitsSetType to unroll the bitmap into the object itself and then copy it out if the bitmap would span discontiguous bitmap chunks. This is a step toward supporting general sparse heaps, which will eliminate address space conflict failures as well as the limit on the heap size. It's also advantageous for 32-bit. 32-bit already supports discontiguous heaps by always starting the arena at address 0. However, as a result, with a contiguous bitmap, if the kernel chooses a high address (near 2GB) for a heap mapping, the runtime is forced to map up to 128MB of heap bitmap. Now the runtime can map sections of the bitmap for just the parts of the address space used by the heap. Updates #10460. This slightly slows down the x/garbage and compilebench benchmarks. However, I think the slowdown is acceptably small. 
name old time/op new time/op delta Template 178ms ± 1% 180ms ± 1% +0.78% (p=0.029 n=10+10) Unicode 85.7ms ± 2% 86.5ms ± 2% ~ (p=0.089 n=10+10) GoTypes 594ms ± 0% 599ms ± 1% +0.70% (p=0.000 n=9+9) Compiler 2.86s ± 0% 2.87s ± 0% +0.40% (p=0.001 n=9+9) SSA 7.23s ± 2% 7.29s ± 2% +0.94% (p=0.029 n=10+10) Flate 116ms ± 1% 117ms ± 1% +0.99% (p=0.000 n=9+9) GoParser 146ms ± 1% 146ms ± 0% ~ (p=0.193 n=10+7) Reflect 399ms ± 0% 403ms ± 1% +0.89% (p=0.001 n=10+10) Tar 173ms ± 1% 174ms ± 1% +0.91% (p=0.013 n=10+9) XML 208ms ± 1% 210ms ± 1% +0.93% (p=0.000 n=10+10) [Geo mean] 368ms 371ms +0.79% name old time/op new time/op delta Garbage/benchmem-MB=64-12 2.17ms ± 1% 2.21ms ± 1% +2.15% (p=0.000 n=20+20) Change-Id: I037fd283221976f4f61249119d6b97b100bcbc66 Reviewed-on: https://go-review.googlesource.com/85883 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:18 +00:00
Austin Clements	f61057c497	runtime: fix various contiguous bitmap assumptions There are various places that assume the heap bitmap is contiguous and scan it sequentially. We're about to split up the heap bitmap. This commit modifies all of these except heapBitsSetType to use the heapBits abstractions so they can transparently switch to a discontiguous bitmap. Updates #10460. This is a step toward supporting sparse heaps. Change-Id: I2f3994a5785e4dccb66602fb3950bbd290d9392c Reviewed-on: https://go-review.googlesource.com/85882 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:17 +00:00
Austin Clements	29e9c4d4a4	runtime: lay out heap bitmap forward in memory Currently the heap bitmap is laid out in reverse order in memory relative to the heap itself. This was originally done out of "excessive cleverness" so that computing a bitmap pointer could load only the arena_start field and so that heaps could be more contiguous by growing the arena and the bitmap out from a common center point. However, this appears to have no actual performance benefit, it complicates nearly every use of the bitmap, and it makes already confusing code more confusing. Furthermore, it's still possible to use a single field (the new bitmap_delta) for the bitmap pointer computation by employing slightly different excessive cleverness. Hence, this CL puts the bitmap into forward order. This is a (very) updated version of CL 9404. Change-Id: I743587cc626c4ecd81e660658bad85b54584108c Reviewed-on: https://go-review.googlesource.com/85881 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:16 +00:00
Austin Clements	4de468621a	runtime: use spanOf* more widely The logic in the spanOf* functions is open-coded in a lot of places right now. Replace these with calls to the spanOf* functions. Change-Id: I3cc996aceb9a529b60fea7ec6fef22008c012978 Reviewed-on: https://go-review.googlesource.com/85880 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:15 +00:00
Austin Clements	a90f9a00ca	runtime: consolidate mheap.lookup* and spanOf* I think we'd forgotten about the mheap.lookup APIs when we introduced spanOf, but, at any rate, the spanOf functions are used far more widely at this point, so this CL eliminates the mheap.lookup* functions in favor of spanOf*. Change-Id: I15facd0856e238bb75d990e838a092b5bef5bdfc Reviewed-on: https://go-review.googlesource.com/85879 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:14 +00:00
Austin Clements	058bb7ea27	runtime: split object finding out of heapBitsForObject heapBitsForObject does two things: it finds the base of the object and it creates the heapBits for the base of the object. There are several places where we just care about the base of the object. Furthermore, greyobject only needs the heapBits in the checkmark path and can easily compute them only when needed. Once we eliminate passing the heap bits to greyobject, almost all uses of heapBitsForObject don't need the heap bits. Hence, this splits heapBitsForObject into findObject and heapBitsForAddr (the latter already exists), removes the hbits argument to greyobject, and replaces all heapBitsForObject calls with calls to findObject. In addition to making things cleaner overall, heapBitsForAddr is going to get more expensive shortly, so it's important that we don't do it needlessly. Note that there's an interesting performance pitfall here. I had originally moved findObject to mheap.go, since it made more sense there. However, that leads to a ~2% slowdown and a whopping 11% increase in L1 icache misses on both the x/garbage and compilebench benchmarks. This suggests we may want to be more principled about this, but, for now, let's just leave findObject in mbitmap.go. (I tried to make findObject small enough to inline by splitting out the error case, but, sadly, wasn't quite able to get it under the inlining budget.) Change-Id: I7bcb92f383ade565d22a9f2494e4c66fd513fb10 Reviewed-on: https://go-review.googlesource.com/85878 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:13 +00:00
Austin Clements	41e6abdc61	runtime: replace mlookup and findObject with heapBitsForObject These functions all serve essentially the same purpose. mlookup is used in only one place and findObject in only three. Use heapBitsForObject instead, which is the most optimized implementation. (This may seem slightly silly because none of these uses care about the heap bits, but we're about to split up the functionality of heapBitsForObject anyway. At that point, findObject will rise from the ashes.) Change-Id: I906468c972be095dd23cf2404a7d4434e802f250 Reviewed-on: https://go-review.googlesource.com/85877 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:12 +00:00
Austin Clements	b1d94c118f	runtime: validate lfnode addresses Change-Id: Ic8c506289caaf6218494e5150d10002e0232feaa Reviewed-on: https://go-review.googlesource.com/85876 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:11 +00:00
Austin Clements	981d0495b7	runtime: expand/update lfstack address space assumptions I was spelunking Linux's address space code and found that some of the information about maximum virtual addresses in lfstack's comments was out of date. This expands and updates the comment. Change-Id: I9f54b23e6b266b3c5cc20259a849231fb751f6e7 Reviewed-on: https://go-review.googlesource.com/85875 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:09 +00:00
Hana Kim	1ae22d8cfe	internal/trace: link user span start and end events Also add testdata for version 1.11 including UserTaskSpan test trace. Change-Id: I673fb29bb3aee96a14fadc0ab860d4f5832143f5 Reviewed-on: https://go-review.googlesource.com/93795 Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-02-15 19:33:20 +00:00
Hana Kim	6977a3b257	runtime/trace: implement annotation API This implements the annotation API proposed in golang.org/cl/63274. traceString is updated to protect the string map with trace.stringsLock because the assumption that traceString is called by a single goroutine (either at the beginning of tracing or at the end of tracing when dumping all the symbols and function names) is no longer true. traceString is used by the annotation APIs (NewContext, StartSpan, Log) to register frequently appearing strings (task and span names, and log keys) after this change. NewContext -> one or two records (EvString, EvUserTaskCreate) end function -> one record (EvUserTaskEnd) StartSpan -> one or two records (EvString, EvUserSpan) span end function -> one or two records (EvString, EvUserSpan) Log -> one or two records (EvString, EvUserLog) EvUserLog record is of the typical record format written by traceEvent except that it is followed by bytes that represent the value string. In addition to the runtime/trace change, this change includes corresponding changes in internal/trace to parse the new record types. Future work to improve efficiency: More efficient unique task id generation instead of atomic. (per-P counter). Instead of a centralized trace.stringsLock, consider using per-P string cache or something more efficient. R=go1.11 Change-Id: Iec9276c6c51e5be441ccd52dec270f1e3b153970 Reviewed-on: https://go-review.googlesource.com/71690 Reviewed-by: Austin Clements <austin@google.com>	2018-02-15 18:54:14 +00:00
Hana Kim	32d1cd33c7	runtime/trace: user annotation API This CL presents the proposed user annotation API skeleton. This CL bumps up the trace version to 1.11. Design doc https://goo.gl/iqJfJ3 Implementation CLs will follow. The API introduces three basic building blocks: Log, Span, and Task. Log is for basic logging. When called, the message will be recorded to the trace along with timestamp, goroutine id, and stack info. trace.Log(ctx, messageType, message) Span can be thought of as an extension of log to record an interesting time interval during a goroutine's execution. A span is local to a goroutine by definition. trace.WithSpan(ctx, "doVeryExpensiveOp", func(ctx context.Context) { /* do something very expensive */ }) Task is a higher-level concept that aids tracing of complex operations that encompass multiple goroutines or are asynchronous. For example, an RPC request, an HTTP request, a file write, or a batch job can be traced with a Task. Note we chose to design the API around context.Context so it allows easier integration with other tracing tools, often designed around context.Context as well. Log and WithSpan APIs recognize the task information embedded in the context and record it in the trace as well. That allows the Go execution tracer to associate and group the spans and log messages based on the task information. In order to create a Task, ctx, end := trace.NewContext(ctx, "myTask") defer end() The Go execution tracer measures the time between the task created and the task ended for the task latency. More discussion history in golang.org/cl/59572. Update #16619 R=go1.11 Change-Id: I59a937048294dafd23a75cf1723c6db461b193cd Reviewed-on: https://go-review.googlesource.com/63274 Reviewed-by: Austin Clements <austin@google.com>	2018-02-15 18:52:43 +00:00
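The three building blocks from the commit above can be tried with the runtime/trace package as it was eventually released in Go 1.11. Note that the released names differ from this proposal: NewContext became `trace.NewTask` (with `Task.End`), and WithSpan became `trace.WithRegion`. The sketch below uses the released names; `handleRequest`, `demo`, and the string labels are hypothetical.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"runtime/trace"
)

// handleRequest annotates one unit of work with a task, a log message,
// and a region (the released name for what the proposal calls a span).
func handleRequest(ctx context.Context, id string) int {
	ctx, task := trace.NewTask(ctx, "handleRequest")
	defer task.End()

	// Log attaches a key/value message to the task carried by ctx.
	trace.Log(ctx, "requestID", id)

	sum := 0
	// WithRegion records the time interval spent in the function literal.
	trace.WithRegion(ctx, "sumLoop", func() {
		for i := 1; i <= 100; i++ {
			sum += i
		}
	})
	return sum
}

// demo runs handleRequest with a background context.
func demo() int {
	return handleRequest(context.Background(), "req-1")
}

func main() {
	var buf bytes.Buffer
	if err := trace.Start(&buf); err != nil {
		fmt.Println("trace not started:", err)
		return
	}
	result := demo()
	trace.Stop()
	fmt.Println("result:", result, "trace bytes:", buf.Len())
}
```

The annotation calls are safe to leave in place when tracing is off; they only emit events while a trace is being collected.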
Tobias Klauser	afb9fc1de9	runtime: move ELF structure definitions into own files Move the ELF32 and ELF64 structure definitions into their own files so they can be reused when vDSO support is added for other architectures. Change-Id: Id0171b4e5cea4add8635743c881e3bf3469597af Reviewed-on: https://go-review.googlesource.com/93995 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-02-15 16:15:19 +00:00
Josh Bleecher Snyder	605c9feeb1	runtime: speed up stack copying a little Remove a branch and a stack spill.
name                old time/op  new time/op  delta
StackCopy-8         79.2ms ± 1%  79.1ms ± 2%  ~       (p=0.063 n=96+95)
StackCopyNoCache-8  121ms ± 1%   120ms ± 2%   -0.46%  (p=0.000 n=97+88)
Change-Id: Ifcbbb05d773178fad84cb11a9a6768ace69fcf24 Reviewed-on: https://go-review.googlesource.com/94029 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-02-15 15:06:34 +00:00
Josh Bleecher Snyder	910d232a28	runtime: simplify amd64 memmove of 3/4 bytes Change-Id: I132d3627ae301b68bf87eacb5bf41fd1ba2dcd91 Reviewed-on: https://go-review.googlesource.com/94025 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-02-15 15:05:53 +00:00
Josh Bleecher Snyder	8e0b814a3a	runtime: fix minor doc typos in amd64 memmove Change-Id: Ic1ce2f93d6a225699e9ce5307d62cdda8f97630d Reviewed-on: https://go-review.googlesource.com/94024 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-02-15 15:05:34 +00:00
Josh Bleecher Snyder	3658299f44	runtime: short-circuit typedslicecopy when dstp == srcp If copying from a slice to itself, skip the write barriers and actual memory copies. This happens in practice in code like this snippet from the trim pass in the compiler, when k ends up being 0: copy(s.Values[k:], s.Values[:m]) Change-Id: Ie6924acfd56151f874d87f1d7f1f74320b4c4f10 Reviewed-on: https://go-review.googlesource.com/94023 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-02-15 15:05:15 +00:00
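The self-copy case from the commit above is easy to reproduce in user code. In this sketch, `shift` mirrors the quoted `copy(s.Values[k:], s.Values[:m])` call, and `sameStart` checks the condition the runtime can short-circuit on (destination and source beginning at the same element). Both function names are hypothetical.

```go
package main

import "fmt"

// shift copies vals[:m] to the position starting at index k and
// returns the number of elements copied.
func shift(vals []int, k, m int) int {
	return copy(vals[k:], vals[:m])
}

// sameStart reports whether vals[k:] and vals[:m] begin at the same
// element, which only happens when k is 0. It requires k < len(vals).
func sameStart(vals []int, k int) bool {
	return &vals[k] == &vals[0]
}

func main() {
	vals := []int{1, 2, 3, 4, 5}

	// k == 0: source and destination coincide, so nothing changes.
	fmt.Println(sameStart(vals, 0), shift(vals, 0, 3), vals)

	// k == 2: a real overlapping copy that moves data.
	fmt.Println(sameStart(vals, 2), shift(vals, 2, 3), vals)
}
```

The result is the same with or without the short-circuit; the change only avoids write barriers and memory moves that would have no effect.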
Martin Möhrmann	bf9f1c1503	runtime: use new instead of newobject to create hmap in makemap The runtime.hmap type is known at compile time. Using new(hmap) avoids loading the hmap type from the maptype supplied as an argument to makemap which is only known at runtime. This change makes makemap consistent with makemap_small by using new(hmap) instead of newobject in both functions. Change-Id: Ia47acfda527e8a71d15a1a7a4c2b54fb923515eb Reviewed-on: https://go-review.googlesource.com/91775 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-02-15 08:57:26 +00:00
Martin Möhrmann	530927e08a	runtime: improve test file naming The runtime builtin functions that are tested in append_test.go are defined in slice.go. Renaming the test file to slice_test.go makes this relation explicit with a common file name prefix. Change-Id: I2f89ec23a6077fe6b80d2161efc760df828c8cd4 Reviewed-on: https://go-review.googlesource.com/90655 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-02-15 08:56:58 +00:00
Ian Lance Taylor	07751f4b58	runtime: use private futexes on Linux By default futexes are permitted in shared memory regions, which requires the kernel to translate the memory address. Since our futexes are never in shared memory, set FUTEX_PRIVATE_FLAG, which makes futex operations slightly more efficient. Change-Id: I2a82365ed27d5cd8d53c5382ebaca1a720a80952 Reviewed-on: https://go-review.googlesource.com/80144 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org>	2018-02-14 17:37:26 +00:00
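How the private flag in the commit above combines with a futex operation can be shown with the constants alone. The values below come from the Linux futex ABI (linux/futex.h); the sketch only computes operation codes and makes no system call, and `privateOp` is a hypothetical helper name.

```go
package main

import "fmt"

// Operation codes and flag from the Linux futex ABI (linux/futex.h).
const (
	futexWait        = 0
	futexWake        = 1
	futexPrivateFlag = 128
)

// privateOp marks a futex operation as process-private, telling the
// kernel the futex word is never in memory shared with another process.
func privateOp(op int) int {
	return op | futexPrivateFlag
}

func main() {
	fmt.Println(privateOp(futexWait), privateOp(futexWake)) // prints "128 129"
}
```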
Cherry Zhang	5a43a271e8	cmd/compile: CALLudiv on nacl/arm doesn't clobber R12 On nacl/arm, R12 is clobbered by the RET instruction in a function that has a frame. runtime.udiv doesn't have a frame, so it does not clobber R12. Change-Id: I0de448749f615908f6659e92d201ba3eb2f8266d Reviewed-on: https://go-review.googlesource.com/93116 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-02-14 17:09:15 +00:00
Cherry Zhang	633b38c5d2	runtime/internal/atomic: add early nil check on ARM If nil, fault before taking the lock or calling into the kernel. Change-Id: I013d78a5f9233c2a9197660025f679940655d384 Reviewed-on: https://go-review.googlesource.com/93636 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-02-14 17:09:05 +00:00
Cherry Zhang	97124af99a	runtime/internal/atomic: unify sys_*_arm.s on non-linux Updates #23778. Change-Id: I80e57a15b6e3bbc2e25ea186399ff0e360fc5c21 Reviewed-on: https://go-review.googlesource.com/93635 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-02-14 17:08:58 +00:00
David Crawshaw	b03f1d1a7e	runtime: remove extraneous stackPreempt setting The stackguard is set to stackPreempt earlier in reentersyscall, and as it comes with throwsplit = true there's no way for the stackguard to be set to anything else by the end of reentersyscall. Change-Id: I4e942005b22ac784c52398c74093ac887fc8ec24 Reviewed-on: https://go-review.googlesource.com/65673 Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-02-14 15:27:11 +00:00
Tobias Klauser	0e1bcfc638	runtime: add symbol for AT_FDCWD on Linux amd64 and mips64x Also order the syscall number list numerically for mips64x. Follow-up for CL 92895. Change-Id: I5f01f8c626132a06160997fce8a2aef0c486bb1c Reviewed-on: https://go-review.googlesource.com/93616 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-02-14 08:48:44 +00:00
David du Colombier	5114a7daa2	runtime/trace: fix TestTraceSymbolize when GOMAXPROCS=1 CL 92916 added the GOMAXPROCS test in TestTraceSymbolize. This test only succeeds when the value of GOMAXPROCS changes. Since the test calls runtime.GOMAXPROCS(1), it will fail on machines where GOMAXPROCS=1. This change fixes the test by calling runtime.GOMAXPROCS(oldGoMaxProcs+1). Fixes #23816. Change-Id: I1183dbbd7db6077cbd7fa0754032ff32793b2195 Reviewed-on: https://go-review.googlesource.com/93735 Run-TryBot: David du Colombier <0intro@gmail.com> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-02-13 22:55:49 +00:00
Austin Clements	8693b4f095	runtime: remove unused memlimit function Change-Id: Id057dcc85d64e5c670710fbab6cacd4b906cf594 Reviewed-on: https://go-review.googlesource.com/93655 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-02-13 22:35:47 +00:00
Austin Clements	ddb503be96	runtime: avoid bad unwinding from sigpanic in C code Currently, if a sigpanic call is injected into C code, it's possible for preparePanic to leave the stack in a state where traceback can't unwind correctly past the sigpanic. Specifically, shouldPushPanic sniffs the stack to decide where to put the PC from the signal context. In the cgo case, it will find that !findfunc(pc).valid() because pc is in C code, and then it will check if the top of the stack looks like a Go PC. However, this stack slot is just in a C frame, so it could be uninitialized and contain anything, including what looks like a valid Go PC. For example, in https://build.golang.org/log/c601a18e2af24794e6c0899e05dddbb08caefc17, it sees 1c02c23a <runtime.newproc1+682>. When this condition is met, it skips putting the signal PC on the stack at all. As a result, when we later unwind from the sigpanic, we'll "successfully" but incorrectly unwind to whatever PC was in this uninitialized slot and go who knows where from there. Fix this by making shouldPushPanic assume that the signal PC is always usable if we're running C code, so we always make it appear like sigpanic's caller. This lets us be pickier again about unexpected return PCs in gentraceback. Updates #23640. Change-Id: I1e8ade24b031bd905d48e92d5e60c982e8edf160 Reviewed-on: https://go-review.googlesource.com/91137 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-02-13 21:01:26 +00:00
Austin Clements	615d44c287	runtime: refactor test for pushing sigpanic frame This logic is duplicated in all of the preparePanic functions. Pull it out into one architecture-independent function. Change-Id: I7ef4e78e3eda0b7be1a480fb5245fc7424fb2b4e Reviewed-on: https://go-review.googlesource.com/91255 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-02-13 21:01:25 +00:00
Hana Kim	dc3bef3635	runtime/gdb: use goroutine atomicstatus to determine the state Previously find_goroutine determined whether a goroutine is stopped by checking the sched.sp field. This heuristic doesn't always hold and causes find_goroutine to return bogus pc/sp info for running goroutines. This change uses the atomicstatus bit to determine the state, which is more accurate. R=go1.11 Change-Id: I537d432d9e0363257120a196ce2ba52da2970f59 Reviewed-on: https://go-review.googlesource.com/49691 Reviewed-by: Austin Clements <austin@google.com>	2018-02-13 19:23:37 +00:00
Hana Kim	ef175731ff	runtime: remove hardcoded runtime consts from gdb script Instead evaluate and read the runtime internal constants defined in runtime2.go R=go1.11 Change-Id: If2f4b87e5b3f62f0c0ff1e86a90db8e37a78abb6 Reviewed-on: https://go-review.googlesource.com/87877 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> Reviewed-by: Austin Clements <austin@google.com>	2018-02-13 19:23:21 +00:00
Hana Kim	ebd04885c8	runtime/trace: add stack tests for GOMAXPROCS and reorganize test log messages for stack dumps for easier debugging. The error log will be formatted like the following: trace_stack_test.go:282: Did not match event GoCreate with stack runtime/trace_test.TestTraceSymbolize :39 testing.tRunner :0 Seen 30 events of the type Offset 1890 runtime/trace_test.TestTraceSymbolize /go/src/runtime/trace/trace_stack_test.go:30 testing.tRunner /go/src/testing/testing.go:777 Offset 1899 runtime/trace_test.TestTraceSymbolize /go/src/runtime/trace/trace_stack_test.go:30 testing.tRunner /go/src/testing/testing.go:777 ... Change-Id: I0468de04507d6ae38ba84d99d13f7bf592e8d115 Reviewed-on: https://go-review.googlesource.com/92916 Reviewed-by: Heschi Kreinick <heschi@google.com> Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>	2018-02-13 18:45:32 +00:00
Austin Clements	2010189407	runtime: remove legacy eager write barrier Now that the buffered write barrier is implemented for all architectures, we can remove the old eager write barrier implementation. This CL removes the implementation from the runtime, support in the compiler for calling it, and updates some compiler tests that relied on the old eager barrier support. It also makes sure that all of the useful comments from the old write barrier implementation still have a place to live. Fixes #22460. Updates #21640 since this fixes the layering concerns of the write barrier (but not the other things in that issue). Change-Id: I580f93c152e89607e0a72fe43370237ba97bae74 Reviewed-on: https://go-review.googlesource.com/92705 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-02-13 16:34:46 +00:00
Austin Clements	245310883d	runtime: eliminate all writebarrierptr* calls Calls to writebarrierptr can simply be actual pointer writes. Calls to writebarrierptr_prewrite need to go through the write barrier buffer. Updates #22460. Change-Id: I92cee4da98c5baa499f1977563757c76f95bf0ca Reviewed-on: https://go-review.googlesource.com/92704 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-13 16:34:45 +00:00
Austin Clements	2ae1e1ae2f	runtime: buffered write barrier for s390x Updates #22460. Change-Id: I3f793e69577c1b837ad2666e6209a97a452405d4 Reviewed-on: https://go-review.googlesource.com/92703 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-02-13 16:34:24 +00:00
Austin Clements	ae7d5f84f8	runtime: buffered write barrier for ppc64 Updates #22460. Change-Id: I6040c4024111c80361c81eb7eec5071ec9efb4f9 Reviewed-on: https://go-review.googlesource.com/92702 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-02-13 16:34:23 +00:00
Austin Clements	313a4b2b7f	runtime: buffered write barrier for mips Updates #22460. Change-Id: Ieaca94385c3bb88dcc8351c3866b4b0e2a1412b5 Reviewed-on: https://go-review.googlesource.com/92701 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-02-13 16:34:21 +00:00
Austin Clements	a39de96438	runtime: buffered write barrier for mips64 Updates #22460. Change-Id: I9718bff3a346e765601cfd1890417bdfa0f7b9d8 Reviewed-on: https://go-review.googlesource.com/92700 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-02-13 16:34:20 +00:00
Austin Clements	79594ee95a	runtime: buffered write barrier for arm64 Updates #22460. Change-Id: I5f8fbece9545840f5fc4c9834e2050b0920776f0 Reviewed-on: https://go-review.googlesource.com/92699 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-02-13 16:34:19 +00:00
Austin Clements	1de1f316df	runtime: buffered write barrier for arm Updates #22460. Change-Id: I5581df7ad553237db7df3701b117ad99e0593b78 Reviewed-on: https://go-review.googlesource.com/92698 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-02-13 16:34:17 +00:00
Austin Clements	24dd83d7eb	runtime: buffered write barrier for amd64p32 Updates #22460. Change-Id: I6656d478625e5e54aa2eaa38d99dfb0f71ea1fdd Reviewed-on: https://go-review.googlesource.com/92697 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-02-13 16:34:16 +00:00
Austin Clements	252f1170e5	runtime: buffered write barrier for 386 Updates #22460. Change-Id: I3c8e90fd6bcda7e28911036591873d63665aaca7 Reviewed-on: https://go-review.googlesource.com/92696 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-02-13 16:34:15 +00:00
Jason A. Donenfeld	04e6ae6bc3	runtime: use Android O friendly syscalls on 64-bit machines Android O disallows open on 64-bit, so let's use openat with AT_FDCWD to achieve the same behavior. Android O disallows epoll_wait on 64-bit, so let's use epoll_pwait with the last argument as NULL to achieve the same behavior. See here: https://android.googlesource.com/platform/bionic/+/master/libc/seccomp/arm64_app_policy.cpp https://android.googlesource.com/platform/bionic/+/master/libc/seccomp/mips64_app_policy.cpp https://android.googlesource.com/platform/bionic/+/master/libc/seccomp/x86_64_app_policy.cpp Fixes #23750 Change-Id: If8d5a663357471e5d2c1f516151344a9d05b188a Reviewed-on: https://go-review.googlesource.com/92895 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-02-13 15:33:19 +00:00
Yasuhiro Matsumoto	4dad4ab57b	runtime: fix typo in comment GitHub-Last-Rev: `d6a6fa3909` GitHub-Pull-Request: golang/go#23809 Change-Id: Ife18ba2f982b5e1c30bda32d13dcd441778b986a Reviewed-on: https://go-review.googlesource.com/93575 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-02-13 15:12:17 +00:00
Austin Clements	01b8f5d7cf	runtime: remove legacy comments and code from arm morestack CL 137410043 deleted support for split stacks, which means morestack no longer needed to save its caller's frame or argument size or its caller's argument pointer. However, this commit failed to update the comment or delete the line that computed the caller's argument pointer. Clean these up now. Change-Id: I65725d3d42c86e8adb6645d5aa80c305d473363d Reviewed-on: https://go-review.googlesource.com/92437 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-02-12 21:41:34 +00:00
Austin Clements	dfbf568c9f	runtime: use NOFRAME on mips and mips64 This replaces frame size -4/-8 with the NOFRAME flag in mips and mips64 assembly. This was automated with: sed -i -e 's/\(^TEXT.*[A-Z]\),\( *\)\$-[84]/\1|NOFRAME,\2$0/' $(find -name '*_mips*.s') Plus a manual fix to mkduff.go. The go binary is identical on both architectures before and after this change. Change-Id: I0310384d1a584118c41d1cd3a042bb8ea7227efb Reviewed-on: https://go-review.googlesource.com/92044 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-02-12 21:41:32 +00:00
Austin Clements	beeabbcb25	runtime: use NOFRAME on arm64 This replaces frame size -8 with the NOFRAME flag in arm64 assembly. This was automated with: sed -i -e 's/\(^TEXT.*[A-Z]\),\( *\)\$-8/\1|NOFRAME,\2$0/' $(find -name '*_arm64.s') Plus a manual fix to mkduff.go. The go binary is identical before and after this change. Change-Id: I0310384d1a584118c41d1cd3a042bb8ea7227efa Reviewed-on: https://go-review.googlesource.com/92043 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-02-12 21:41:31 +00:00
Austin Clements	a046caa1e8	runtime, sync/atomic: use NOFRAME on arm This replaces frame size -4 with the NOFRAME flag in arm assembly. This was automated with: sed -i -e 's/\(^TEXT.*[A-Z]\),\( *\)\$-4/\1|NOFRAME,\2$0/' $(find -name '*_arm.s') Plus three manual comment changes found by: grep '\$-4' $(find -name '*_arm.s') The go binary is identical before and after this change. Change-Id: I0310384d1a584118c41d1cd3a042bb8ea7227ef9 Reviewed-on: https://go-review.googlesource.com/92042 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-02-12 21:41:30 +00:00
Austin Clements	8a064c6008	runtime: fix silly frame sizes on arm and arm64 "-8" is not a sensible frame size on arm and we're about to start rejecting it. Replace it with -4. Likewise, "-4" is not a sensible frame size on arm64 and we're about to start rejecting it. Replace it with -8. Finally, clean up some places we're weirdly inconsistent about using 0 versus -8. Change-Id: If85e229993d5f7f1f0cfa9852b4e294d053bd784 Reviewed-on: https://go-review.googlesource.com/92038 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-02-12 21:41:23 +00:00
Austin Clements	e5186895fc	runtime: restore RSB for sigpanic call on mips64x preparePanic must set all registers expected by Go runtime conventions in case the sigpanic is being injected into C code. However, on mips64x it fails to restore RSB (R28). As a result, if C code modifies RSB and then raises a signal that turns into a sigpanic call, sigpanic may crash when it attempts to lock runtime.debuglock (the first global it references). Fix this by restoring RSB in the signal context using the same convention as main and sigtramp. Fixes #23641. Change-Id: Ib47e83df89e2a3eece10f480e4e91ce9e4424388 Reviewed-on: https://go-review.googlesource.com/91156 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-01-31 20:57:53 +00:00
Austin Clements	3ff41cdffa	runtime: suppress "unexpected return pc" any time we're in cgo Currently, gentraceback suppresses the "unexpected return pc" error for sigpanic's caller if the M was running C code. However, there are various situations where a sigpanic is injected into C code that can cause traceback to unwind past the sigpanic before realizing that it's in trouble (the traceback beyond the sigpanic will be wrong). Rather than try to fix these issues for Go 1.10, this CL simply disables complaining about unexpected return PCs if we're in cgo regardless of whether or not they're from the sigpanic frame. Go 1.9 never complained about unexpected return PCs when printing, so this is simply a step closer to the old behavior. This should fix the openbsd-386 failures on the dashboard, though this issue could affect any architecture. Fixes #23640. Change-Id: I8c32c1ee86a70d2f280661ed1f8caf82549e324b Reviewed-on: https://go-review.googlesource.com/91136 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-01-31 20:57:52 +00:00
Austin Clements	ebe38b867c	runtime: fail silently if we unwind over sigpanic into C code If we're running C code and the code panics, the runtime will inject a call to sigpanic into the C code just like it would into Go code. However, the return PC from this sigpanic will be in C code. We used to silently abort the traceback if we didn't recognize a return PC, so this went by quietly. Now we're much louder because in general this is a bad thing. However, in this one particular case, it's fine, so if we're in cgo and are looking at the return PC of sigpanic, silence the debug output. Fixes #23576. Change-Id: I03d0c14d4e4d25b29b1f5804f5e9ccc4f742f876 Reviewed-on: https://go-review.googlesource.com/90896 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-01-31 02:13:21 +00:00
Austin Clements	5c2be42a68	runtime: don't unwind past asmcgocall asmcgocall switches to the system stack and aligns the SP, so gentraceback can't unwind over it when it appears on the system stack (it'll read some uninitialized stack slot as the return PC). There's also no point in unwinding over it, so don't. Updates #23576. Change-Id: Idfcc9599c7636b80dec5451cb65ae892b4611981 Reviewed-on: https://go-review.googlesource.com/90895 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-01-31 02:13:19 +00:00
Hana Kim	e89d08e021	runtime/pprof: scale mutex profile with sampling rate pprof expects the samples to be scaled so that they reflect unsampled numbers. The legacy profile parser uses the sampling period in the output and multiplies all values by the period. `0138a3cd6d/profile/legacy_profile.go (L815)` Apply the same scaling when we output the mutex profile in the pprof proto format. Block profile shares the same code, but how to infer unsampled values is unclear. The legacy profile parser doesn't do anything special, so we do nothing for block profile here. Tested manually by checking that the profiles reported with debug=0 (proto format) are similar to the profiles computed from the legacy format profile when the profile rate is a non-trivial number (e.g. 2). Change-Id: Iaa33f92051deed67d8be43ddffc7c1016db566ca Reviewed-on: https://go-review.googlesource.com/89295 Reviewed-by: Peter Weinberger <pjw@google.com>	2018-01-24 14:06:59 +00:00
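The scaling described in the commit above amounts to multiplying each sampled value by the sampling period. The sketch below is an illustration under that reading, not the runtime/pprof implementation; the `sample` type and `scale` function are hypothetical, and the period is passed in explicitly rather than read from the runtime.

```go
package main

import "fmt"

// sample is a hypothetical stand-in for one mutex profile record.
type sample struct {
	count int64   // number of sampled contention events
	nanos float64 // sampled contention time in nanoseconds
}

// scale estimates unsampled totals: if on average one in every `period`
// events is recorded, each recorded event stands for `period` events.
func scale(s sample, period int) sample {
	return sample{
		count: s.count * int64(period),
		nanos: s.nanos * float64(period),
	}
}

func main() {
	got := scale(sample{count: 3, nanos: 1500}, 2)
	fmt.Printf("%d %.0f\n", got.count, got.nanos) // prints "6 3000"
}
```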
Austin Clements	2edc4d4634	runtime: never allocate during an unrecoverable panic Currently, startpanic_m (which prepares for an unrecoverable panic) goes out of its way to make it possible to allocate during panic handling by allocating an mcache if there isn't one. However, this is both potentially dangerous and unnecessary. Allocating an mcache is a generally complex thing to do in an already precarious situation. Specifically, it requires obtaining the heap lock, and there's evidence that this may be able to deadlock (#23360). However, it's also unnecessary because we never allocate from the unrecoverable panic path. This didn't use to be the case. The call to allocmcache was introduced long ago, in CL 7388043, where it was in preparation for separating Ms and Ps and potentially running an M without an mcache. At the time, after calling startpanic, the runtime could call String and Error methods on panicked values, which could do anything including allocating. That was generally unsafe even at the time, and CL 19792 fixed this by pre-printing panic messages before calling startpanic. As a result, we now no longer allocate after calling startpanic. This CL not only removes the allocmcache call, but goes a step further to explicitly disallow any allocation during unrecoverable panic handling, even in situations where it might be safe. This way, if panic handling ever does an allocation that would be unsafe in unusual circumstances, we'll know even if it happens during normal circumstances. This would help with debugging #23360, since the deadlock in allocmcache is currently masking the real failure. Beyond all.bash, I manually tested this change by adding panics at various points in early runtime init, signal handling, and the scheduler to check unusual panic situations. 
Change-Id: I85df21e2b4b20c6faf1f13fae266c9339eebc061 Reviewed-on: https://go-review.googlesource.com/88835 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-01-23 20:08:46 +00:00
Austin Clements	9483a0bc23	runtime: don't grow the stack on sigpanic if throwsplit Currently, if a _SigPanic signal arrives in a throwsplit context, nothing is stopping the runtime from injecting a call to sigpanic that may attempt to grow the stack. This will fail and, in turn, mask the real problem. Fix this by checking for throwsplit in the signal handler itself before injecting the sigpanic call. Updates #21431, where this problem is likely masking the real problem. Change-Id: I64b61ff08e8c4d6f6c0fb01315d7d5e66bf1d3e2 Reviewed-on: https://go-review.googlesource.com/87595 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-01-23 19:50:18 +00:00
Austin Clements	dbd8f3d739	runtime: print hexdump on traceback failure Currently, if anything goes wrong when printing a traceback, we simply cut off the traceback without any further diagnostics. Unfortunately, right now, we have a few issues that are difficult to debug because the traceback simply cuts off (#21431, #23484). This is an attempt to improve the debuggability of traceback failure by printing a diagnostic message plus a hex dump around the failed traceback frame when something goes wrong. The failures look like: goroutine 5 [running]: runtime: unexpected return pc for main.badLR2 called from 0xbad stack: frame={sp:0xc42004dfa8, fp:0xc42004dfc8} stack=[0xc42004d800,0xc42004e000) 000000c42004dea8: 0000000000000001 0000000000000001 000000c42004deb8: 000000c42004ded8 000000c42004ded8 000000c42004dec8: 0000000000427eea <runtime.dopanic+74> 000000c42004ded8 000000c42004ded8: 000000000044df70 <runtime.dopanic.func1+0> 000000c420001080 000000c42004dee8: 0000000000427b21 <runtime.gopanic+961> 000000c42004df08 000000c42004def8: 000000c42004df98 0000000000427b21 <runtime.gopanic+961> 000000c42004df08: 0000000000000000 0000000000000000 000000c42004df18: 0000000000000000 0000000000000000 000000c42004df28: 0000000000000000 0000000000000000 000000c42004df38: 0000000000000000 000000c420001080 000000c42004df48: 0000000000000000 0000000000000000 000000c42004df58: 0000000000000000 0000000000000000 000000c42004df68: 000000c4200010a0 0000000000000000 000000c42004df78: 00000000004c6400 00000000005031d0 000000c42004df88: 0000000000000000 0000000000000000 000000c42004df98: 000000c42004dfb8 00000000004ae7d9 <main.badLR2+73> 000000c42004dfa8: <00000000004c6400 00000000005031d0 000000c42004dfb8: 000000c42004dfd0 !0000000000000bad 000000c42004dfc8: >0000000000000000 0000000000000000 000000c42004dfd8: 0000000000451821 <runtime.goexit+1> 0000000000000000 000000c42004dfe8: 0000000000000000 0000000000000000 000000c42004dff8: 0000000000000000 main.badLR2(0x0) 
/go/src/runtime/testdata/testprog/badtraceback.go:42 +0x49 For #21431, #23484. Change-Id: I8718fc76ced81adb0b4b0b4f2293f3219ca80786 Reviewed-on: https://go-review.googlesource.com/89016 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-01-22 21:51:29 +00:00
Ian Lance Taylor	6104939432	runtime: pass dummy argc/argv correctly in r0_386_android_lib Fix breakage introduced in CL 70530. Change-Id: I87f3da6b20554d4f405a1143b0d894c5953b63aa Reviewed-on: https://go-review.googlesource.com/88516 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>	2018-01-21 04:56:36 +00:00
Brad Fitzpatrick	165e7523fb	sync: consistently use article "a" for RWMutex We used a mix of both before. I've never heard anybody say "an arr-double you mutex" when speaking. Fixes #23457 Change-Id: I802b5eb2339f885ca9d24607eeda565763165298 Reviewed-on: https://go-review.googlesource.com/87896 Reviewed-by: Andrew Bonventre <andybons@golang.org>	2018-01-16 23:09:57 +00:00
Giovanni Bajo	2d6f941e8c	runtime: fix time.Now on Sierra and older CL 67332 created the fast no-syscall path for time.Now in High Sierra but managed to break Sierra and older by forcing them into the slow syscall path: the version check based on commpage version was wrong. This CL uses the Darwin version number instead. The assembly diff is noisy because many variables had to be renamed, but the only actual change is the version check. Fixes #23419. Change-Id: Ie31ef5fb88f66d1517a8693942a7fb6100c213b0 Reviewed-on: https://go-review.googlesource.com/87655 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-01-16 16:49:41 +00:00
Tobias Klauser	7e054553ad	runtime: update URL of the Linux vDSO parser tool The tool was moved to tools/testing/selftests within the Linux kernel source tree. Adjust the URL in the comments of vdso_linux.go. Change-Id: I86b9cae4b898c4a45bc7c54891ce6ead91a22670 Reviewed-on: https://go-review.googlesource.com/87815 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-01-16 15:11:05 +00:00
Ian Lance Taylor	4b3a3bd3aa	runtime: don't issue cgocheck error for timer bucket source pointer The cgo checker was issuing an error with cgocheck=2 when a timer bucket was stored in a pollDesc. The pollDesc values are allocated using persistentalloc, so they are not in the Go heap. The code is OK since timer bucket pointers point into a global array, and as such are never garbage collected or moved. Mark timersBucket notinheap to avoid the problem. timersBucket values only occur in the global timers array. Fixes #23435 Change-Id: I835f31caafd54cdacc692db5989de63bb49e7697 Reviewed-on: https://go-review.googlesource.com/87637 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-01-15 22:18:55 +00:00
Kunpei Sakai	e858a6b9f0	all: use Fatalf instead of Fatal if format is given Change-Id: I30e9b938bb19ed4e674c3ea4a1cd389b9c4f0b88 Reviewed-on: https://go-review.googlesource.com/86875 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-01-10 01:35:45 +00:00
Russ Cox	8396015e80	cmd/link: set runtime.GOROOT default during link Suppose you build the Go toolchain in directory A, move the whole thing to directory B, and then use it from B to build a new program hello.exe, and then run hello.exe, and hello.exe crashes with a stack trace into the standard library. Long ago, you'd have seen hello.exe print file names in the A directory tree, even though the files had moved to the B directory tree. About two years ago we changed the compiler to write down these files with the name "$GOROOT" (that literal string) instead of A, so that the final link from B could replace "$GOROOT" with B, so that hello.exe's crash would show the correct source file paths in the stack trace. (golang.org/cl/18200) Now suppose that you do the same thing but hello.exe doesn't crash: it prints fmt.Println(runtime.GOROOT()). And you run hello.exe after clearing $GOROOT from the environment. Long ago, you'd have seen hello.exe print A instead of B. Before this CL, you'd still see hello.exe print A instead of B. This case is the one instance where a moved toolchain still divulges its origin. Not anymore. After this CL, hello.exe will print B, because the linker sets runtime/internal/sys.DefaultGoroot with the effective GOROOT from link time. This makes the default result of runtime.GOROOT once again match the file names recorded in the binary, after two years of divergence. With that cleared up, we can reintroduce GOROOT into the link action ID and also reenable TestExecutableGOROOT/RelocatedExe. When $GOROOT_FINAL is set during link, it is used in preference to $GOROOT, as always, but it was easier to explain the behavior above without introducing that complication. Fixes #22155. Fixes #20284. Fixes #22475. 
Change-Id: Ifdaeb77fd4678fdb337cf59ee25b2cd873ec1016 Reviewed-on: https://go-review.googlesource.com/86835 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-01-09 21:46:18 +00:00
Austin Clements	7c2cf4e779	runtime: avoid race on allp in findrunnable findrunnable loops over allp to check run queues after it has dropped its own P. This is unsafe because allp can change when nothing is blocking safe-points. Hence, procresize could change allp concurrently with findrunnable's loop. Beyond generally violating Go's memory model, in the best case this could cause findrunnable to observe a nil P pointer if allp has been grown but the new slots not yet initialized. In the worst case, the reads of allp could tear, causing findrunnable to read a word that isn't even a valid *P pointer. Fix this by taking a snapshot of the allp slice header (but not the backing store) before findrunnable drops its P and iterating over this snapshot. The actual contents of allp are immutable up to len(allp), so this fixes the race. Updates #23098 (may fix). Change-Id: I556ae2dbfffe9fe4a1bf43126e930b9e5c240ea8 Reviewed-on: https://go-review.googlesource.com/86215 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-01-04 18:01:55 +00:00
Austin Clements	77ea9f9f31	runtime: always use 1MB stacks on 32-bit Windows Commit `c2c07c7989` (CL 49331) changed the linker and runtime to always use 2MB stacks on 64-bit Windows. This is the corresponding change to make 32-bit Windows always use large (1MB) stacks because it's difficult to detect when Windows applications will call into arbitrary C code that may expect a large stack. This is done as a separate change because it's possible this will cause too much address space pressure for a 32-bit address space. On the other hand, cgo binaries on Windows already use 1MB stacks and there haven't been complaints. Updates #20975. Change-Id: I8ce583f07cb52254fb4bd47250f1ef2b789bc490 Reviewed-on: https://go-review.googlesource.com/49610 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>	2018-01-03 18:49:57 +00:00
Hana Kim	a58286c289	cmd/trace: init goroutine info entries with GoCreate event golang.org/cl/81315 attempted to distinguish system goroutines by examining the function name in the goroutine stack. It assumes that the information would be available when GoSysBlock or GoInSyscall events are processed, but it turned out the stack information is set too late (when the goroutine gets a chance to run). This change initializes the goroutine information entry when processing the GoCreate event, which should be one of the very first events for every goroutine in the trace. Fixes #22574 Change-Id: I1ed37087ce2e78ed27c9b419b7d942eb4140cc69 Reviewed-on: https://go-review.googlesource.com/83595 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-12-20 23:04:21 +00:00
Austin Clements	44213336f0	runtime: symbolize morestack caller in throwsplit panic This attempts to symbolize the PC of morestack's caller when there's a stack split at a bad time. The stack trace starts at the caller of the function that attempted to grow the stack, so this is useful if it isn't obvious what's being called at that point, such as in #21431. Change-Id: I5dee305d87c8069611de2d14e7a3083d76264f8f Reviewed-on: https://go-review.googlesource.com/84115 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-12-15 17:21:07 +00:00
Russ Cox	de14b2f638	all: fix t.Skipf formats Found by upcoming cmd/vet change. Change-Id: I7a8264a304b2a4f26f3bd418c1b28cc849889c9b Reviewed-on: https://go-review.googlesource.com/83835 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-12-13 21:31:45 +00:00
Austin Clements	043f112e52	runtime: reset write barrier buffer on all flush paths Currently, wbBufFlush does nothing if the goroutine is dying on the assumption that the system is crashing anyway and running the write barrier may crash it even more. However, it fails to reset the buffer's "next" pointer. As a result, if there are later write barriers on the same P, the write barrier will overflow the write barrier buffer and start corrupting other fields in the P or other heap objects. Often, this corrupts fields in the next allocated P since they tend to be together in the heap. Fix this by always resetting the buffer's "next" pointer, even if we're not doing anything with the pointers in the buffer. Updates #22987 and #22988. (May fix; it's hard to say.) Change-Id: I82c11ea2d399e1658531c3e8065445a66b7282b2 Reviewed-on: https://go-review.googlesource.com/83016 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-12-11 14:51:39 +00:00
Austin Clements	3675bff55d	runtime: mark heapBits.bits nosplit heapBits.bits is used during bulkBarrierPreWrite via heapBits.isPointer, which means it must not be preempted. If it is preempted, several bad things can happen: 1. This could allow a GC phase change, and the resulting shear between the barriers and the memory writes could result in a lost pointer. 2. Since bulkBarrierPreWrite uses the P's local write barrier buffer, if it also migrates to a different P, it could try to append to the write barrier buffer concurrently with another write barrier. This can result in the buffer's next pointer skipping over its end pointer, which results in a buffer overflow that can corrupt arbitrary other fields in the Ps (or anything in the heap, really, but it'll probably crash from the corrupted P quickly). Fix this by marking heapBits.bits go:nosplit. This would be the perfect use for a recursive no-preempt annotation (#21314). This doesn't actually affect any binaries because this function was always inlined anyway. (I discovered it when I was modifying heapBits and made h.bits() no longer inline, which led to rampant crashes from problem 2 above.) Updates #22987 and #22988 (but doesn't fix because it doesn't actually change the generated code). Change-Id: I60ebb928b1233b0613361ac3d0558d7b1cb65610 Reviewed-on: https://go-review.googlesource.com/83015 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Rick Hudson <rlh@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-12-11 14:51:36 +00:00
Ian Lance Taylor	29cb57c5bd	runtime: don't use MAP_STACK in SigStack test On DragonFly mmap with MAP_STACK returns the top of the region, not the bottom. Rather than try to cope, just don't use the flag anywhere. Fixes #23061 Change-Id: Ib5df4dd7c934b3efecfc4bc87f8989b4c37555d7 Reviewed-on: https://go-review.googlesource.com/83035 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-12-09 01:21:32 +00:00
Paul Boyd	66ba18bf21	fix a typo in the runtime.MemStats documentation Change-Id: If553950446158cee486006ba85c3663b986008a6 Reviewed-on: https://go-review.googlesource.com/82936 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-12-08 18:01:57 +00:00
Brad Fitzpatrick	613f8cad90	runtime: make RawSyscall panic on Solaris It's unused and doesn't work. Fixes #20833 Change-Id: I09335e84c60f88dd1771f7353b0097f36a5e7660 Reviewed-on: https://go-review.googlesource.com/82636 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-12-08 00:11:19 +00:00
Ian Lance Taylor	0ec59e4c08	runtime: sleep longer in dieFromSignal on Darwin Fixes #20315 Change-Id: I5d5c82f10902b59168fc0cca0af50286843df55d Reviewed-on: https://go-review.googlesource.com/82375 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-12-07 00:56:23 +00:00
Christos Zoulas	2ff2eab0d2	runtime: fix NetBSD CPU spin in lwp_park when CPU profiling is active Fixes #22981 Change-Id: I449eb7b5e022401e80a3ab138063e2f4499fbdf8 Reviewed-on: https://go-review.googlesource.com/81855 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-12-05 00:08:51 +00:00
Christos Zoulas	66fcf45477	runtime: make NetBSD lwp_park use monotonic time This change updates runtime.semasleep to no longer call runtime.nanotime and instead calls lwp_park with a duration to sleep relative to the monotonic clock, so nanotime is never called. (This requires updating to a newer version of the lwp_park system call, which is safe, because Go 1.10 will require the unreleased NetBSD 8+ anyway) Additionally, this change makes the nanotime function use the monotonic clock for netbsd/arm, which was forgotten from https://golang.org/cl/81135 which updated netbsd/amd64 and netbsd/386. Because semasleep previously depended on nanotime, the past few days of netbsd have likely been unstable because lwp_park was then mixing the monotonic and wall clocks. After this CL, lwp_park no longer depends on nanotime. Original patch submitted at: https://www.netbsd.org/~christos/go-lwp-park-clock-monotonic.diff This commit message (and any mistakes therein) was written by Brad Fitzpatrick. (Brad migrated the patch to Gerrit and checked CLAs) Updates #6007 Fixes #22968 Also updates netbsd/arm to use monotonic time for Change-Id: If77ef7dc610b3025831d84cdfadfbbba2c52acb2 Reviewed-on: https://go-review.googlesource.com/81715 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-12-04 03:29:56 +00:00
Austin Clements	ce5292a1f2	runtime: use MAP_ANON in sigstack check MAP_ANON is the deprecated but more portable spelling of MAP_ANONYMOUS. Use MAP_ANON to un-break the Darwin 10.10 builder. Updates #22930. Change-Id: Iedd6232b94390b3b2a7423c45cdcb25c1a5b3323 Reviewed-on: https://go-review.googlesource.com/81615 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-12-01 21:52:02 +00:00
Brad Fitzpatrick	7b57e21a07	runtime: skip gdb tests earlier before blocking goroutines in a t.Parallel Minor. Makes reading failing runtime test stacktraces easier (by having fewer goroutines to read) on machines where these gdb tests wouldn't have ever run anyway. Change-Id: I3fab0667e017f20ef3bf96a8cc4cfcc614d25b5c Reviewed-on: https://go-review.googlesource.com/81575 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-12-01 20:54:31 +00:00
Austin Clements	2e5011d802	runtime: even more TestStackGrowth timeout debugging This adds logging for the expected duration of a growStack, plus progress information on the growStack that timed out. Updates #19381. Change-Id: Ic358f8350f499ff22dd213b658aece7d1aa62675 Reviewed-on: https://go-review.googlesource.com/81556 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-12-01 20:46:46 +00:00
Austin Clements	aaccb3834c	runtime: improve sigsend documentation I think of "sending" a signal as calling kill, but sigsend is involved in handling a signal and, specifically, delivering it to the internal signal queue. The term "delivery" is already used in signalWaitUntilIdle, so this CL also uses it in the documentation for sigsend. Change-Id: I86e171f247f525ece884a680bace616fa9a3c7bd Reviewed-on: https://go-review.googlesource.com/81235 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-12-01 20:34:13 +00:00
Austin Clements	292558be02	runtime: restore the Go-allocated signal stack in unminit Currently, when we minit on a thread that already has an alternate signal stack (e.g., because the M was an extram being used for a cgo callback, or to handle a signal on a C thread, or because the platform's libc always allocates a signal stack like on Android), we simply drop the Go-allocated gsignal stack on the floor. This is a problem for Ms on the extram list because those Ms may later be reused for a different thread that may not have its own alternate signal stack. On tip, this manifests as a crash in sigaltstack because we clear the gsignal stack bounds in unminit and later try to use those cleared bounds when we re-minit that M. On 1.9 and earlier, we didn't clear the bounds, so this manifests as running more than one signal handler on the same signal stack, which could lead to arbitrary memory corruption. This CL fixes this problem by saving the Go-allocated gsignal stack in a new field in the m struct when overwriting it with a system-provided signal stack, and then restoring the original gsignal stack in unminit. This CL is designed to be easy to back-port to 1.9. It won't quite cherry-pick cleanly, but it should be sufficient to simply ignore the change in mexit (which didn't exist in 1.9). Now that we always have a place to stash the original signal stack in the m struct, there are some simplifications we can make to the signal stack handling. We'll do those in a later CL. Fixes #22930. Change-Id: I55c5a6dd9d97532f131146afdef0b216e1433054 Reviewed-on: https://go-review.googlesource.com/81476 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-12-01 20:20:45 +00:00
Joe Tsai	b53088a634	Revert "go/printer: forbid empty line before first comment in block" This reverts commit `08f19bbde1`. Reason for revert: The changed transformation takes effect on a larger set of code snippets than expected. For example, this: func foo() { // Comment bar() } becomes: func foo() { // Comment bar() } This is an unintended consequence. Change-Id: Ifca88d6267dab8a8170791f7205124712bf8ace8 Reviewed-on: https://go-review.googlesource.com/81335 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Joe Tsai <joetsai@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-12-01 01:12:26 +00:00
Brad Fitzpatrick	2065685664	runtime: use monotonic time on NetBSD Fixes #6007 Change-Id: I239a1699122e086e907ac1f18b1c86a650e1438a Reviewed-on: https://go-review.googlesource.com/81135 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-12-01 00:42:03 +00:00
Russ Cox	301b127a05	runtime/pprof: read memstats earlier in profile handler Reading the mem stats before our own allocations avoids cluttering memory stats with our recent garbage. Fixes #20565. Change-Id: I3b0046c8300dca83cea24013ffebc32b2ae7f742 Reviewed-on: https://go-review.googlesource.com/80739 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-12-01 00:23:05 +00:00
Ian Lance Taylor	eb97160f46	runtime: don't block signals that will kill the program Otherwise we may delay the delivery of these signals for an arbitrary length of time. We are already careful to not block signals that the program has asked to see. Also make sure that we don't miss a signal delivery if a thread decides to stop for a while while executing the signal handler. Also clean up the TestAtomicStop output a little bit. Fixes #21433 Change-Id: Ic0c1a4eaf7eba80d1abc1e9537570bf4687c2434 Reviewed-on: https://go-review.googlesource.com/79581 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-11-30 23:29:30 +00:00
Austin Clements	fa81d6134d	runtime: more specific reason for skipping GDB tests on NetBSD Updates #22893. Change-Id: I2cf5efb4fa6b77aaf82de5d8877c99f9aa5d519a Reviewed-on: https://go-review.googlesource.com/81195 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-11-30 20:57:03 +00:00
Vladimir Stefanovic	2708da0dc1	runtime/cgo, math: don't use FP instructions for soft-float mips{,le} Updates #18162 Change-Id: I591fcf71a02678a99a56a6487da9689d3c9b1bb6 Reviewed-on: https://go-review.googlesource.com/37955 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-11-30 17:12:32 +00:00
Vladimir Stefanovic	ac987df87c	runtime: implement some soft-float routines (used by GOMIPS=softfloat) Updates #18162 Change-Id: Iee854f48b2d1432955fdb462f2073ebbe76c34f8 Reviewed-on: https://go-review.googlesource.com/37957 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-11-30 17:12:05 +00:00
Than McIntosh	4435fcfd6c	compiler,linker: support for DWARF inlined instances Compiler and linker changes to support DWARF inlined instances, see https://go.googlesource.com/proposal/+/HEAD/design/22080-dwarf-inlining.md for design details. This functionality is gated via the cmd/compile option -gendwarfinl=N, where N={0,1,2}, where a value of 0 disables dwarf inline generation, a value of 1 turns on dwarf generation without tracking of formal/local vars from inlined routines, and a value of 2 enables inlines with variable tracking. Updates #22080 Change-Id: I69309b3b815d9fed04aebddc0b8d33d0dbbfad6e Reviewed-on: https://go-review.googlesource.com/75550 Run-TryBot: Than McIntosh <thanm@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-11-30 14:39:19 +00:00
Sebastien Binet	f09a3d8223	runtime: fix documentation typo for gostartcall This CL is a simple doc typo fix, uncovered while reviewing the go-wasm port. Change-Id: I0fce915c341aaaea3a7cc365819abbc5f2c468c3 Reviewed-on: https://go-review.googlesource.com/80715 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-11-29 18:42:49 +00:00
Brad Fitzpatrick	70ee9b4a07	runtime: fix sysctl calling convention on netbsd/386 Thanks to coypoop for noticing at: https://github.com/golang/go/issues/22914#issuecomment-347761838 FreeBSD/386 and NetBSD/386 diverged between Go 1.4 and Go 1.5 when Russ sent https://golang.org/cl/135830043 (git rev `25f6b02ab0`) to change the calling convention of the C compilers to match Go. But netbsd wasn't updated. Tested on a NetBSD/386 VM, since the builders aren't back up yet (due to this bug) Fixes #22914 Updates #19339 Updates #20852 Updates #16511 Change-Id: Id76ebe8f29bcc85e39b1c11090639d906cd6cf04 Reviewed-on: https://go-review.googlesource.com/80515 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Benny Siegert <bsiegert@gmail.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-11-29 16:24:04 +00:00
Ian Lance Taylor	b5c7183001	runtime: skip GDB tests on NetBSD TestGdbAutotmpTypes times out for unknown reasons on NetBSD. Skip the gdb tests on NetBSD for now. Updates #22893 Change-Id: Ibb05b7260eabb74d805d374b25a43770939fa5f2 Reviewed-on: https://go-review.googlesource.com/80136 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-11-28 01:18:54 +00:00
Austin Clements	be589f8d2b	runtime: fix final stack split in exitsyscall exitsyscall should be recursively nosplit, but we don't have a way to annotate that right now (see #21314). There's exactly one remaining place where this is violated right now: exitsyscall -> casgstatus -> print. The other prints in casgstatus are wrapped in systemstack calls. This fixes the remaining print. Updates #21431 (in theory could fix it, but that would just indicate that we have a different G status-related crash and we've never seen that failure on the dashboard.) Change-Id: I9a5e8d942adce4a5c78cfc6b306ea5bda90dbd33 Reviewed-on: https://go-review.googlesource.com/79815 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-11-24 15:48:04 +00:00
Emmanuel Odeke	2e1f07133d	runtime: tweak doc for Goexit Use singular form of panic and remove the unnecessary 'however', when comparing Goexit's behavior to 'a panic' as well as what happens for deferred recovers with Goexit. Change-Id: I3116df3336fa135198f6a39cf93dbb88a0e2f46e Reviewed-on: https://go-review.googlesource.com/79755 Reviewed-by: Rob Pike <r@golang.org>	2017-11-24 01:13:53 +00:00
Austin Clements	294963fb7f	runtime: document sigtrampgo better Add an explanation of why sigtrampgo is nosplit. Updates #21314. Change-Id: I3f5909d2b2c180f9fa74d53df13e501826fd4316 Reviewed-on: https://go-review.googlesource.com/79615 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-11-23 03:05:56 +00:00
Austin Clements	4671da0414	runtime: print runtime frames in throwsplit trace newstack manually prints the stack trace if we try to grow the stack when throwsplit is set. However, the default behavior is to omit runtime frames. Since runtime frames can be critical to understanding this crash, this change fixes this traceback to include them. Updates #21431. Change-Id: I5aa43f43aa2f10a8de7d67bcec743427be3a3b5d Reviewed-on: https://go-review.googlesource.com/79518 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-11-22 21:44:38 +00:00
Austin Clements	09739d2850	runtime: call throw on systemstack in exitsyscall If exitsyscall tries to grow the stack it will panic, but throw calls print, which can grow the stack. Move the two bare throws in exitsyscall to the system stack. Updates #21431. Change-Id: I5b29da5d34ade908af648a12075ed327a864476c Reviewed-on: https://go-review.googlesource.com/79517 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-11-22 21:44:35 +00:00
Austin Clements	64b68bedc5	runtime/debug: make SetGCPercent(-1) wait for concurrent GC Currently, SetGCPercent(-1) disables GC, but doesn't wait for any currently running concurrent GC to finish, so GC can still be running when it returns. This is a change in behavior from Go 1.8, probably defies user expectations, and can break various runtime tests that depend on SetGCPercent(-1) to disable garbage collection in order to prevent preemption deadlocks. Fix this by making SetGCPercent(-1) block until any concurrently running GC cycle finishes. Fixes #22443. Change-Id: I904133a34acf97a7942ef4531ace0647b13930ef Reviewed-on: https://go-review.googlesource.com/79195 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-11-22 14:47:12 +00:00
Keith Randall	48e207d518	cmd/compile: fix mapassign_fast* routines for pointer keys The signatures of the mapassign_fast* routines need to distinguish the pointerness of their key argument. If the affected routines suspend part way through, the object pointed to by the key might get garbage collected because the key is typed as a uint{32,64}. This is not a problem for mapaccess or mapdelete because the key in those situations does not live beyond the call involved. If the object referenced by the key is garbage collected prematurely, the code still works fine. Even if that object is subsequently reallocated, it can't be written to the map in time to affect the lookup/delete. Fixes #22781 Change-Id: I0bbbc5e9883d5ce702faf4e655348be1191ee439 Reviewed-on: https://go-review.googlesource.com/79018 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Martin Möhrmann <moehrmann@google.com>	2017-11-22 04:30:27 +00:00
Brad Fitzpatrick	1e3f563b14	runtime: fix build on non-Linux platforms CL 78538 was updated after running TryBots to depend on syscall.NanoSleep which isn't available on all non-Linux platforms. Change-Id: I1fa615232b3920453431861310c108b208628441 Reviewed-on: https://go-review.googlesource.com/79175 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-11-21 21:52:58 +00:00
Michael Pratt	b75b4d0ee6	runtime: skip netpoll check if there are no waiters If there are no netpoll waiters then calling netpoll will never find any goroutines. The later blocking netpoll in findrunnable already has this optimization. With golang.org/cl/78538 also applied, this change has a small impact on latency: name old time/op new time/op delta WakeupParallelSpinning/0s-12 13.6µs ± 1% 13.7µs ± 1% ~ (p=0.873 n=19+20) WakeupParallelSpinning/1µs-12 17.7µs ± 0% 17.6µs ± 0% -0.31% (p=0.000 n=20+20) WakeupParallelSpinning/2µs-12 20.2µs ± 2% 19.9µs ± 1% -1.59% (p=0.000 n=20+19) WakeupParallelSpinning/5µs-12 32.0µs ± 1% 32.1µs ± 1% ~ (p=0.201 n=20+19) WakeupParallelSpinning/10µs-12 51.7µs ± 0% 51.4µs ± 1% -0.60% (p=0.000 n=20+18) WakeupParallelSpinning/20µs-12 92.2µs ± 0% 92.2µs ± 0% ~ (p=0.474 n=19+19) WakeupParallelSpinning/50µs-12 215µs ± 0% 215µs ± 0% ~ (p=0.319 n=20+19) WakeupParallelSpinning/100µs-12 330µs ± 2% 331µs ± 2% ~ (p=0.296 n=20+19) WakeupParallelSyscall/0s-12 127µs ± 0% 126µs ± 0% -0.57% (p=0.000 n=18+18) WakeupParallelSyscall/1µs-12 129µs ± 0% 128µs ± 1% -0.43% (p=0.000 n=18+19) WakeupParallelSyscall/2µs-12 131µs ± 1% 130µs ± 1% -0.78% (p=0.000 n=20+19) WakeupParallelSyscall/5µs-12 137µs ± 1% 136µs ± 0% -0.54% (p=0.000 n=18+19) WakeupParallelSyscall/10µs-12 147µs ± 1% 146µs ± 0% -0.58% (p=0.000 n=18+19) WakeupParallelSyscall/20µs-12 168µs ± 0% 167µs ± 0% -0.52% (p=0.000 n=19+19) WakeupParallelSyscall/50µs-12 228µs ± 0% 227µs ± 0% -0.37% (p=0.000 n=19+18) WakeupParallelSyscall/100µs-12 329µs ± 0% 328µs ± 0% -0.28% (p=0.000 n=20+18) There is a bigger improvement in CPU utilization. Before this CL, these benchmarks spent 12% of cycles in netpoll, which are gone after this CL. This also fixes the sched.lastpoll load, which should be atomic. 
Change-Id: I600961460608bd5ba3eeddc599493d2be62064c6 Reviewed-on: https://go-review.googlesource.com/78915 Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Austin Clements <austin@google.com>	2017-11-21 19:36:56 +00:00
Jamie Liu	868c8b374d	runtime: only sleep before stealing work from a running P The sleep in question does not make sense if the stolen-from P cannot run the stolen G. The usleep(3) has been observed delaying execution of woken G's by ~60us; skipping it reduces the wakeup-to-execution latency to ~7us in these cases, improving CPU utilization. Benchmarks added by this change: name old time/op new time/op delta WakeupParallelSpinning/0s-12 14.4µs ± 1% 14.3µs ± 1% ~ (p=0.227 n=19+20) WakeupParallelSpinning/1µs-12 18.3µs ± 0% 18.3µs ± 1% ~ (p=0.950 n=20+19) WakeupParallelSpinning/2µs-12 22.3µs ± 1% 22.3µs ± 1% ~ (p=0.670 n=20+18) WakeupParallelSpinning/5µs-12 31.7µs ± 0% 31.7µs ± 0% ~ (p=0.460 n=20+17) WakeupParallelSpinning/10µs-12 51.8µs ± 0% 51.8µs ± 0% ~ (p=0.883 n=20+20) WakeupParallelSpinning/20µs-12 91.9µs ± 0% 91.9µs ± 0% ~ (p=0.245 n=20+20) WakeupParallelSpinning/50µs-12 214µs ± 0% 214µs ± 0% ~ (p=0.509 n=19+20) WakeupParallelSpinning/100µs-12 335µs ± 0% 335µs ± 0% -0.05% (p=0.006 n=17+15) WakeupParallelSyscall/0s-12 228µs ± 2% 129µs ± 1% -43.32% (p=0.000 n=20+19) WakeupParallelSyscall/1µs-12 232µs ± 1% 131µs ± 1% -43.60% (p=0.000 n=19+20) WakeupParallelSyscall/2µs-12 236µs ± 1% 133µs ± 1% -43.44% (p=0.000 n=18+19) WakeupParallelSyscall/5µs-12 248µs ± 2% 139µs ± 1% -43.68% (p=0.000 n=18+19) WakeupParallelSyscall/10µs-12 263µs ± 3% 150µs ± 2% -42.97% (p=0.000 n=18+20) WakeupParallelSyscall/20µs-12 281µs ± 2% 170µs ± 1% -39.43% (p=0.000 n=19+19) WakeupParallelSyscall/50µs-12 345µs ± 4% 246µs ± 7% -28.85% (p=0.000 n=20+20) WakeupParallelSyscall/100µs-12 460µs ± 5% 350µs ± 4% -23.85% (p=0.000 n=20+20) Benchmarks associated with the change that originally added this sleep (see https://golang.org/s/go15gomaxprocs): name old time/op new time/op delta Chain 19.4µs ± 2% 19.3µs ± 1% ~ (p=0.101 n=19+20) ChainBuf 19.5µs ± 2% 19.4µs ± 2% ~ (p=0.840 n=19+19) Chain-2 19.9µs ± 1% 19.9µs ± 2% ~ (p=0.734 n=19+19) ChainBuf-2 20.0µs ± 2% 20.0µs ± 2% ~ (p=0.175 n=19+17) Chain-4 
20.3µs ± 1% 20.1µs ± 1% -0.62% (p=0.010 n=19+18) ChainBuf-4 20.3µs ± 1% 20.2µs ± 1% -0.52% (p=0.023 n=19+19) Powser 2.09s ± 1% 2.10s ± 3% ~ (p=0.908 n=19+19) Powser-2 2.21s ± 1% 2.20s ± 1% -0.35% (p=0.010 n=19+18) Powser-4 2.31s ± 2% 2.31s ± 2% ~ (p=0.578 n=18+19) Sieve 13.6s ± 1% 13.6s ± 1% ~ (p=0.909 n=17+18) Sieve-2 8.02s ±52% 7.28s ±15% ~ (p=0.336 n=20+16) Sieve-4 4.00s ±35% 3.98s ±26% ~ (p=0.654 n=20+18) Change-Id: I58edd8ce01075859d871e2348fc0833e9c01f70f Reviewed-on: https://go-review.googlesource.com/78538 Reviewed-by: Austin Clements <austin@google.com>	2017-11-21 19:31:06 +00:00
Davor Kapsa	83634e9cf2	runtime/pprof: fix doc typo Change-Id: I6e814182d89c3e7ff184141af097af0afb844d00 Reviewed-on: https://go-review.googlesource.com/78620 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-11-18 20:40:15 +00:00
Bill O'Farrell	c2efb2fde5	cmd/link: enable c-shared and c-archive mode on s390x Adding s390x to the list of architectures that support c-shared and c-archive. Required adding load-time initialization (via _rt0_s390x_linux_lib) and adding s390x to the c-shared and c-archive tests. Change-Id: I75883b2891c310fe8ce7f08c27b06895c074e123 Reviewed-on: https://go-review.googlesource.com/74910 Reviewed-by: Michael Munday <mike.munday@ibm.com>	2017-11-17 15:54:54 +00:00
Austin Clements	bf9ad7080d	runtime: remove another TODO I experimented with having the compiler spill the two registers that are clobbered by the write barrier fast path, but it slightly slows down compilebench, which is a good write barrier benchmark: name old time/op new time/op delta Template 175ms ± 0% 176ms ± 1% ~ (p=0.393 n=10+10) Unicode 83.6ms ± 1% 85.1ms ± 2% +1.79% (p=0.000 n=9+10) GoTypes 585ms ± 0% 588ms ± 1% ~ (p=0.173 n=8+10) Compiler 2.78s ± 1% 2.81s ± 2% +0.81% (p=0.023 n=10+10) SSA 7.11s ± 1% 7.15s ± 1% +0.59% (p=0.029 n=10+10) Flate 115ms ± 1% 116ms ± 2% ~ (p=0.853 n=10+10) GoParser 144ms ± 2% 145ms ± 2% ~ (p=1.000 n=10+10) Reflect 389ms ± 1% 390ms ± 1% ~ (p=0.481 n=10+10) Tar 185ms ± 2% 185ms ± 2% ~ (p=0.529 n=10+10) XML 205ms ± 0% 207ms ± 2% ~ (p=0.065 n=9+10) Since this didn't pan out, remove the TODO. Change-Id: I2186942c6d1ba10585a5da03cd7c1d26ce906273 Reviewed-on: https://go-review.googlesource.com/78034 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-11-17 01:00:05 +00:00
Austin Clements	366f46fe00	runtime: remove TODO I experimented with changing the write barrier to take the value in SI rather than AX to improve register allocation. It had no effect on performance and only made the "hello world" text 0.07% smaller, so let's just remove the comment. Change-Id: I6a261d14139b7a02a8467b31e74951dfb927ffb4 Reviewed-on: https://go-review.googlesource.com/78033 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-11-17 00:59:52 +00:00
Austin Clements	89b7a08aea	runtime: fix gctrace STW CPU time and CPU fraction The CPU time reported in the gctrace for STW phases is simply work.stwprocs times the wall-clock duration of these phases. However, work.stwprocs is set to gcprocs(), which is wrong for multiple reasons: 1. gcprocs is intended to limit the number of Ms used for mark termination based on how well the garbage collector actually scales, but the gctrace wants to report how much CPU time is being stolen from the application. During STW, that's all of the CPU, regardless of how many the garbage collector can actually use. 2. gcprocs assumes it's being called during STW, so it limits its result to sched.nmidle+1. However, we're not calling it during STW, so sched.nmidle is typically quite small, even if GOMAXPROCS is quite large. Fix this by setting work.stwprocs to min(ncpu, GOMAXPROCS). This also fixes the overall GC CPU fraction, which is based on the computed CPU times. Fixes #22725. Change-Id: I64b5ce87e28dbec6870aa068ce7aecdd28c058d1 Reviewed-on: https://go-review.googlesource.com/77710 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-11-15 18:23:23 +00:00
Hana Kim	f71cbc8a96	runtime/trace: fix a typo in doc Change-Id: I63f3d2edb09801c99957a1f744639523fb6d0b62 Reviewed-on: https://go-review.googlesource.com/60331 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-11-15 15:13:50 +00:00
wei xiao	d259815ccb	runtime: IndexByte and memclr perf improvements on arm64 Update runtime asm_arm64.s and memclr_arm64.s to improve performance by using SIMD instructions to do more in parallel. It shows improvement on bytes, html and go1 benchmarks (particualrly regexp, which uses IndexByte frequently). Benchmark results of bytes: name old time/op new time/op delta IndexByte/10-8 28.5ns ± 0% 19.5ns ± 0% -31.58% (p=0.000 n=10+10) IndexByte/32-8 52.6ns ± 0% 19.0ns ± 0% -63.88% (p=0.000 n=10+10) IndexByte/4K-8 4.12µs ± 0% 0.49µs ± 0% -88.16% (p=0.000 n=10+10) IndexByte/4M-8 4.29ms ± 1% 0.70ms ±26% -83.65% (p=0.000 n=10+10) IndexByte/64M-8 69.7ms ± 0% 16.0ms ± 0% -76.97% (p=0.000 n=9+10) IndexBytePortable/10-8 34.0ns ± 0% 34.0ns ± 0% ~ (all equal) IndexBytePortable/32-8 66.1ns ± 0% 66.1ns ± 0% ~ (p=0.471 n=9+9) IndexBytePortable/4K-8 6.17µs ± 0% 6.17µs ± 0% ~ (all equal) IndexBytePortable/4M-8 6.33ms ± 0% 6.35ms ± 0% +0.21% (p=0.002 n=10+9) IndexBytePortable/64M-8 103ms ± 0% 103ms ± 0% +0.01% (p=0.017 n=9+10) name old speed new speed delta IndexByte/10-8 351MB/s ± 0% 512MB/s ± 0% +46.14% (p=0.000 n=9+10) IndexByte/32-8 609MB/s ± 0% 1683MB/s ± 0% +176.40% (p=0.000 n=10+10) IndexByte/4K-8 994MB/s ± 0% 8378MB/s ± 0% +742.75% (p=0.000 n=10+10) IndexByte/4M-8 977MB/s ± 1% 6149MB/s ±32% +529.29% (p=0.000 n=10+10) IndexByte/64M-8 963MB/s ± 0% 4182MB/s ± 0% +334.29% (p=0.000 n=9+10) IndexBytePortable/10-8 294MB/s ± 0% 294MB/s ± 0% +0.17% (p=0.000 n=8+8) IndexBytePortable/32-8 484MB/s ± 0% 484MB/s ± 0% ~ (p=0.877 n=9+9) IndexBytePortable/4K-8 664MB/s ± 0% 664MB/s ± 0% ~ (p=0.242 n=8+9) IndexBytePortable/4M-8 662MB/s ± 0% 661MB/s ± 0% -0.21% (p=0.002 n=10+9) IndexBytePortable/64M-8 652MB/s ± 0% 652MB/s ± 0% ~ (p=0.065 n=10+10) Benchmark results of html: name old time/op new time/op delta Escape-8 62.0µs ± 1% 61.0µs ± 1% -1.69% (p=0.000 n=9+10) EscapeNone-8 10.2µs ± 0% 10.2µs ± 0% -0.09% (p=0.022 n=9+10) Unescape-8 71.9µs ± 0% 68.7µs ± 0% -4.35% (p=0.000 n=10+10) 
UnescapeNone-8 4.03µs ± 0% 0.48µs ± 0% -88.08% (p=0.000 n=10+10) UnescapeSparse-8 10.7µs ± 2% 7.1µs ± 3% -33.91% (p=0.000 n=10+10) UnescapeDense-8 53.2µs ± 1% 53.5µs ± 1% ~ (p=0.143 n=10+10) Benchmark results of go1: name old time/op new time/op delta BinaryTree17-8 6.53s ± 0% 6.48s ± 2% ~ (p=0.190 n=4+5) Fannkuch11-8 6.35s ± 1% 6.35s ± 0% ~ (p=1.000 n=5+5) FmtFprintfEmpty-8 108ns ± 1% 101ns ± 2% -6.32% (p=0.008 n=5+5) FmtFprintfString-8 172ns ± 1% 182ns ± 2% +5.70% (p=0.008 n=5+5) FmtFprintfInt-8 207ns ± 0% 207ns ± 0% ~ (p=0.444 n=5+5) FmtFprintfIntInt-8 277ns ± 1% 276ns ± 1% ~ (p=0.873 n=5+5) FmtFprintfPrefixedInt-8 386ns ± 0% 382ns ± 1% -1.04% (p=0.024 n=5+5) FmtFprintfFloat-8 492ns ± 0% 492ns ± 1% ~ (p=0.571 n=4+5) FmtManyArgs-8 1.32µs ± 1% 1.33µs ± 0% ~ (p=0.087 n=5+5) GobDecode-8 16.8ms ± 2% 16.7ms ± 1% ~ (p=1.000 n=5+5) GobEncode-8 14.1ms ± 1% 14.0ms ± 1% ~ (p=0.056 n=5+5) Gzip-8 788ms ± 0% 802ms ± 0% +1.71% (p=0.008 n=5+5) Gunzip-8 83.6ms ± 0% 83.9ms ± 0% +0.40% (p=0.008 n=5+5) HTTPClientServer-8 120µs ± 0% 120µs ± 1% ~ (p=0.548 n=5+5) JSONEncode-8 33.2ms ± 0% 33.0ms ± 1% -0.71% (p=0.008 n=5+5) JSONDecode-8 152ms ± 1% 152ms ± 1% ~ (p=1.000 n=5+5) Mandelbrot200-8 10.0ms ± 0% 10.0ms ± 0% -0.05% (p=0.008 n=5+5) GoParse-8 7.97ms ± 0% 7.98ms ± 0% ~ (p=0.690 n=5+5) RegexpMatchEasy0_32-8 233ns ± 1% 206ns ± 0% -11.44% (p=0.016 n=5+4) RegexpMatchEasy0_1K-8 1.86µs ± 0% 0.77µs ± 1% -58.54% (p=0.008 n=5+5) RegexpMatchEasy1_32-8 250ns ± 0% 205ns ± 0% -18.07% (p=0.008 n=5+5) RegexpMatchEasy1_1K-8 2.28µs ± 0% 1.11µs ± 0% -51.09% (p=0.029 n=4+4) RegexpMatchMedium_32-8 332ns ± 1% 301ns ± 2% -9.45% (p=0.008 n=5+5) RegexpMatchMedium_1K-8 85.5µs ± 2% 78.8µs ± 0% -7.83% (p=0.008 n=5+5) RegexpMatchHard_32-8 4.34µs ± 1% 4.27µs ± 0% -1.49% (p=0.008 n=5+5) RegexpMatchHard_1K-8 130µs ± 1% 127µs ± 0% -2.53% (p=0.008 n=5+5) Revcomp-8 1.35s ± 1% 1.13s ± 1% -16.17% (p=0.008 n=5+5) Template-8 160ms ± 2% 162ms ± 2% ~ (p=0.222 n=5+5) TimeParse-8 795ns ± 2% 778ns ± 1% ~ (p=0.095 n=5+5) 
TimeFormat-8 782ns ± 0% 786ns ± 1% +0.59% (p=0.040 n=5+5) name old speed new speed delta GobDecode-8 45.8MB/s ± 2% 45.9MB/s ± 1% ~ (p=1.000 n=5+5) GobEncode-8 54.3MB/s ± 1% 55.0MB/s ± 1% ~ (p=0.056 n=5+5) Gzip-8 24.6MB/s ± 0% 24.2MB/s ± 0% -1.69% (p=0.008 n=5+5) Gunzip-8 232MB/s ± 0% 231MB/s ± 0% -0.40% (p=0.008 n=5+5) JSONEncode-8 58.4MB/s ± 0% 58.8MB/s ± 1% +0.71% (p=0.008 n=5+5) JSONDecode-8 12.8MB/s ± 1% 12.8MB/s ± 1% ~ (p=1.000 n=5+5) GoParse-8 7.27MB/s ± 0% 7.26MB/s ± 0% ~ (p=0.762 n=5+5) RegexpMatchEasy0_32-8 137MB/s ± 1% 155MB/s ± 0% +12.93% (p=0.008 n=5+5) RegexpMatchEasy0_1K-8 551MB/s ± 0% 1329MB/s ± 1% +141.11% (p=0.008 n=5+5) RegexpMatchEasy1_32-8 128MB/s ± 0% 156MB/s ± 0% +22.00% (p=0.008 n=5+5) RegexpMatchEasy1_1K-8 449MB/s ± 0% 920MB/s ± 0% +104.68% (p=0.016 n=4+5) RegexpMatchMedium_32-8 3.00MB/s ± 0% 3.32MB/s ± 2% +10.60% (p=0.016 n=4+5) RegexpMatchMedium_1K-8 12.0MB/s ± 2% 13.0MB/s ± 0% +8.48% (p=0.008 n=5+5) RegexpMatchHard_32-8 7.38MB/s ± 1% 7.49MB/s ± 0% +1.49% (p=0.008 n=5+5) RegexpMatchHard_1K-8 7.88MB/s ± 1% 8.08MB/s ± 0% +2.59% (p=0.008 n=5+5) Revcomp-8 188MB/s ± 1% 224MB/s ± 1% +19.29% (p=0.008 n=5+5) Template-8 12.2MB/s ± 2% 12.0MB/s ± 2% ~ (p=0.206 n=5+5) Change-Id: I94116620a287d173a6f60510684362e500f54887 Reviewed-on: https://go-review.googlesource.com/33597 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-11-15 02:58:03 +00:00
Ian Lance Taylor	a158382b1c	runtime: call amd64 VDSO entry points on large stack If the Linux kernel was built with CONFIG_OPTIMIZE_INLINING=n and was built with hardening options turned on, GCC will insert a stack probe in the VDSO function that requires a full page of stack space. The stack probe can corrupt memory if another thread is using it. Avoid sporadic crashes by calling the VDSO on the g0 or gsignal stack. While we're at it, align the stack as C code expects. We've been getting away with a misaligned stack, but it's possible that the VDSO code will change in the future to break that assumption. Benchmarks show a 11% hit on time.Now, but it's only 6ns. name old time/op new time/op delta AfterFunc-12 1.66ms ± 0% 1.66ms ± 1% ~ (p=0.905 n=9+10) After-12 1.90ms ± 6% 1.86ms ± 0% -2.05% (p=0.012 n=10+8) Stop-12 113µs ± 3% 115µs ± 2% +1.60% (p=0.017 n=9+10) SimultaneousAfterFunc-12 145µs ± 1% 144µs ± 0% -0.68% (p=0.002 n=10+8) StartStop-12 39.5µs ± 3% 40.4µs ± 5% +2.19% (p=0.023 n=10+10) Reset-12 10.2µs ± 0% 10.4µs ± 0% +2.45% (p=0.000 n=10+9) Sleep-12 190µs ± 1% 190µs ± 1% ~ (p=0.971 n=10+10) Ticker-12 4.68ms ± 2% 4.64ms ± 2% -0.83% (p=0.043 n=9+10) Now-12 48.4ns ±11% 54.0ns ±11% +11.42% (p=0.017 n=10+10) NowUnixNano-12 48.5ns ±13% 56.9ns ± 8% +17.30% (p=0.000 n=10+10) Format-12 489ns ±11% 504ns ± 6% ~ (p=0.289 n=10+10) FormatNow-12 436ns ±23% 480ns ±13% +10.25% (p=0.026 n=9+10) MarshalJSON-12 656ns ±14% 587ns ±24% ~ (p=0.063 n=10+10) MarshalText-12 647ns ± 7% 638ns ± 9% ~ (p=0.516 n=10+10) Parse-12 348ns ± 8% 328ns ± 9% -5.66% (p=0.030 n=10+10) ParseDuration-12 136ns ± 9% 140ns ±11% ~ (p=0.425 n=10+10) Hour-12 14.8ns ± 6% 15.6ns ±11% ~ (p=0.085 n=10+10) Second-12 14.0ns ± 6% 14.3ns ±12% ~ (p=0.443 n=10+10) Year-12 32.4ns ±11% 33.4ns ± 6% ~ (p=0.492 n=10+10) Day-12 41.5ns ± 9% 42.3ns ±12% ~ (p=0.239 n=10+10) Fixes #20427 Change-Id: Ia395cbb863215f4499b8e7ef95f4b99f51090911 Reviewed-on: https://go-review.googlesource.com/76990 Reviewed-by: Austin Clements 
<austin@google.com>	2017-11-14 23:51:19 +00:00
Fangming.Fang	66bfbd9ad7	internal/cpu: detect cpu features in internal/cpu package change hash/crc32 package to use cpu package instead of using runtime internal variables to check crc32 instruction Change-Id: I8f88d2351bde8ed4e256f9adf822a08b9a00f532 Reviewed-on: https://go-review.googlesource.com/76490 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>	2017-11-14 19:07:15 +00:00
Alex Brainman	cea92e8d13	runtime: make TestWindowsStackMemory build even with CGO_ENABLED=0 set Just copy some code to make TestWindowsStackMemory build when CGO_ENABLED is set to 0. Fixes #22680 Change-Id: I63f9b409a3a97b7718f5d37837ab706d8ed92e81 Reviewed-on: https://go-review.googlesource.com/77430 Reviewed-by: Chris Hines <chris.cs.guy@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-11-14 03:53:15 +00:00
Austin Clements	032678e0fb	runtime: don't elide wrapper functions that call panic or at TOS CL 45412 started hiding autogenerated wrapper functions from call stacks so that call stack semantics better matched language semantics. This is based on the theory that the wrapper function will call the "real" function and all the programmer knows about is the real function. However, this theory breaks down in two cases: 1. If the wrapper is at the top of the stack, then it didn't call anything. This can happen, for example, if the "stack" was actually synthesized by the user. 2. If the wrapper panics, for example by calling panicwrap or by dereferencing a nil pointer, then it didn't call the wrapped function and the user needs to see what panicked, even if we can't attribute it nicely. This commit modifies the traceback logic to include the wrapper function in both of these cases. Fixes #22231. Change-Id: I6e4339a652f73038bd8331884320f0b8edd86eb1 Reviewed-on: https://go-review.googlesource.com/76770 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-11-13 21:43:44 +00:00
Russ Cox	5993251c01	cmd/go: implement per-package asmflags, gcflags, ldflags, gccgoflags It has always been problematic that there was no way to specify tool flags that applied only to the build of certain packages; it was only to specify flags for all packages being built. The usual workaround was to install all dependencies of something, then build just that one thing with different flags. Since the dependencies appeared to be up-to-date, they were not rebuilt with the different flags. The new content-based staleness (up-to-date) checks see through this trick, because they detect changes in flags. This forces us to address the underlying problem of providing a way to specify per-package flags. The solution is to allow -gcflags=pattern=flags, which means that flags apply to packages matching pattern, in addition to the usual -gcflags=flags, which is now redefined to apply only to the packages named on the command line. See #22527 for discussion and rationale. Fixes #22527. Change-Id: I6716bed69edc324767f707b5bbf3aaa90e8e7302 Reviewed-on: https://go-review.googlesource.com/76551 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org>	2017-11-09 15:04:04 +00:00
Austin Clements	f10d99f51d	runtime: flush assist credit on goroutine exit Currently dead goroutines retain their assist credit. This credit can be used if the goroutine gets recycled, but in general this can make assist pacing over-aggressive by hiding an amount of credit proportional to the number of exited (and not reused) goroutines. Fix this "hidden credit" by flushing assist credit to the global credit pool when a goroutine exits. Updates #14812. Change-Id: I65f7f75907ab6395c04aacea2c97aea963b60344 Reviewed-on: https://go-review.googlesource.com/24703 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-11-07 18:41:14 +00:00
Ian Lance Taylor	86cd9c1176	runtime: only call netpoll if netpollinited returns true This fixes a race on old Linux kernels, in which we might temporarily set epfd to an invalid value other than -1. It's also the right thing to do. No test because the problem only occurs on old kernels. Fixes #22606 Change-Id: Id84bdd6ae6d7c5d47c39e97b74da27576cb51a54 Reviewed-on: https://go-review.googlesource.com/76319 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2017-11-07 16:18:12 +00:00
Than McIntosh	83a1a2ba63	runtime/pprof: harden CPU profile test against smart backend A couple of the CPU profiling testpoints make calls to helper functions (cpuHog1, for example) where the computed value is always thrown away by the caller without being used. A smart compiler back end (in this case LLVM) can detect this fact and delete the contents of the called function, which can cause tests to fail. Harden the test slightly by passing in a value read from a global and ensuring that the caller stores the value back to a global; this prevents any optimizer mischief. Change-Id: Icbd6e3e32ff299c68a6397dc1404a52b21eaeaab Reviewed-on: https://go-review.googlesource.com/76230 Run-TryBot: Than McIntosh <thanm@google.com> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>	2017-11-07 13:52:37 +00:00
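The hardening technique described above (read the input from a global, store the result back to a global) might look roughly like this. The names salt, sink, hog and runHog are hypothetical and are not the identifiers used in the runtime/pprof tests.

```go
package main

import "fmt"

// salt is read from a package-level variable so the compiler cannot
// treat the input as a compile-time constant.
var salt = 1

// sink receives the result so the computation has an observable effect
// and cannot be deleted as dead code.
var sink int

// hog burns CPU with a data-dependent loop; integer overflow wraps in Go,
// which is fine for a busy loop.
func hog(x, n int) int {
	for i := 0; i < n; i++ {
		x = x*x + 1
	}
	return x
}

// runHog feeds the global input in and stores the result back to a global.
func runHog(n int) {
	sink = hog(salt, n)
}

func main() {
	runHog(1000000)
	fmt.Println("result kept alive in sink:", sink)
}
```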
Carlos Eduardo Seo	be943df588	runtime: improve IndexByte for ppc64x This change adds a better implementation of IndexByte in asm that uses the vector registers/instructions on ppc64x. benchmark old ns/op new ns/op delta BenchmarkIndexByte/10-8 9.70 9.37 -3.40% BenchmarkIndexByte/32-8 10.9 10.9 +0.00% BenchmarkIndexByte/4K-8 254 92.8 -63.46% BenchmarkIndexByte/4M-8 249246 118435 -52.48% BenchmarkIndexByte/64M-8 10737987 7383096 -31.24% benchmark old MB/s new MB/s speedup BenchmarkIndexByte/10-8 1030.63 1067.24 1.04x BenchmarkIndexByte/32-8 2922.69 2928.53 1.00x BenchmarkIndexByte/4K-8 16065.95 44156.45 2.75x BenchmarkIndexByte/4M-8 16827.96 35414.21 2.10x BenchmarkIndexByte/64M-8 6249.67 9089.53 1.45x Change-Id: I81dbdd620f7bb4e395ce4d1f2a14e8e91e39f9a1 Reviewed-on: https://go-review.googlesource.com/71710 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>	2017-11-06 21:56:18 +00:00
Alex Brainman	af015b1f21	runtime: skip flaky TestWindowsStackMemoryCgo Updates #22575 Change-Id: I1f848768934b7024d2ef01db13b9003e9ca608a0 Reviewed-on: https://go-review.googlesource.com/76030 Reviewed-by: Russ Cox <rsc@golang.org>	2017-11-04 03:10:01 +00:00
Russ Cox	0d18875252	cmd/go: run vet automatically during go test This CL adds an automatic, limited "go vet" to "go test". If the building of a test package fails, vet is not run. If vet fails, the test is not run. The goal is that users don't notice vet as part of the "go test" process at all, until vet speaks up and says something important. This should help users find real problems in their code faster (vet can just point to them instead of needing to debug a test failure) and expands the scope of what kinds of things vet can help with. The "go vet" runs in parallel with the linking of the test binary, so for incremental builds it typically does not slow the overall "go test" at all: there's spare machine capacity during the link. all.bash has less spare machine capacity. This CL increases the time for all.bash on my laptop from 4m41s to 4m48s (+2.5%) To opt out for a given run, use "go test -vet=off". The vet checks used during "go test" are a subset of the full set, restricted to ones that are 100% correct and therefore acceptable to make mandatory. In this CL, that set is atomic, bool, buildtags, nilfunc, and printf. Including printf is debatable, but I want to include it for now and find out what needs to be scaled back. (It already found one real problem in package os's tests that previous go vet os had not turned up.) Now that we can rely on type information it may be that printf should make its function-name-based heuristic less aggressive and have a whitelist of known print/printf functions. Determining the exact set for Go 1.10 is #18085. Running vet also means that programs now have to type-check with both cmd/compile and go/types in order to pass "go test". We don't start vet until cmd/compile has built the test package, so normally the added go/types check doesn't find anything. However, there is at least one instance where go/types is more precise than cmd/compile: declared and not used errors involving variables captured into closures. 
This CL includes a printf fix to os/os_test.go and many declared and not used fixes in the race detector tests. Fixes #18084. Change-Id: I353e00b9d1f9fec540c7557db5653e7501f5e1c9 Reviewed-on: https://go-review.googlesource.com/74356 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org>	2017-11-03 22:09:38 +00:00
Hana (Hyang-Ah) Kim	f99d14e0de	runtime/pprof: use new profile format for block/mutex profiles Unlike the legacy text format that outputs the count and the number of cycles, the pprof tool expects contention profiles to include the count and the delay time measured in nanoseconds. printCountCycleProfile performs the conversion from cycles to nanoseconds. (See parseContention function in cmd/vendor/github.com/google/pprof/profile/legacy_profile.go) Fixes #21474 Change-Id: I8e8fb6ea803822d7eaaf9ecf1df3e236ad225a7b Reviewed-on: https://go-review.googlesource.com/64410 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-11-03 18:43:17 +00:00
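The cycles-to-nanoseconds conversion mentioned above can be illustrated with a small sketch. The function cyclesToNanos and the fixed 3 GHz rate are assumptions made for the example; the runtime obtains the real cycle rate internally, and this is not the code of printCountCycleProfile.

```go
package main

import "fmt"

// cyclesToNanos converts a contention delay measured in CPU cycles into
// nanoseconds, given the number of cycles the CPU executes per second.
func cyclesToNanos(cycles int64, cyclesPerSecond float64) int64 {
	return int64(float64(cycles) * 1e9 / cyclesPerSecond)
}

func main() {
	const assumedRate = 3e9 // hypothetical 3 GHz clock, for illustration only
	fmt.Println("delay in ns:", cyclesToNanos(3000000, assumedRate))
}
```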
Gabriel Aszalos	d6ebbef89d	runtime: clarify GOROOT return value in documentation The current GOROOT documentation could indicate that changing the environment variable at runtime would affect the return value of GOROOT. This is false as the returned value is the one used for the build. This CL aims to clarify the confusion. Fixes #22302 Change-Id: Ib68c30567ac864f152d2da31f001a98531fc9757 Reviewed-on: https://go-review.googlesource.com/75751 Reviewed-by: Russ Cox <rsc@golang.org>	2017-11-03 15:52:40 +00:00
Zhengyu He	eaf603601b	runtime: fix GNU/Linux getproccount if sched_getaffinity does not return a multiple of 8 The current code can potentially return a smaller processor count on a linux kernel when its cpumask_size (controlled by both kernel config and boot parameter) is not a multiple of the pointer size, because r/sys.PtrSize will be rounded down. Since sched_getaffinity returns the size in bytes, we can just allocate the buf as a byte array to avoid the extra calculation with the pointer size and roundups. Change-Id: I0c21046012b88d8a56b5dd3dde1d158d94f8eea9 Reviewed-on: https://go-review.googlesource.com/75591 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-11-03 01:55:16 +00:00
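To make the rounding problem above concrete, the sketch below counts set bits in an affinity mask two ways: over every byte returned, and over only the whole pointer-sized words. Both function names are hypothetical; this is an illustration of the failure mode, not the runtime's getproccount.

```go
package main

import (
	"fmt"
	"math/bits"
)

// countCPUs counts the set bits in every byte of the mask, which works
// for any mask length.
func countCPUs(mask []byte) int {
	n := 0
	for _, b := range mask {
		n += bits.OnesCount8(b)
	}
	return n
}

// countCPUsByWord models the old behavior: the byte count is divided by
// the word size and rounded down, so trailing bytes are ignored.
func countCPUsByWord(mask []byte, wordSize int) int {
	usable := (len(mask) / wordSize) * wordSize
	return countCPUs(mask[:usable])
}

func main() {
	// An 11-byte mask (not a multiple of 8) with one CPU in the last byte.
	mask := []byte{0xff, 0x0f, 0, 0, 0, 0, 0, 0, 0, 0, 0x01}
	fmt.Println("byte-wise count:", countCPUs(mask))
	fmt.Println("word-wise count:", countCPUsByWord(mask, 8))
}
```

The word-wise count misses the CPU recorded in the trailing three bytes, which is the undercount the commit fixes by treating the buffer as a byte array.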
Alex Brainman	923299a6b8	cmd/link: restore windows stack commit size back to 4KB CL 49331 increased windows stack commit size to 2MB by mistake. Revert that change. Fixes #22439 Change-Id: I919e549e87da326f4ba45890b4d32f6d7046186f Reviewed-on: https://go-review.googlesource.com/74490 TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-11-03 00:09:40 +00:00
Joe Tsai	08f19bbde1	go/printer: forbid empty line before first comment in block To improve readability when exported fields are removed, forbid the printer from emitting an empty line before the first comment in a const, var, or type block. Also, when printing the "Has filtered or unexported fields." message, add an empty line before it to separate the message from the struct or interface contents. Before the change: <<< type NamedArg struct { // Name is the name of the parameter placeholder. // // If empty, the ordinal position in the argument list will be // used. // // Name must omit any symbol prefix. Name string // Value is the value of the parameter. // It may be assigned the same value types as the query // arguments. Value interface{} // contains filtered or unexported fields } >>> After the change: <<< type NamedArg struct { // Name is the name of the parameter placeholder. // // If empty, the ordinal position in the argument list will be // used. // // Name must omit any symbol prefix. Name string // Value is the value of the parameter. // It may be assigned the same value types as the query // arguments. Value interface{} // contains filtered or unexported fields } >>> Fixes #18264 Change-Id: I9fe17ca39cf92fcdfea55064bd2eaa784ce48c88 Reviewed-on: https://go-review.googlesource.com/71990 Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2017-11-02 18:17:22 +00:00
Martin Möhrmann	8585f9fdb1	runtime: refactor insertion slot tracking for fast hashmap functions * Avoid calculating insertk until needed. * Avoid a pointer into b.tophash and just track the insertion index. This avoids b.tophash being marked as escaping to heap. * Calculate val only once at the end of the mapassign functions. Function sizes decrease slightly, e.g. for mapassign_faststr: before "".mapassign_faststr STEXT size=1166 args=0x28 locals=0x78 after "".mapassign_faststr STEXT size=1080 args=0x28 locals=0x68 name old time/op new time/op delta MapAssign/Int32/256-4 19.4ns ± 4% 19.5ns ±11% ~ (p=0.973 n=20+20) MapAssign/Int32/65536-4 32.5ns ± 2% 32.4ns ± 3% ~ (p=0.078 n=20+19) MapAssign/Int64/256-4 20.3ns ± 6% 17.6ns ± 5% -13.01% (p=0.000 n=20+20) MapAssign/Int64/65536-4 33.3ns ± 2% 33.3ns ± 1% ~ (p=0.444 n=20+20) MapAssign/Str/256-4 22.3ns ± 3% 22.4ns ± 3% ~ (p=0.343 n=20+20) MapAssign/Str/65536-4 44.9ns ± 1% 43.9ns ± 1% -2.39% (p=0.000 n=20+19) Change-Id: I2627bb8a961d366d9473b5922fa129176319eb22 Reviewed-on: https://go-review.googlesource.com/74870 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-11-02 18:00:36 +00:00
Martin Möhrmann	fbfc2031a6	cmd/compile: specialize map creation for small hint sizes Handle make(map[any]any) and make(map[any]any, hint) where hint <= BUCKETSIZE special to allow for faster map initialization and to improve binary size by using runtime calls with fewer arguments. Given hint is smaller or equal to BUCKETSIZE in which case overLoadFactor(hint, 0) is false and no buckets would be allocated by makemap: * If hmap needs to be allocated on the stack then only hmap's hash0 field needs to be initialized and no call to makemap is needed. * If hmap needs to be allocated on the heap then a new special makehmap function will allocate hmap and initialize hmap's hash0 field. Reduces size of the godoc by ~36kb. AMD64 name old time/op new time/op delta NewEmptyMap 16.6ns ± 2% 5.5ns ± 2% -66.72% (p=0.000 n=10+10) NewSmallMap 64.8ns ± 1% 56.5ns ± 1% -12.75% (p=0.000 n=9+10) Updates #6853 Change-Id: I624e90da6775afaa061178e95db8aca674f44e9b Reviewed-on: https://go-review.googlesource.com/61190 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-11-02 17:03:45 +00:00
Tobias Klauser	2dd110f9a7	runtime/pprof: use switch for GOOS check in testCPUProfile Since CL 33071, testCPUProfile is only one user of the badOS map. Replace it by the corresponding switch, with the "plan9" case removed because it is already checked earlier in the same function. Change-Id: Id647b8ee1fd37516bb702b35b3c9296a4f56b61b Reviewed-on: https://go-review.googlesource.com/75110 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-11-02 07:21:28 +00:00
Martin Möhrmann	371a5b494a	runtime: protect growslice against newcap*et.size overflow The check of uintptr(newcap) > maxSliceCap(et.size) in addition to capmem > _MaxMem is needed to prevent a reproducible overflow on 32bit architectures. On 64bit platforms this problem is less likely to occur as allocation of a sufficiently large array or slice to be append is likely to already exhaust available memory before the call to append can be made. Example program that without the fix in this CL does segfault on 386: type T [1<<27 + 1]int64 var d T var s []T func main() { s = append(s, d, d, d, d) print(len(s), "\n") } Fixes #21586 Change-Id: Ib4185435826ef43df71ba0f789e19f5bf9a347e6 Reviewed-on: https://go-review.googlesource.com/55133 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-11-01 12:38:02 +00:00
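The overflow described in the commit above can be illustrated with explicit 32-bit arithmetic. This is a minimal sketch, not the runtime's actual check: `mulOverflows32` is a hypothetical helper, and the capacity of 4 elements is an assumption chosen to match the four appended values in the commit's example program.

```go
package main

import (
	"fmt"
	"math/bits"
)

// mulOverflows32 reports whether a*b does not fit in 32 bits.
// Hypothetical helper for illustration; growslice itself compares
// against maxSliceCap(et.size) instead.
func mulOverflows32(a, b uint32) bool {
	hi, _ := bits.Mul32(a, b)
	return hi != 0
}

func main() {
	// Size in bytes of T = [1<<27 + 1]int64 from the commit's example.
	const elemSize uint32 = (1<<27 + 1) * 8
	newcap := uint32(4) // assumed capacity for the four appended elements

	// On a 32-bit platform the product silently wraps around.
	fmt.Println(newcap * elemSize)
	// Checking the high word of the full product detects the wrap.
	fmt.Println(mulOverflows32(newcap, elemSize))
}
```

The wrapped product is tiny, so a check on the product alone (such as comparing it with a memory limit) would pass, which is why the capacity has to be checked as well.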
Tobias Klauser	96c62b3b31	all: remove unnecessary return after skipping test testing.Skip{,f} will exit the test via runtime.Goexit. Thus, the successive return is never reached and can be removed. Change-Id: I1e399f3d5db753ece1ffba648850427e1b4be300 Reviewed-on: https://go-review.googlesource.com/74990 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc>	2017-11-01 11:57:47 +00:00
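The reasoning in the commit above rests on how runtime.Goexit behaves: deferred calls still run, but the statement following the call is never reached. A minimal sketch of that control flow, where `skipLike` and `reachedAfterSkip` are hypothetical names standing in for a test that calls Skip:

```go
package main

import (
	"fmt"
	"runtime"
)

// skipLike mimics the control flow of a skip: it ends the calling
// goroutine and never returns to its caller.
func skipLike() {
	runtime.Goexit()
}

// reachedAfterSkip runs a test-like body in its own goroutine and
// reports whether the statement after skipLike was executed.
func reachedAfterSkip() bool {
	reached := false
	done := make(chan struct{})
	go func() {
		defer close(done) // deferred calls still run during Goexit
		skipLike()
		reached = true // never executed, like a return after t.Skip
	}()
	<-done
	return reached
}

func main() {
	fmt.Println(reachedAfterSkip())
}
```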
Russ Cox	bf21c67b1e	cmd/go: trim objdir, not just workdir, from object files Otherwise the new numbered directories like b028/ appear in the objects, and they can change from run to run. Fixes #22514. Change-Id: I8d0cf65f3622e48b2547d5757febe0ee1301e2ed Reviewed-on: https://go-review.googlesource.com/74791 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org>	2017-10-31 23:49:28 +00:00
Hana (Hyang-Ah) Kim	d58f4e9b7b	runtime/trace: fix corrupted trace during StartTrace Since Go1.8, different types of GC mark workers were annotated and the annotation strings were recorded during StartTrace. This change fixes two issues around the use of traceString from StartTrace here. 1) "failed to parse trace: no consistent ordering of events possible" This issue is a result of a missing 'batch' event entry. For efficient tracing, tracer maintains system allocated buffers and once a buffer is full, it is Flushed out for writing. Moreover, tracing assumes all the records in the same buffer (batch) are already ordered and implements more optimization in encoding and defers the complete order reconstruction till the trace parsing time. Thus, when a Flush happens and a new buffer is used, the new buffer should contain an event to indicate the start of a new batch. Before this CL, the batch entry was written by traceEvent only when the buffer position is 0 and wasn't written when flush occurs during traceString. This CL fixes it by moving the batch entry write to the traceFlush. 2) crash during tracing due to invalid memory access, or during parsing due to duplicate string entries This issue is a result of memory allocation during traceString calls. Execution tracer traces some memory allocation activities. Before this CL, traceString took the buffer address (traceBuf) and mutated the buffer. If memory tracing occurs in the meantime from the same P, the allocation tracing (traceEvent) will take the same buffer address through the pointer to the buffer address (traceBuf), and mutate the buffer. As a result, one of the following can happen: - the allocation record is overwritten by the following trace string record (data loss) - if buffer flush occurs during the allocation tracing, traceString will attempt to write the string record to the old buffer and eventually causes invalid memory access crash. 
- or flush on the same buffer can occur twice (once from the memory allocation, and once from the string record write), and in this case the trace can contain the same data twice and the parse will complain about duplicate string record entries. This CL fixes the second issue by making the traceString take traceBuf (traceBufPtr). Change-Id: I24f629758625b38e1916fbfc7d7be6ea210586af Reviewed-on: https://go-review.googlesource.com/50873 Run-TryBot: Austin Clements <austin@google.com> Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-10-31 22:03:30 +00:00
Austin Clements	af192a3e22	runtime: allow 5% mutator assist over 25% background mark Currently, the background mark worker and the goal GC CPU are both fixed at 25%. The trigger controller's goal is to achieve the goal CPU usage, and with the previous commit it can actually achieve this. But this means there are no assists, which sounds ideal but actually causes problems for the trigger controller. Since the controller can't lower CPU usage below the background mark worker CPU, it saturates at the CPU goal and no longer gets feedback, which translates into higher variability in heap growth. This commit fixes this by allowing assists 5% CPU beyond the 25% fixed background mark. This avoids saturating the trigger controller, since it can now get feedback from both sides of the CPU goal. This leads to low variability in both CPU usage and heap growth, at the cost of reintroducing a low rate of mark assists. We also experimented with 20% background plus 5% assist, but 25%+5% clearly performed better in benchmarks. Updates #14951. Updates #14812. Updates #18534. Combined with the previous CL, this significantly improves tail mutator utilization in the x/benchmarks garbage benchmark. On a sample trace, it increased the 99.9%ile mutator utilization at 10ms from 26% to 59%, and at 5ms from 17% to 52%. It reduced the 99.9%ile zero utilization window from 2ms to 700µs. It also helps the mean mutator utilization: it increased the 10s mutator utilization from 83% to 94%. The minimum mutator utilization is also somewhat improved, though there is still some unknown artifact that causes a minuscule fraction of mutator assists to take 5--10ms (in fact, there was exactly one 10ms mutator assist in my sample trace). This has no significant effect on the throughput of the github.com/dr2chase/bent benchmarks-50. 
This has little effect on the go1 benchmarks (and the slight overall improvement makes up for the slight overall slowdown from the previous commit): name old time/op new time/op delta BinaryTree17-12 2.40s ± 0% 2.41s ± 1% +0.26% (p=0.010 n=18+19) Fannkuch11-12 2.95s ± 0% 2.93s ± 0% -0.62% (p=0.000 n=18+15) FmtFprintfEmpty-12 42.2ns ± 0% 42.3ns ± 1% +0.37% (p=0.001 n=15+14) FmtFprintfString-12 67.9ns ± 2% 67.2ns ± 3% -1.03% (p=0.002 n=20+18) FmtFprintfInt-12 75.6ns ± 3% 76.8ns ± 2% +1.59% (p=0.000 n=19+17) FmtFprintfIntInt-12 123ns ± 1% 124ns ± 1% +0.77% (p=0.000 n=17+14) FmtFprintfPrefixedInt-12 148ns ± 1% 150ns ± 1% +1.28% (p=0.000 n=20+20) FmtFprintfFloat-12 212ns ± 0% 211ns ± 1% -0.67% (p=0.000 n=16+17) FmtManyArgs-12 499ns ± 1% 500ns ± 0% +0.23% (p=0.004 n=19+16) GobDecode-12 6.49ms ± 1% 6.51ms ± 1% +0.32% (p=0.008 n=19+19) GobEncode-12 5.47ms ± 0% 5.43ms ± 1% -0.68% (p=0.000 n=19+20) Gzip-12 220ms ± 1% 216ms ± 1% -1.66% (p=0.000 n=20+19) Gunzip-12 38.8ms ± 0% 38.5ms ± 0% -0.80% (p=0.000 n=19+20) HTTPClientServer-12 78.5µs ± 1% 78.1µs ± 1% -0.53% (p=0.008 n=20+19) JSONEncode-12 12.2ms ± 0% 11.9ms ± 0% -2.38% (p=0.000 n=17+19) JSONDecode-12 52.3ms ± 0% 53.3ms ± 0% +1.84% (p=0.000 n=19+20) Mandelbrot200-12 3.69ms ± 0% 3.69ms ± 0% -0.19% (p=0.000 n=19+19) GoParse-12 3.17ms ± 1% 3.19ms ± 1% +0.61% (p=0.000 n=20+20) RegexpMatchEasy0_32-12 73.7ns ± 0% 73.2ns ± 1% -0.66% (p=0.000 n=17+20) RegexpMatchEasy0_1K-12 238ns ± 0% 239ns ± 0% +0.32% (p=0.000 n=17+16) RegexpMatchEasy1_32-12 69.1ns ± 1% 69.2ns ± 1% ~ (p=0.669 n=19+13) RegexpMatchEasy1_1K-12 365ns ± 1% 367ns ± 1% +0.49% (p=0.000 n=19+19) RegexpMatchMedium_32-12 104ns ± 1% 105ns ± 1% +1.33% (p=0.000 n=16+20) RegexpMatchMedium_1K-12 33.6µs ± 3% 34.1µs ± 4% +1.67% (p=0.001 n=20+20) RegexpMatchHard_32-12 1.67µs ± 1% 1.62µs ± 1% -2.78% (p=0.000 n=18+17) RegexpMatchHard_1K-12 50.3µs ± 2% 48.7µs ± 1% -3.09% (p=0.000 n=19+18) Revcomp-12 384ms ± 0% 386ms ± 0% +0.59% (p=0.000 n=19+19) Template-12 61.1ms ± 1% 60.5ms ± 1% 
-1.02% (p=0.000 n=19+20) TimeParse-12 307ns ± 0% 303ns ± 1% -1.23% (p=0.000 n=19+15) TimeFormat-12 323ns ± 0% 323ns ± 0% -0.12% (p=0.011 n=15+20) [Geo mean] 47.1µs 47.0µs -0.20% https://perf.golang.org/search?q=upload:20171030.4 It slightly improves the performance of the x/benchmarks: name old time/op new time/op delta Garbage/benchmem-MB=1024-12 2.29ms ± 3% 2.22ms ± 2% -2.97% (p=0.000 n=18+18) Garbage/benchmem-MB=64-12 2.24ms ± 2% 2.21ms ± 2% -1.64% (p=0.000 n=18+18) HTTP-12 12.6µs ± 1% 12.6µs ± 1% ~ (p=0.690 n=19+17) JSON-12 11.3ms ± 2% 11.3ms ± 1% ~ (p=0.163 n=17+18) and fixes some of the heap size bloat caused by the previous commit: name old peak-RSS-bytes new peak-RSS-bytes delta Garbage/benchmem-MB=1024-12 1.88G ± 2% 1.77G ± 2% -5.52% (p=0.000 n=20+18) Garbage/benchmem-MB=64-12 248M ± 8% 226M ± 5% -8.93% (p=0.000 n=20+20) HTTP-12 47.0M ±27% 47.2M ±12% ~ (p=0.512 n=20+20) JSON-12 206M ±11% 206M ±10% ~ (p=0.841 n=20+20) https://perf.golang.org/search?q=upload:20171030.5 Combined with the change to add a soft goal in the previous commit, this achieves a decent performance improvement on the garbage benchmark: name old time/op new time/op delta Garbage/benchmem-MB=1024-12 2.40ms ± 4% 2.22ms ± 2% -7.40% (p=0.000 n=19+18) Garbage/benchmem-MB=64-12 2.23ms ± 1% 2.21ms ± 2% -1.06% (p=0.000 n=19+18) HTTP-12 12.5µs ± 1% 12.6µs ± 1% ~ (p=0.330 n=20+17) JSON-12 11.1ms ± 1% 11.3ms ± 1% +1.87% (p=0.000 n=16+18) https://perf.golang.org/search?q=upload:20171030.6 Change-Id: If04ddb57e1e58ef2fb9eec54c290eb4ae4bea121 Reviewed-on: https://go-review.googlesource.com/59971 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-31 21:59:11 +00:00
Austin Clements	03eb9483e3	runtime: separate soft and hard heap limits Currently, GC pacing is based on a single hard heap limit computed based on GOGC. In order to achieve this hard limit, assist pacing makes the conservative assumption that the entire heap is live. However, in the steady state (with GOGC=100), only half of the heap is live. As a result, the garbage collector works twice as hard as necessary and finishes half way between the trigger and the goal. Since this is a stable state for the trigger controller, this repeats from cycle to cycle. Matters are even worse if GOGC is higher. For example, if GOGC=200, only a third of the heap is live in steady state, so the GC will work three times harder than necessary and finish only a third of the way between the trigger and the goal. Since this causes the garbage collector to consume ~50% of the available CPU during marking instead of the intended 25%, about 25% of the CPU goes to mutator assists. This high mutator assist cost causes high mutator latency variability. This commit improves the situation by separating the heap goal into two goals: a soft goal and a hard goal. The soft goal is set based on GOGC, just like the current goal is, and the hard goal is set at a 10% larger heap than the soft goal. Prior to the soft goal, assist pacing assumes the heap is in steady state (e.g., only half of it is live). Between the soft goal and the hard goal, assist pacing switches to the current conservative assumption that the entire heap is live. In benchmarks, this nearly eliminates mutator assists. However, since background marking is fixed at 25% CPU, this causes the trigger controller to saturate, which leads to somewhat higher variability in heap size. The next commit will address this. The lower CPU usage of course leads to longer mark cycles, though really it means the mark cycles are as long as they should have been in the first place. 
This does, however, lead to two potential down-sides compared to the current pacing policy: 1. the total overhead of the write barrier is higher because it's enabled more of the time and 2. the heap size may be larger because there's more floating garbage. We addressed 1 by significantly improving the performance of the write barrier in the preceding commits. 2 can be demonstrated in intense GC benchmarks, but doesn't seem to be a problem in any real applications. Updates #14951. Updates #14812 (fixes?). Fixes #18534. This has no significant effect on the throughput of the github.com/dr2chase/bent benchmarks-50. This has little overall throughput effect on the go1 benchmarks: name old time/op new time/op delta BinaryTree17-12 2.41s ± 0% 2.40s ± 0% -0.22% (p=0.007 n=20+18) Fannkuch11-12 2.95s ± 0% 2.95s ± 0% +0.07% (p=0.003 n=17+18) FmtFprintfEmpty-12 41.7ns ± 3% 42.2ns ± 0% +1.17% (p=0.002 n=20+15) FmtFprintfString-12 66.5ns ± 0% 67.9ns ± 2% +2.16% (p=0.000 n=16+20) FmtFprintfInt-12 77.6ns ± 2% 75.6ns ± 3% -2.55% (p=0.000 n=19+19) FmtFprintfIntInt-12 124ns ± 1% 123ns ± 1% -0.98% (p=0.000 n=18+17) FmtFprintfPrefixedInt-12 151ns ± 1% 148ns ± 1% -1.75% (p=0.000 n=19+20) FmtFprintfFloat-12 210ns ± 1% 212ns ± 0% +0.75% (p=0.000 n=19+16) FmtManyArgs-12 501ns ± 1% 499ns ± 1% -0.30% (p=0.041 n=17+19) GobDecode-12 6.50ms ± 1% 6.49ms ± 1% ~ (p=0.234 n=19+19) GobEncode-12 5.43ms ± 0% 5.47ms ± 0% +0.75% (p=0.000 n=20+19) Gzip-12 216ms ± 1% 220ms ± 1% +1.71% (p=0.000 n=19+20) Gunzip-12 38.6ms ± 0% 38.8ms ± 0% +0.66% (p=0.000 n=18+19) HTTPClientServer-12 78.1µs ± 1% 78.5µs ± 1% +0.49% (p=0.035 n=20+20) JSONEncode-12 12.1ms ± 0% 12.2ms ± 0% +1.05% (p=0.000 n=18+17) JSONDecode-12 53.0ms ± 0% 52.3ms ± 0% -1.27% (p=0.000 n=19+19) Mandelbrot200-12 3.74ms ± 0% 3.69ms ± 0% -1.17% (p=0.000 n=18+19) GoParse-12 3.17ms ± 1% 3.17ms ± 1% ~ (p=0.569 n=19+20) RegexpMatchEasy0_32-12 73.2ns ± 1% 73.7ns ± 0% +0.76% (p=0.000 n=18+17) RegexpMatchEasy0_1K-12 239ns ± 0% 238ns ± 0% -0.27% (p=0.000 
n=13+17) RegexpMatchEasy1_32-12 69.0ns ± 2% 69.1ns ± 1% ~ (p=0.404 n=19+19) RegexpMatchEasy1_1K-12 367ns ± 1% 365ns ± 1% -0.60% (p=0.000 n=19+19) RegexpMatchMedium_32-12 105ns ± 1% 104ns ± 1% -1.24% (p=0.000 n=19+16) RegexpMatchMedium_1K-12 34.1µs ± 2% 33.6µs ± 3% -1.60% (p=0.000 n=20+20) RegexpMatchHard_32-12 1.62µs ± 1% 1.67µs ± 1% +2.75% (p=0.000 n=18+18) RegexpMatchHard_1K-12 48.8µs ± 1% 50.3µs ± 2% +3.07% (p=0.000 n=20+19) Revcomp-12 386ms ± 0% 384ms ± 0% -0.57% (p=0.000 n=20+19) Template-12 59.9ms ± 1% 61.1ms ± 1% +2.01% (p=0.000 n=20+19) TimeParse-12 301ns ± 2% 307ns ± 0% +2.11% (p=0.000 n=20+19) TimeFormat-12 323ns ± 0% 323ns ± 0% ~ (all samples are equal) [Geo mean] 47.0µs 47.1µs +0.23% https://perf.golang.org/search?q=upload:20171030.1 Likewise, the throughput effect on the x/benchmarks is minimal (and reasonably positive on the garbage benchmark with a large heap): name old time/op new time/op delta Garbage/benchmem-MB=1024-12 2.40ms ± 4% 2.29ms ± 3% -4.57% (p=0.000 n=19+18) Garbage/benchmem-MB=64-12 2.23ms ± 1% 2.24ms ± 2% +0.59% (p=0.016 n=19+18) HTTP-12 12.5µs ± 1% 12.6µs ± 1% ~ (p=0.326 n=20+19) JSON-12 11.1ms ± 1% 11.3ms ± 2% +2.15% (p=0.000 n=16+17) It does increase the heap size of the garbage benchmarks, but seems to have relatively little impact on more realistic programs. Also, we'll gain some of this back with the next commit. name old peak-RSS-bytes new peak-RSS-bytes delta Garbage/benchmem-MB=1024-12 1.21G ± 1% 1.88G ± 2% +55.59% (p=0.000 n=19+20) Garbage/benchmem-MB=64-12 168M ± 3% 248M ± 8% +48.08% (p=0.000 n=18+20) HTTP-12 45.6M ± 9% 47.0M ±27% ~ (p=0.925 n=20+20) JSON-12 193M ±11% 206M ±11% +7.06% (p=0.001 n=20+20) https://perf.golang.org/search?q=upload:20171030.2 Change-Id: Ic78904135f832b4d64056cbe734ab979f5ad9736 Reviewed-on: https://go-review.googlesource.com/59970 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-31 21:59:08 +00:00
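The relationship between the two goals in the commit above can be worked through numerically. This is a sketch under stated assumptions: `heapGoals` is a hypothetical helper, the soft goal is assumed to follow the usual GOGC rule (live heap plus GOGC percent of it), and the hard goal is 10% above the soft goal as the commit message states. The pacer's real computation has more inputs.

```go
package main

import "fmt"

// heapGoals returns the soft and hard heap goals in bytes for a given
// live heap size and GOGC percentage. Hypothetical helper that only
// illustrates the arithmetic described in the commit message.
func heapGoals(liveBytes, gogc uint64) (soft, hard uint64) {
	soft = liveBytes + liveBytes*gogc/100 // goal derived from GOGC
	hard = soft + soft/10                 // hard goal: 10% above the soft goal
	return soft, hard
}

func main() {
	// 100 MiB live heap with GOGC=100.
	soft, hard := heapGoals(100<<20, 100)
	fmt.Println(soft>>20, hard>>20) // goals in MiB
}
```

With GOGC=100 the live heap is half of the soft goal, which matches the steady-state assumption the commit describes for assist pacing below the soft goal.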
Austin Clements	52cf91a5d5	cmd/compile,runtime: update instrumentation comments The compiler's instrumentation pass has some out-of-date comments about the write barrier and some confusing comments about typedslicecopy. Update these comments and add a comment to typedslicecopy explaining why it's manually instrumented while none of the other operations are. Change-Id: I024e5361d53f1c3c122db0c85155368a30cabd6b Reviewed-on: https://go-review.googlesource.com/74430 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-10-31 14:03:10 +00:00
Russ Cox	2e2047a07f	runtime/race: install alternate packages to temp dir The content-based staleness code means that go run -gcflags=-l helloworld.go recompiles all of helloworld.go's dependencies with -gcflags=-l, whereas before it would have assumed installed packages were up-to-date. In this test, that means every race iteration rebuilds the runtime and maybe a few other packages. Instead, install them to a temporary location for reuse. This speeds the test from 17s to 9s on my MacBook Pro. Change-Id: Ied136ce72650261083bb19cc7dee38dac0ad05ca Reviewed-on: https://go-review.googlesource.com/73992 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-31 13:20:41 +00:00
Russ Cox	94471f6324	runtime: shorten tests in all.bash This cuts 23 seconds from all.bash on my MacBook Pro. Change-Id: Ibc4d7c01660b9e9ebd088dd55ba993f0d7ec6aa3 Reviewed-on: https://go-review.googlesource.com/73991 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-31 13:20:27 +00:00
Russ Cox	2beb173e98	all: respect $GO_GCFLAGS during run.bash If the go install doesn't use the same flags as the main build it can overwrite the installed standard library, leading to flakiness and slow future tests. Force uses of 'go install' etc to propagate $GO_GCFLAGS or disable them entirely, to avoid problems. As I understand it, the main place this happens is the ssacheck builder. If there are other uses that need to run some of the now-disabled tests we can reenable fixed tests in followup CLs. Change-Id: Ib860a253539f402f8a96a3c00ec34f0bbf137c9a Reviewed-on: https://go-review.googlesource.com/74470 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2017-10-31 13:19:15 +00:00
Bill O'Farrell	7fff1db060	runtime: remove unnecessary sync from publicationBarrier on s390x Memory accesses on z are at least as ordered as they are on AMD64. Change-Id: Ia515430e571ebd07e9314de05c54dc992ab76b95 Reviewed-on: https://go-review.googlesource.com/74010 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Munday <mike.munday@ibm.com>	2017-10-30 23:42:27 +00:00
Austin Clements	877387e38a	runtime: use buffered write barrier for bulkBarrierPreWrite This modifies bulkBarrierPreWrite to use the buffered write barrier instead of the eager write barrier. This reduces the number of system stack switches and sanity checks by a factor of the buffer size (currently 256). This affects both typedmemmove and typedmemclr. Since this is purely a runtime change, it applies to all arches (unlike the pointer write barrier). name old time/op new time/op delta BulkWriteBarrier-12 7.33ns ± 6% 4.46ns ± 9% -39.10% (p=0.000 n=20+19) Updates #22460. Change-Id: I6a686a63bbf08be02b9b97250e37163c5a90cdd8 Reviewed-on: https://go-review.googlesource.com/73832 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-30 18:12:54 +00:00
Austin Clements	6a5f1e58ed	runtime: simplify and optimize typedslicecopy Currently, typedslicecopy meticulously performs a typedmemmove on every element of the slice. This probably used to be necessary because we only had an individual element's type, but now we use the heap bitmap, so we only need to know whether the type has any pointers and how big it is. Hence, this CL rewrites typedslicecopy to simply perform one bulk barrier and one memmove. This also has a side-effect of eliminating two unnecessary write barriers per slice element that were coming from updates to dstp and srcp, which were stored in the parent stack frame. However, most of the win comes from eliminating the loops. name old time/op new time/op delta BulkWriteBarrier-12 7.83ns ±10% 7.33ns ± 6% -6.45% (p=0.000 n=20+20) Updates #22460. Change-Id: Id3450e9f36cc8e0892f268319b136f0d8f5464b8 Reviewed-on: https://go-review.googlesource.com/73831 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-30 18:12:51 +00:00
Austin Clements	f96b95bcd1	runtime: benchmark for bulk write barriers This adds a benchmark of typedslicecopy and its bulk write barriers. For #22460. Change-Id: I439ca3b130bb22944468095f8f18b464e5bb43ca Reviewed-on: https://go-review.googlesource.com/74051 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-30 18:12:49 +00:00
Austin Clements	e9079a69f3	runtime: buffered write barrier implementation This implements runtime support for buffered write barriers on amd64. The buffered write barrier has a fast path that simply enqueues pointers in a per-P buffer. Unlike the current write barrier, this fast path is not a normal Go call and does not require the compiler to spill general-purpose registers or put arguments on the stack. When the buffer fills up, the write barrier takes the slow path, which spills all general purpose registers and flushes the buffer. We don't allow safe-points or stack splits while this frame is active, so it doesn't matter that we have no type information for the spilled registers in this frame. One minor complication is cgocheck=2 mode, which uses the write barrier to detect Go pointers being written to non-Go memory. We obviously can't buffer this, so instead we set the buffer to its minimum size, forcing the write barrier into the slow path on every call. For this specific case, we pass additional information as arguments to the flush function. This also requires enabling the cgo write barrier slightly later during runtime initialization, after Ps (and the per-P write barrier buffers) have been initialized. The code in this CL is not yet active. The next CL will modify the compiler to generate calls to the new write barrier. This reduces the average cost of the write barrier by roughly a factor of 4, which will pay for the cost of having it enabled more of the time after we make the GC pacer less aggressive. (Benchmarks will be in the next CL.) Updates #14951. Updates #22460. Change-Id: I396b5b0e2c5e5c4acfd761a3235fd15abadc6cb1 Reviewed-on: https://go-review.googlesource.com/73711 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-30 18:12:44 +00:00
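The fast path and slow path described in the commit above can be sketched as a fixed-size buffer that is flushed when it fills. This is a toy model, not the runtime's implementation: `wbBuf`, `put`, and `flush` are hypothetical names, the real fast path is assembly rather than a Go call, and the size of 256 is taken from the bulkBarrierPreWrite commit earlier in this log.

```go
package main

import "fmt"

// bufSize follows the buffer size quoted elsewhere in this log.
const bufSize = 256

// wbBuf is a toy stand-in for a per-P write barrier buffer.
type wbBuf struct {
	buf     [bufSize]uintptr
	n       int // number of buffered pointers
	flushes int // how many times the slow path ran
}

// put records a pointer. The fast path only appends to the buffer;
// the slow path runs when the buffer becomes full.
func (b *wbBuf) put(p uintptr) {
	b.buf[b.n] = p
	b.n++
	if b.n == bufSize {
		b.flush()
	}
}

// flush empties the buffer. A real collector would shade the
// buffered pointers here; this sketch only counts the call.
func (b *wbBuf) flush() {
	b.n = 0
	b.flushes++
}

func main() {
	var b wbBuf
	for i := 0; i < 1000; i++ {
		b.put(uintptr(i))
	}
	fmt.Println(b.flushes, b.n)
}
```

In this model the expensive work happens once per 256 writes rather than on every write, which is the amortization the commit relies on.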
Austin Clements	1e8ab99b37	runtime: add benchmark for write barriers For #22460. Change-Id: I798f26d45bbe1efd16b632e201413cb26cb3e6c7 Reviewed-on: https://go-review.googlesource.com/73811 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-30 18:12:41 +00:00
Austin Clements	15d6ab69fb	runtime: make systemstack tail call if already switched Currently systemstack always calls its argument, even if we're already on the system stack. Unfortunately, traceback with _TraceJump stops at the first systemstack it sees, which often cuts off runtime stacks early in profiles. Fix this by performing a tail call if we're already on the system stack. This eliminates it from the traceback entirely, so it won't stop prematurely (or all get mushed into a single node in the profile graph). Change-Id: Ibc69e8765e899f8d3806078517b8c7314da196f4 Reviewed-on: https://go-review.googlesource.com/74050 Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org>	2017-10-30 16:33:55 +00:00
Lynn Boger	58de9f3583	runtime: use -buildmode=pie in testCgoPprofPIE instead of -extldflags=-pie Errors occur in runtime test testCgoPprofPIE when the test is built by passing -pie to the external linker with code that was not built as PIC. This occurs on ppc64le because non-PIC is the default, and fails only on newer distros where the address range used for programs is high enough to cause relocation overflow. This test should be built with -buildmode=pie since that correctly generates PIC with -pie. Related issues are #21954 and #22126. Updates #22459 Change-Id: Ib641440bc9f94ad2b97efcda14a4b482647be8f7 Reviewed-on: https://go-review.googlesource.com/73970 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-30 12:59:31 +00:00
Austin Clements	164e1b8477	runtime: eliminate remaining recordspan write barriers recordspan has two remaining write barriers from writing to the pointer to the backing store of h.allspans. However, h.allspans is always backed by off-heap memory, so let the compiler know this. Unfortunately, this isn't quite as clean as most go:notinheap uses because we can't directly name the backing store of a slice, but we can get it done with some judicious casting. For #22460. Change-Id: I296f92fa41cf2cb6ae572b35749af23967533877 Reviewed-on: https://go-review.googlesource.com/73414 Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-29 20:22:00 +00:00
Austin Clements	3526d8031a	runtime: allow write barriers in gchelper We're about to start tracking nowritebarrierrec through systemstack calls, which detects that we're calling markroot (which has write barriers) from gchelper, which is called from the scheduler during STW apparently without a P. But it turns out that func helpgc, which wakes up blocked Ms to run gchelper, installs a P for gchelper to use. This means there is a P when gchelper runs, so it is allowed to have write barriers. Tell the compiler this by marking gchelper go:yeswritebarrierrec. Also, document the call to gchelper so I don't have to spend another half a day puzzling over how on earth this could possibly work before discovering the spooky action-at-a-distance in helpgc. Updates #22384. For #22460. Change-Id: I7394c9b4871745575f87a2d4fbbc5b8e54d669f7 Reviewed-on: https://go-review.googlesource.com/72772 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-29 17:56:21 +00:00
Austin Clements	d941b07558	runtime: eliminate write barriers from persistentalloc We're about to start tracking nowritebarrierrec through systemstack calls, which will reveal write barriers in persistentalloc prohibited by various callers. The pointers manipulated by persistentalloc are always to off-heap memory, so this removes these write barriers statically by introducing a new go:notinheap type to represent generic off-heap memory. Updates #22384. For #22460. Change-Id: Id449d9ebf145b14d55476a833e7f076b0d261d57 Reviewed-on: https://go-review.googlesource.com/72771 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-29 17:56:18 +00:00
Austin Clements	070cc8eb02	runtime: allow write barriers in startpanic_m We're about to start tracking nowritebarrierrec through systemstack calls, which will reveal write barriers in startpanic_m prohibited by various callers. We actually can allow write barriers here because the write barrier is a no-op when we're panicking. Let the compiler know. Updates #22384. For #22460. Change-Id: Ifb3a38d3dd9a4125c278c3680f8648f987a5b0b8 Reviewed-on: https://go-review.googlesource.com/72770 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-29 17:56:14 +00:00
Austin Clements	249b5cc945	runtime: mark gcWork methods nowritebarrierrec Currently most of these are marked go:nowritebarrier as a hint, but it's actually important that these not invoke write barriers recursively. The danger is that some gcWork method would invoke the write barrier while the gcWork is in an inconsistent state and that the write barrier would in turn invoke some other gcWork method, which would crash or permanently corrupt the gcWork. Simply marking the write barrier itself as go:nowritebarrierrec isn't sufficient to prevent this if the write barrier doesn't use the outer method. Thankfully, this doesn't cause any build failures, so we were getting this right. :) For #22460. Change-Id: I35a7292a584200eb35a49507cd3fe359ba2206f6 Reviewed-on: https://go-review.googlesource.com/72554 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-29 17:56:12 +00:00
Austin Clements	3beaf26e4f	runtime: remove write barriers from newstack, gogo Currently, newstack and gogo have write barriers for maintaining the context register saved in g.sched.ctxt. This is troublesome, because newstack can be called from go:nowritebarrierrec places that can't allow write barriers. It happens to be benign because g.sched.ctxt will always be nil on entry to newstack and it so happens the incoming ctxt will also always be nil in these contexts (I think/hope), but this is playing with fire. It's also desirable to mark newstack go:nowritebarrierrec to prevent any other, non-benign write barriers from creeping in, but we can't do that right now because of this one write barrier. Fix all of this by observing that g.sched.ctxt is really just a saved live pointer register. Hence, we can shade it when we scan g's stack and otherwise move it back and forth between the actual context register and g.sched.ctxt without write barriers. This means we can save it in morestack along with all of the other g.sched, eliminate the save from newstack along with its troublesome write barrier, and eliminate the shenanigans in gogo to invoke the write barrier when restoring it. Once we've done all of this, we can mark newstack go:nowritebarrierrec. Fixes #22385. For #22460. Change-Id: I43c24958e3f6785b53c1350e1e83c2844e0d1522 Reviewed-on: https://go-review.googlesource.com/72553 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-10-29 17:56:08 +00:00
Austin Clements	da95254d1a	runtime: "fix" non-preemptible loop in TestParallelRWMutexReaders TestParallelRWMutexReaders has a non-preemptible loop in it that can deadlock if GC triggers. "Fix" it like we've fixed similar tests. Updates #10958. Change-Id: I13618f522f5ef0c864e7171ad2f655edececacd7 Reviewed-on: https://go-review.googlesource.com/73710 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-26 20:38:48 +00:00
Russ Cox	986582126a	runtime: avoid monotonic time zero on systems with low-res timers Otherwise low-res timers cause problems at call sites that expect to be able to use 0 as meaning "no time set" and therefore expect that nanotime never returns 0 itself. For example, sched.lastpoll == 0 means no last poll. Fixes #22394. Change-Id: Iea28acfddfff6f46bc90f041ec173e0fea591285 Reviewed-on: https://go-review.googlesource.com/73410 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-10-25 17:10:20 +00:00
Wei Xiao	78ddf2741f	bytes: add optimized Equal for arm64 Use SIMD instructions when comparing chunks bigger than 16 bytes. Benchmark results of bytes: name old time/op new time/op delta Equal/0-8 6.52ns ± 1% 5.51ns ± 0% -15.43% (p=0.000 n=8+9) Equal/1-8 11.5ns ± 0% 10.5ns ± 0% -8.70% (p=0.000 n=10+10) Equal/6-8 19.0ns ± 0% 13.5ns ± 0% -28.95% (p=0.000 n=10+10) Equal/9-8 31.0ns ± 0% 13.5ns ± 0% -56.45% (p=0.000 n=10+10) Equal/15-8 40.0ns ± 0% 15.5ns ± 0% -61.25% (p=0.000 n=10+10) Equal/16-8 41.5ns ± 0% 14.5ns ± 0% -65.06% (p=0.000 n=10+10) Equal/20-8 47.5ns ± 0% 17.0ns ± 0% -64.21% (p=0.000 n=10+10) Equal/32-8 65.6ns ± 0% 17.0ns ± 0% -74.09% (p=0.000 n=10+10) Equal/4K-8 6.17µs ± 0% 0.57µs ± 1% -90.76% (p=0.000 n=10+10) Equal/4M-8 6.41ms ± 0% 1.11ms ±14% -82.71% (p=0.000 n=8+10) Equal/64M-8 104ms ± 0% 33ms ± 0% -68.64% (p=0.000 n=10+10) EqualPort/1-8 13.0ns ± 0% 13.0ns ± 0% ~ (all equal) EqualPort/6-8 22.0ns ± 0% 22.7ns ± 0% +3.06% (p=0.000 n=8+9) EqualPort/32-8 78.1ns ± 0% 78.1ns ± 0% ~ (all equal) EqualPort/4K-8 7.54µs ± 0% 7.61µs ± 0% +0.92% (p=0.000 n=10+8) EqualPort/4M-8 8.16ms ± 2% 8.05ms ± 1% -1.31% (p=0.023 n=10+10) EqualPort/64M-8 142ms ± 0% 142ms ± 0% +0.37% (p=0.000 n=10+10) CompareBytesEqual-8 39.0ns ± 0% 41.6ns ± 2% +6.67% (p=0.000 n=9+10) name old speed new speed delta Equal/1-8 86.9MB/s ± 0% 95.2MB/s ± 0% +9.53% (p=0.000 n=8+8) Equal/6-8 315MB/s ± 0% 444MB/s ± 0% +40.74% (p=0.000 n=9+10) Equal/9-8 290MB/s ± 0% 666MB/s ± 0% +129.63% (p=0.000 n=8+10) Equal/15-8 375MB/s ± 0% 967MB/s ± 0% +158.09% (p=0.000 n=10+10) Equal/16-8 385MB/s ± 0% 1103MB/s ± 0% +186.24% (p=0.000 n=10+9) Equal/20-8 421MB/s ± 0% 1175MB/s ± 0% +179.44% (p=0.000 n=9+10) Equal/32-8 488MB/s ± 0% 1881MB/s ± 0% +285.34% (p=0.000 n=10+8) Equal/4K-8 664MB/s ± 0% 7181MB/s ± 1% +981.32% (p=0.000 n=10+10) Equal/4M-8 654MB/s ± 0% 3822MB/s ±16% +484.15% (p=0.000 n=8+10) Equal/64M-8 645MB/s ± 0% 2056MB/s ± 0% +218.90% (p=0.000 n=10+10) EqualPort/1-8 76.8MB/s ± 0% 76.7MB/s ± 0% -0.09% (p=0.023 
n=10+10) EqualPort/6-8 272MB/s ± 0% 264MB/s ± 0% -2.94% (p=0.000 n=8+10) EqualPort/32-8 410MB/s ± 0% 410MB/s ± 0% +0.01% (p=0.004 n=9+10) EqualPort/4K-8 543MB/s ± 0% 538MB/s ± 0% -0.91% (p=0.000 n=9+9) EqualPort/4M-8 514MB/s ± 2% 521MB/s ± 1% +1.31% (p=0.023 n=10+10) EqualPort/64M-8 473MB/s ± 0% 472MB/s ± 0% -0.37% (p=0.000 n=10+10) Benchmark results of go1: name old time/op new time/op delta BinaryTree17-8 6.53s ± 0% 6.52s ± 2% ~ (p=0.286 n=4+5) Fannkuch11-8 6.35s ± 1% 6.33s ± 0% ~ (p=0.690 n=5+5) FmtFprintfEmpty-8 108ns ± 1% 99ns ± 1% -8.31% (p=0.008 n=5+5) FmtFprintfString-8 172ns ± 1% 188ns ± 0% +9.43% (p=0.016 n=5+4) FmtFprintfInt-8 207ns ± 0% 202ns ± 0% -2.42% (p=0.008 n=5+5) FmtFprintfIntInt-8 277ns ± 1% 271ns ± 1% -2.02% (p=0.008 n=5+5) FmtFprintfPrefixedInt-8 386ns ± 0% 380ns ± 0% -1.55% (p=0.008 n=5+5) FmtFprintfFloat-8 492ns ± 0% 494ns ± 1% ~ (p=0.175 n=4+5) FmtManyArgs-8 1.32µs ± 1% 1.31µs ± 2% ~ (p=0.651 n=5+5) GobDecode-8 16.8ms ± 2% 16.9ms ± 1% ~ (p=0.310 n=5+5) GobEncode-8 14.1ms ± 1% 14.1ms ± 1% ~ (p=1.000 n=5+5) Gzip-8 788ms ± 0% 789ms ± 0% ~ (p=0.548 n=5+5) Gunzip-8 83.6ms ± 0% 83.6ms ± 0% ~ (p=0.548 n=5+5) HTTPClientServer-8 120µs ± 0% 120µs ± 1% ~ (p=0.690 n=5+5) JSONEncode-8 33.2ms ± 0% 33.6ms ± 0% +1.20% (p=0.008 n=5+5) JSONDecode-8 152ms ± 1% 146ms ± 1% -3.70% (p=0.008 n=5+5) Mandelbrot200-8 10.0ms ± 0% 10.0ms ± 0% ~ (p=0.151 n=5+5) GoParse-8 7.97ms ± 0% 8.06ms ± 0% +1.15% (p=0.008 n=5+5) RegexpMatchEasy0_32-8 233ns ± 1% 239ns ± 4% ~ (p=0.135 n=5+5) RegexpMatchEasy0_1K-8 1.86µs ± 0% 1.86µs ± 0% ~ (p=0.167 n=5+5) RegexpMatchEasy1_32-8 250ns ± 0% 263ns ± 1% +5.28% (p=0.008 n=5+5) RegexpMatchEasy1_1K-8 2.28µs ± 0% 2.13µs ± 0% -6.64% (p=0.000 n=4+5) RegexpMatchMedium_32-8 332ns ± 1% 319ns ± 0% -3.97% (p=0.008 n=5+5) RegexpMatchMedium_1K-8 85.5µs ± 2% 79.1µs ± 1% -7.42% (p=0.008 n=5+5) RegexpMatchHard_32-8 4.34µs ± 1% 4.42µs ± 7% ~ (p=0.881 n=5+5) RegexpMatchHard_1K-8 130µs ± 1% 127µs ± 0% -2.18% (p=0.008 n=5+5) Revcomp-8 1.35s ± 1% 1.34s ± 0% 
-0.58% (p=0.016 n=5+4) Template-8 160ms ± 2% 158ms ± 1% ~ (p=0.222 n=5+5) TimeParse-8 795ns ± 2% 772ns ± 2% -2.87% (p=0.024 n=5+5) TimeFormat-8 782ns ± 0% 784ns ± 0% ~ (p=0.198 n=5+5) name old speed new speed delta GobDecode-8 45.8MB/s ± 2% 45.5MB/s ± 1% ~ (p=0.310 n=5+5) GobEncode-8 54.3MB/s ± 1% 54.4MB/s ± 1% ~ (p=0.984 n=5+5) Gzip-8 24.6MB/s ± 0% 24.6MB/s ± 0% ~ (p=0.540 n=5+5) Gunzip-8 232MB/s ± 0% 232MB/s ± 0% ~ (p=0.548 n=5+5) JSONEncode-8 58.4MB/s ± 0% 57.7MB/s ± 0% -1.19% (p=0.008 n=5+5) JSONDecode-8 12.8MB/s ± 1% 13.3MB/s ± 1% +3.85% (p=0.008 n=5+5) GoParse-8 7.27MB/s ± 0% 7.18MB/s ± 0% -1.13% (p=0.008 n=5+5) RegexpMatchEasy0_32-8 137MB/s ± 1% 134MB/s ± 4% ~ (p=0.151 n=5+5) RegexpMatchEasy0_1K-8 551MB/s ± 0% 550MB/s ± 0% ~ (p=0.222 n=5+5) RegexpMatchEasy1_32-8 128MB/s ± 0% 121MB/s ± 1% -5.09% (p=0.008 n=5+5) RegexpMatchEasy1_1K-8 449MB/s ± 0% 481MB/s ± 0% +7.12% (p=0.016 n=4+5) RegexpMatchMedium_32-8 3.00MB/s ± 0% 3.13MB/s ± 0% +4.33% (p=0.016 n=4+5) RegexpMatchMedium_1K-8 12.0MB/s ± 2% 12.9MB/s ± 1% +7.98% (p=0.008 n=5+5) RegexpMatchHard_32-8 7.38MB/s ± 1% 7.25MB/s ± 7% ~ (p=0.952 n=5+5) RegexpMatchHard_1K-8 7.88MB/s ± 1% 8.05MB/s ± 0% +2.21% (p=0.008 n=5+5) Revcomp-8 188MB/s ± 1% 189MB/s ± 0% +0.58% (p=0.016 n=5+4) Template-8 12.2MB/s ± 2% 12.3MB/s ± 1% ~ (p=0.183 n=5+5) Change-Id: I65e79f3f8f8b2914678311c4f1b0a2d98459e220 Reviewed-on: https://go-review.googlesource.com/71110 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com>	2017-10-25 14:37:25 +00:00
Tobias Klauser	0c68b79e9c	runtime/internal/sys: use boolean constants for sys.BigEndian The BigEndian constant is only used in boolean context so assign it boolean constants. Change-Id: If19d61dd71cdfbffede1d98b401f11e6535fba59 Reviewed-on: https://go-review.googlesource.com/73270 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-25 14:22:53 +00:00
Ian Lance Taylor	d92aaa9707	runtime: unify arm entry point code Change-Id: Id51a2d63f7199b3ff71cedd415345ad20e5bd981 Reviewed-on: https://go-review.googlesource.com/70791 Reviewed-by: Austin Clements <austin@google.com>	2017-10-25 00:40:40 +00:00
Alex Brainman	4a0dcc2de1	runtime: make errno positive in netpollopen Make netpollopen return what the Windows GetLastError API returns. The negative value is probably a copy/paste error from a long time ago. Change-Id: I28f78718c15fef3e8b5f5d11a259533d7e9c6185 Reviewed-on: https://go-review.googlesource.com/72592 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>	2017-10-24 03:19:09 +00:00
Hugues Bruant	e769c9d6cf	runtime: more reliable mapdelete benchmark Increasing the map size with the benchmark iteration count introduced non-linearities and made benchmark runs slow when increasing benchtime. Rework the benchmark to use a map size independent of the iteration count and instead re-fill it when it becomes empty. Fixes #21546 Change-Id: Iafb6eb225e81830263f30b3aba0d449c361aec32 Reviewed-on: https://go-review.googlesource.com/57650 Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-10-21 22:48:07 +00:00
Cherry Zhang	6fd1f825c1	runtime: support cgo traceback on PPC64LE Code essentially mirrors AMD64 implementation. Change-Id: I39f7f099ce11fdc3772df039998cc11947bb22a2 Reviewed-on: https://go-review.googlesource.com/72270 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-21 00:31:27 +00:00
Ian Lance Taylor	23aad448b1	runtime: for kqueue treat EVFILT_READ with EV_EOF as permitting a write On systems that use kqueue, we always register descriptors for both EVFILT_READ and EVFILT_WRITE. On at least FreeBSD and OpenBSD, when the write end of a pipe is registered for EVFILT_READ and EVFILT_WRITE events, and the read end of the pipe is closed, kqueue reports an EVFILT_READ event with EV_EOF set, but does not report an EVFILT_WRITE event. Since the write to the pipe is waiting for an EVFILT_WRITE event, closing the read end of a pipe can cause the write end to hang rather than attempt another write which will fail with EPIPE. Fix this by treating EVFILT_READ with EV_EOF set as making both reads and writes ready to proceed. The real test for this is in CL 71770, which tests using various timeouts with pipes. Updates #22114 Change-Id: Ib23fbaaddbccd8eee77bdf18f27a7f0aa50e2742 Reviewed-on: https://go-review.googlesource.com/71973 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-10-20 22:26:30 +00:00
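The rule described in 23aad448b1 can be sketched as a small classification function. This is an illustration under stated assumptions: the `kevent` struct and `readyMode` are hypothetical simplifications, and the constant values mirror common BSD headers but should be treated as placeholders rather than the runtime's definitions.

```go
package main

import "fmt"

// Illustrative constants; real values come from the OS headers.
const (
	evfiltRead  = -1
	evfiltWrite = -2
	evEOF       = 0x8000
)

// kevent is a hypothetical, trimmed-down event record.
type kevent struct {
	filter int16
	flags  uint16
}

// readyMode reports which directions an event makes ready.
// A read event with EV_EOF set also unblocks writers, so a pending
// write can retry and fail with EPIPE instead of hanging.
func readyMode(ev kevent) (read, write bool) {
	switch ev.filter {
	case evfiltRead:
		read = true
		if ev.flags&evEOF != 0 {
			write = true
		}
	case evfiltWrite:
		write = true
	}
	return read, write
}

func main() {
	r, w := readyMode(kevent{filter: evfiltRead, flags: evEOF})
	fmt.Println(r, w)
}
```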
Austin Clements	193088b246	runtime: separate error result for mmap Currently mmap returns an unsafe.Pointer that encodes OS errors as values less than 4096. In practice this is okay, but it borders on being really unsafe: for example, the value has to be checked immediately after return and if stack copying were ever to observe such a value, it would panic. It's also not remotely idiomatic. Fix this by making mmap return a separate pointer value and error, like a normal Go function. Updates #22218. Change-Id: Iefd965095ffc82cc91118872753a5d39d785c3a6 Reviewed-on: https://go-review.googlesource.com/71270 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-18 19:22:08 +00:00
Ian Lance Taylor	48754592e0	runtime: align stack in 386 lib startup before calling C function Fixes Darwin 386 build. It turns out that the Darwin pthread_create function saves the SSE registers, and therefore requires an aligned stack. This worked before https://golang.org/cl/70530 because the stack sizes were chosen to leave the stack aligned. Change-Id: I911a9e8dcde4e41e595d5ef9b9a1ca733e154de6 Reviewed-on: https://go-review.googlesource.com/71432 Reviewed-by: Robert Griesemer <gri@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-10-18 16:58:14 +00:00
David du Colombier	f4faca6013	runtime: don't terminate locked OS threads on Plan 9 CL 46037 and CL 46038 implemented termination of locked OS threads when the goroutine exits. However, this behavior leads to crashes of Go programs using runtime.LockOSThread on Plan 9. This is notably the case of the os/exec and net packages. This change disables termination of locked OS threads on Plan 9. Updates #22227. Change-Id: If9fa241bff1c0b68e7e9e321e06e5203b3923212 Reviewed-on: https://go-review.googlesource.com/71230 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-10-17 15:15:12 +00:00
David du Colombier	d155b32f8d	runtime: disable use of template thread on Plan 9 CL 46033 added a "template thread" mechanism to allow creation of threads with a known-good state from a thread of unknown state. However, we are experiencing issues on Plan 9 with programs using the os/exec and net packages. These packages rely on runtime.LockOSThread. Updates #22227. Change-Id: I85b71580a41df9fe8b24bd8623c064b6773288b0 Reviewed-on: https://go-review.googlesource.com/70231 Run-TryBot: David du Colombier <0intro@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-10-17 15:15:07 +00:00
Wei Xiao	18508740b9	reflect: optimize CALLFN wrapper for arm64 Optimize arm64 CALLFN wrapper with LDP/STP instructions. This provides a significant speedup for big argument copy. Benchmark results for reflect: name old time/op new time/op delta Call-8 79.0ns ± 4% 73.6ns ± 4% -6.78% (p=0.000 n=10+10) CallArgCopy/size=128-8 80.5ns ± 0% 60.3ns ± 0% -25.06% (p=0.000 n=10+9) CallArgCopy/size=256-8 119ns ± 2% 67ns ± 1% -43.59% (p=0.000 n=8+10) CallArgCopy/size=1024-8 524ns ± 1% 99ns ± 1% -81.03% (p=0.000 n=10+10) CallArgCopy/size=4096-8 837ns ± 0% 231ns ± 1% -72.42% (p=0.000 n=9+9) CallArgCopy/size=65536-8 13.6µs ± 6% 3.1µs ± 1% -77.38% (p=0.000 n=10+10) PtrTo-8 12.9ns ± 0% 13.1ns ± 3% +1.86% (p=0.000 n=10+10) FieldByName1-8 28.7ns ± 2% 28.6ns ± 2% ~ (p=0.408 n=9+10) FieldByName2-8 928ns ± 4% 946ns ± 8% ~ (p=0.326 n=9+10) FieldByName3-8 5.35µs ± 5% 5.32µs ± 5% ~ (p=0.755 n=10+10) InterfaceBig-8 2.57ns ± 0% 2.57ns ± 0% ~ (all equal) InterfaceSmall-8 2.57ns ± 0% 2.57ns ± 0% ~ (all equal) New-8 9.09ns ± 1% 8.83ns ± 1% -2.81% (p=0.000 n=10+9) name old alloc/op new alloc/op delta Call-8 0.00B 0.00B ~ (all equal) name old allocs/op new allocs/op delta Call-8 0.00 0.00 ~ (all equal) name old speed new speed delta CallArgCopy/size=128-8 1.59GB/s ± 0% 2.12GB/s ± 1% +33.46% (p=0.000 n=10+9) CallArgCopy/size=256-8 2.14GB/s ± 2% 3.81GB/s ± 1% +78.02% (p=0.000 n=8+10) CallArgCopy/size=1024-8 1.95GB/s ± 1% 10.30GB/s ± 0% +427.99% (p=0.000 n=10+9) CallArgCopy/size=4096-8 4.89GB/s ± 0% 17.69GB/s ± 1% +261.87% (p=0.000 n=9+9) CallArgCopy/size=65536-8 4.84GB/s ± 6% 21.36GB/s ± 1% +341.67% (p=0.000 n=10+10) Change-Id: I775d88b30c43cb2eda1d0612ac15e6d283e70beb Reviewed-on: https://go-review.googlesource.com/70570 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-10-17 12:55:17 +00:00
Ian Lance Taylor	378de1ae43	runtime: unify 386 entry point code Unify the 386 entry point code as much as possible. The main function could not be unified because on Windows 386 it is called _main. Putting main in asm_386.s caused multiple definition errors when using the external linker. Add the _lib entry point to various operating systems. A future CL will enable c-archive/c-shared mode for those targets. Fix _rt0_386_windows_lib_go--it was passing arguments as though it were amd64. Change-Id: Ic73f1c95cdbcbea87f633f4a29bbc218a5db4f58 Reviewed-on: https://go-review.googlesource.com/70530 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-10-17 04:03:16 +00:00
Alessandro Arzilli	913fb18e7e	runtime/cgo: declare crosscall2 frame using TEXT for amd64 and 386 Use TEXT pseudo-instruction to adjust SP instead of a SUB instruction so that the assembler knows how to fill in the pcsp table and the frame description entry correctly. Updates #21569 Change-Id: I436c840b2af99bbb3042ecd38a7d7c1ab4d7372a Reviewed-on: https://go-review.googlesource.com/70937 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-16 21:17:25 +00:00
Ian Lance Taylor	b79e99bfb4	runtime: remove commented out code from ARM Linux boot The code was commented out by https://golang.org/cl/13234050 in 2013. Let's just remove it. Change-Id: I46ae1f07386719e991458e782d236214c40bdce1 Reviewed-on: https://go-review.googlesource.com/70770 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-10-16 21:12:48 +00:00
Ian Lance Taylor	5ddd3d588c	runtime: fix use of STREX in various exitThread implementations STREX does not permit using the same register for the value to store and the place where the result is returned. Also the code was wrong anyhow if the first store failed. Fixes #22248 Change-Id: I96013497410058514ffcb771c76c86faa1ec559b Reviewed-on: https://go-review.googlesource.com/70911 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-10-16 17:15:39 +00:00
Austin Clements	e09dbaa1de	runtime: schedule fractional workers on all Ps Currently only a single P can run a fractional mark worker at a time. This doesn't let us spread out the load, so it gets concentrated on whatever unlucky P picks up the token to run a fractional worker. This can significantly delay goroutines on that P. This commit changes this scheduling rule so each P separately schedules fractional workers. This can significantly reduce the load on any individual P and allows workers to self-preempt earlier. It does have the downside that it's possible for all Ps to be in fractional workers simultaneously (an effective STW). Updates #21698. Change-Id: Ia1e300c422043fa62bb4e3dd23c6232d81e4419c Reviewed-on: https://go-review.googlesource.com/68574 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-13 20:53:22 +00:00
Austin Clements	28e1a8e47a	runtime: preempt fractional worker after reaching utilization goal Currently fractional workers run until preempted by the scheduler, which means they typically run for 20ms. During this time, all other goroutines on that P are blocked, which can introduce significant latency variance. This modifies fractional workers to self-preempt shortly after achieving the fractional utilization goal. In practice this means they preempt much sooner, and the scale of their preemption is on the order of how often the user goroutines block (so, if the application is compute-bound, the fractional workers will also run for long times, but if the application blocks frequently, the fractional workers will also preempt quickly). Fixes #21698. Updates #18534. Change-Id: I03a5ab195dae93154a46c32083c4bb52415d2017 Reviewed-on: https://go-review.googlesource.com/68573 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-13 20:53:13 +00:00
Austin Clements	b783930e63	runtime: simplify fractional mark worker scheduler We haven't used non-zero gcForcePreemptNS for ages. Remove it and declutter the code. Change-Id: Id5cc62f526d21ca394d2b6ca17d34a72959535da Reviewed-on: https://go-review.googlesource.com/68572 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-13 20:53:03 +00:00
Austin Clements	315c28b788	runtime: use only dedicated mark workers at reasonable GOMAXPROCS When GOMAXPROCS is not small, fractional workers don't add much to throughput, but they do add to the latency of individual goroutines. In this case, it makes sense to just use dedicated workers, even if we can't exactly hit the 25% CPU goal with dedicated workers. This implements this logic by computing the number of dedicated mark workers that will get us closest to the 25% target. We only fall back to fractional workers if that would be more than 30% off of the target (less than 17.5% or more than 32.5%, which in practice happens for GOMAXPROCS <= 3 and GOMAXPROCS == 6). Updates #21698. Change-Id: I484063adeeaa1190200e4ef210193a20e635d552 Reviewed-on: https://go-review.googlesource.com/68571 Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-13 20:52:55 +00:00
Austin Clements	27923482fa	runtime: separate GC background utilization from goal utilization Currently these are the same constant, but are separate concepts. Split them into two constants for easier experimentation and better documentation. Change-Id: I121854d4fd1a4a827f727c8e5153160c24aacda7 Reviewed-on: https://go-review.googlesource.com/68570 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-13 20:52:45 +00:00
Frank Somers	af40cbe83c	runtime: use vDSO on linux/386 to improve time.Now performance This change adds support for accelerating time.Now by using the __vdso_clock_gettime fast-path via the vDSO on linux/386 if it is available. When the vDSO path to the clocks is available, it is typically 5x-10x faster than the syscall path (see benchmark extract below). Two such calls are made for each time.Now() call on most platforms as of go 1.9. - Add vdso_linux_386.go, containing the ELF32 definitions for use by vdso_linux.go, the maximum array size, and the symbols to be located in the vDSO. - Modify runtime.walltime and runtime.nanotime to check for and use the vDSO fast-path if available, or fall back to the existing syscall path. - Reduce the stack reservations for runtime.walltime and runtime.monotime from 32 to 16 bytes. It appears the syscall path actually only needed 8 bytes, but 16 is now needed to cover the syscall and vDSO paths. - Remove clearing DX from the syscall paths as clock_gettime only takes 2 args (BX, CX in syscall calling convention), so there should be no need to clear DX. The included BenchmarkTimeNow was run with -cpu=1 -count=20 on an "Intel(R) Celeron(R) CPU J1900 @ 1.99GHz", comparing released go 1.9.1 vs this change. This shows a gain in performance on linux/386 (6.89x), and that no regression occurred on linux/amd64 due to this change. Kernel: linux/i686, GOOS=linux GOARCH=386 name old time/op new time/op delta TimeNow 978ns ± 0% 142ns ± 0% -85.48% (p=0.000 n=16+20) Kernel: linux/x86_64, GOOS=linux GOARCH=amd64 name old time/op new time/op delta TimeNow 125ns ± 0% 125ns ± 0% ~ (all equal) Gains are more dramatic in virtualized environments, presumably due to the overhead of virtualizing the syscall. 
Fixes #22190 Change-Id: I2f83ce60cb1b8b310c9ced0706bb463c1b3aedf8 Reviewed-on: https://go-review.googlesource.com/69390 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-13 14:41:04 +00:00
Frank Somers	c14dcfda6b	runtime: factor amd64 specifics from vdso_linux.go This is a preparation step for adding vDSO support on linux/386. This change relocates the elf64 and amd64 specifics from vdso_linux.go to a new vdso_linux_amd64.go. This should enable vdso_linux.go to be used for vDSO support on linux architectures other than amd64. - Relocate the elf64X structure definitions appropriate to amd64, and change their names to elfX so that the code in vdso_linux.go is ELFnn-agnostic. - Relocate the sym_keys and corresponding __vdso_* variables appropriate to amd64. - Provide an amd64-specific constant for the maximum byte size of an array, and use this in vdso_linux.go to compute constants for sizing the elf structure arrays traversed in the loaded vDSO. Change-Id: I1edb4e4ec9f2d79b7533aa95fbd09f771fa4edef Reviewed-on: https://go-review.googlesource.com/69391 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-13 02:04:20 +00:00
David Crawshaw	c58b98b2d6	cmd/link, runtime: put hasmain bit in moduledata Currently we look to see if the main.main symbol address is in the module data text range. This requires access to the main.main symbol, which usually the runtime has, but does not when building a plugin. To avoid a dynamic relocation to main.main (which I haven't worked out how to have the linker generate on darwin), stop using the symbol. Instead record a boolean in the moduledata if the module has the main function. Fixes #22175 Change-Id: If313a118f17ab499d0a760bbc2519771ed654530 Reviewed-on: https://go-review.googlesource.com/69370 Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-13 01:13:33 +00:00
Austin Clements	77c27c3102	cmd/link: eliminate .debug_aranges The .debug_aranges section is an odd vestige of DWARF, since its contents are easy and efficient for a debugger to reconstruct from the attributes of the top-level compilation unit DIEs. Neither GCC nor clang emit it by default these days. GDB and Delve ignore it entirely. LLDB will use it if present, but is happy to construct the index from the compilation unit attributes (and, indeed, a remarkable variety of other ways if those aren't available either). We're about to split up the compilation units by package, which means they'll have discontiguous PC ranges, which is going to make .debug_aranges harder to construct (and larger). Rather than try to maintain this essentially unused code, let's simplify things and remove it. Change-Id: I8e0ccc033b583b5b8908cbb2c879b2f2d5f9a50b Reviewed-on: https://go-review.googlesource.com/69972 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: Than McIntosh <thanm@google.com>	2017-10-12 18:56:18 +00:00
Elias Naur	764a6ac29e	runtime: don't restore the alternate signal stack on ios The alternative signal stack doesn't work on ios, so the setup of the alternative stack was skipped. The corresponding unminitSignals was effectively a no-op on ios until CL 70130. Skip unminitSignals on ios to restore the previous behaviour. For the ios builders. Change-Id: I5692ca7f5997e6b9d10cc5f2383a5a37c42b133c Reviewed-on: https://go-review.googlesource.com/70270 Run-TryBot: Elias Naur <elias.naur@gmail.com> Reviewed-by: Austin Clements <austin@google.com>	2017-10-12 16:59:32 +00:00
Austin Clements	58c7b1d160	runtime: fix dragonfly/amd64 CL 69292 unified the amd64 entry-points, but Dragonfly doesn't follow the same entry-point argument conventions as most other amd64 platforms. Fix the Dragonfly entry point. Change-Id: I0f84e2e4101ce68217af185ee9baaf455b8b6dad Reviewed-on: https://go-review.googlesource.com/70212 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-12 04:03:50 +00:00
David du Colombier	926373ea79	runtime: fix crash on Plan 9 Since CL 46037, the runtime is crashing after calling exitThread on Plan 9. The exitThread function shouldn't be called on Plan 9, because the system manages thread stacks. Fixes #22221. Change-Id: I5d61c9660a87dc27e4cfcb3ca3ddcb4b752f2397 Reviewed-on: https://go-review.googlesource.com/70190 Run-TryBot: David du Colombier <0intro@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-10-12 00:11:33 +00:00

3346 Commits