1
0
mirror of https://github.com/golang/go synced 2024-11-08 03:36:12 -07:00
Commit Graph

3346 Commits

Author SHA1 Message Date
Cherry Zhang
1b6fec862c sync/atomic: redirect many functions to runtime/internal/atomic
The implementation of atomics are inherently tricky. It would
be good to have them implemented in a single place, instead of
multiple copies.

Mostly a simple redirect.

On 386, some functions in sync/atomic have better implementations,
which are moved to runtime/internal/atomic.

On ARM, some functions in sync/atomic have better implementations.
They are dropped by this CL, but restored with an improved
version in a follow-up CL. On linux/arm, 64-bit CAS kernel helper
is dropped, as we're trying to move away from kernel helpers.

Fixes #23778.

Change-Id: Icb9e1039acc92adbb2a371c34baaf0b79551c3ea
Reviewed-on: https://go-review.googlesource.com/93637
Reviewed-by: Austin Clements <austin@google.com>
2018-05-03 21:35:01 +00:00
Josh Bleecher Snyder
4d7cf3fedb runtime: convert g.waitreason from string to uint8
Every time I poke at #14921, the g.waitreason string
pointer writes show up.

They're not particularly important performance-wise,
but it'd be nice to clear the noise away.

And it does open up a few extra bytes in the g struct
for some future use.

This is a re-roll of CL 99078, which was rolled
back because of failures on s390x.
Those failures were apparently due to an old version of gdb.

Change-Id: Icc2c12f449b2934063fd61e272e06237625ed589
Reviewed-on: https://go-review.googlesource.com/111256
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
2018-05-03 17:04:22 +00:00
Fangming.Fang
e8d417d272 runtime: enable memory sanitizer on arm64
Changes include:
1. open compilation option -msan for arm64
2. modify doc to explain -msan is also supported on linux/arm64
3. wrap msan lib API in msan_arm64.s
4. use libc for sigaction syscalls when cgo is enabled
5. use libc for mmap syscalls when cgo is enabled

Change-Id: I26ebe61ff7ce1906125f54a0182a720f9d58ec11
Reviewed-on: https://go-review.googlesource.com/109255
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-05-02 17:52:14 +00:00
Wei Xiao
20102594a0 cmd/compile: intrinsify runtime.getcallerpc on all link register architectures
Add a compiler intrinsic for getcallerpc on following architectures:
  arm
  mips mipsle mips64 mips64le
  ppc64 ppc64le
  s390x

Change-Id: I758f3d4742fc214b206bcd07d90408622c17dbef
Reviewed-on: https://go-review.googlesource.com/110835
Run-TryBot: Wei Xiao <Wei.Xiao@arm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-05-02 16:59:27 +00:00
Josh Bleecher Snyder
31cfa7f2f2 runtime: allow inlining of stackmapdata
Also do very minor code cleanup.

name                old time/op  new time/op  delta
StackCopyPtr-8      84.8ms ± 6%  82.9ms ± 5%  -2.19%  (p=0.000 n=95+94)
StackCopy-8         68.4ms ± 5%  65.3ms ± 4%  -4.54%  (p=0.000 n=99+99)
StackCopyNoCache-8   107ms ± 2%   105ms ± 2%  -2.13%  (p=0.000 n=91+95)

Change-Id: I2d85ede48bffada9584d437a08a82212c0da6d00
Reviewed-on: https://go-review.googlesource.com/109001
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-01 19:14:17 +00:00
Josh Bleecher Snyder
2f2e8f9c81 runtime: use staticbytes in intstring for small v
Triggers 21 times during make.bash.

Change-Id: I7efb34200439256151304bb66cd309913f7c9c9e
Reviewed-on: https://go-review.googlesource.com/110557
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-01 18:03:57 +00:00
Martin Möhrmann
d46980995b internal/cpu: remove platform specific prefix from cpu hwcap variables
Go runtime currently only populates hwcap for ppc64 and arm64.
While the interpretation of hwcap is platform specific the hwcap
information is generally available on linux.

Changing the runtime variable name to cpu_hwcap for cpu.hwcap makes it
consistent with the general naming of runtime variables that are linked
to other packages.

Change-Id: I1e1f932a73ed624a219b9298faafbb6355e47ada
Reviewed-on: https://go-review.googlesource.com/94757
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-01 15:50:19 +00:00
Michael Munday
5d9c78201f cmd/compile: allow R11 to be allocated on s390x
R11 is only used as a temporary by a very small set of instructions
(DIV, MOD, MULH and extended MVC/XC instructions). By marking these
instructions as clobbering R11 we can allocate R11 in the general
case.

Change-Id: I0d4ffe80e57c164d42a5ea5ef6308756a5b0f742
Reviewed-on: https://go-review.googlesource.com/110255
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-01 15:48:17 +00:00
Josh Bleecher Snyder
d91e9705f8 runtime: avoid unnecessary scanblock calls
This is the scanstack analog of CL 104737,
which made a similar change for copystack.

name         old time/op  new time/op  delta
ScanStack-8  41.1ms ± 6%  38.9ms ± 5%  -5.52%  (p=0.000 n=50+48)

Change-Id: I7427151dea2895ed3934f8a0f61d96b568019217
Reviewed-on: https://go-review.googlesource.com/105536
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-01 13:33:17 +00:00
Josh Bleecher Snyder
dda4591c8c runtime: add BenchmarkScanStack
There are many possible stack scanning benchmarks,
but this one is at least a start.

cpuprofiling shows about 75% of CPU in func scanstack.

Change-Id: I906b0493966f2165c1920636c4e057d16d6447e0
Reviewed-on: https://go-review.googlesource.com/105535
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-01 13:33:01 +00:00
Keith Randall
16d1a8e6e3 runtime: move open/close/read/write from syscall to libc on Darwin
Update #17490

Change-Id: Ia0bb0ba10dc0bbb299290a60b8228275d55125d7
Reviewed-on: https://go-review.googlesource.com/110438
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-05-01 04:19:18 +00:00
Keith Randall
ce5c3871a4 runtime: move more syscalls to libc on Darwin
Moving mmap, munmap, madvise, usleep.

Also introduce __error function to get at libc's errno variable.

Change-Id: Ic47ac1d9eb71c64ba2668ce304644dd7e5bdfb5a
Reviewed-on: https://go-review.googlesource.com/110437
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-05-01 04:19:12 +00:00
Matthew Dempsky
6658219b7e runtime: eliminate scase.receivedp
Make selectgo return recvOK as a result parameter instead.

Change-Id: Iffd436371d360bf666b76d4d7503e7c3037a9f1d
Reviewed-on: https://go-review.googlesource.com/37935
Reviewed-by: Austin Clements <austin@google.com>
2018-05-01 03:17:54 +00:00
Matthew Dempsky
004260afde cmd/compile: open code select{send,recv,default}
Registration now looks like:

        var cases [4]runtime.scases
        var order [8]uint16
	cases[0].kind = caseSend
	cases[0].c = c1
	cases[0].elem = &v1
	if raceenabled || msanenabled {
		selectsetpc(&cases[0])
	}
	cases[1].kind = caseRecv
	cases[1].c = c2
	cases[1].elem = &v2
	if raceenabled || msanenabled {
		selectsetpc(&cases[1])
	}
	...

Change-Id: Ib9bcf426a4797fe4bfd8152ca9e6e08e39a70b48
Reviewed-on: https://go-review.googlesource.com/37934
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-01 03:17:44 +00:00
Matthew Dempsky
3aa53b3135 runtime: eliminate runtime.hselect
Now the registration phase looks like:

    var cases [4]runtime.scases
    var order [8]uint16
    selectsend(&cases[0], c1, &v1)
    selectrecv(&cases[1], c2, &v2, nil)
    selectrecv(&cases[2], c3, &v3, &ok)
    selectdefault(&cases[3])
    chosen := selectgo(&cases[0], &order[0], 4)

Primarily, this is just preparation for having the compiler open-code
selectsend, selectrecv, and selectdefault.

As a minor benefit, order can now be layed out separately on the stack
in the pointer-free segment, so it won't take up space in the
function's stack pointer maps.

Change-Id: I5552ba594201efd31fcb40084da20b42ea569a45
Reviewed-on: https://go-review.googlesource.com/37933
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-01 03:17:31 +00:00
Elias Naur
c2fdb42b16 runtime: implement darwin raise with pthread_self and pthread_kill
Convert raise from raw syscalls to using the system pthread library.
As a bonus, raise will now target the current thread instead of the
process.

Updates #17490

Change-Id: I2e44f2000bf870e99a5b4dc5ff5e0799fba91bde
Reviewed-on: https://go-review.googlesource.com/110475
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-01 00:37:36 +00:00
Keith Randall
21656d09b7 runtime: convert exit to use pthread library on Darwin
Now we no longer need to mess with TLS on Darwin 386/amd64, we always
rely on the pthread library to set it up. We now just use one entry
in the TLS for the G.
Return from mstart to let the pthread library clean up the OS thread.

Change-Id: Iccf58049d545515d9b1d090b161f420e40ffd244
Reviewed-on: https://go-review.googlesource.com/110215
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-05-01 00:20:55 +00:00
Elias Naur
b1d1ec9183 runtime: perform crashes outside systemstack
CL 93658 moved stack trace printing inside a systemstack call to
sidestep complexity in case the runtime is in a inconsistent state.

Unfortunately, debuggers generating backtraces for a Go panic
will be confused and come up with a technical correct but useless
stack. This CL moves just the crash performing - typically a SIGABRT
signal - outside the systemstack call to improve backtraces.

Unfortunately, the crash function now needs to be marked nosplit and
that triggers the no split stackoverflow check. To work around that,
split fatalpanic in two: fatalthrow for runtime.throw and fatalpanic for
runtime.gopanic. Only Go panics really needs crashes on the right stack
and there is enough stack for gopanic.

Example program:

package main

import "runtime/debug"

func main() {
	debug.SetTraceback("crash")
	crash()
}

func crash() {
	panic("panic!")
}

Before:
(lldb) bt
* thread #1, name = 'simple', stop reason = signal SIGABRT
  * frame #0: 0x000000000044ffe4 simple`runtime.raise at <autogenerated>:1
    frame #1: 0x0000000000438cfb simple`runtime.dieFromSignal(sig=<unavailable>) at signal_unix.go:424
    frame #2: 0x0000000000438ec9 simple`runtime.crash at signal_unix.go:525
    frame #3: 0x00000000004268f5 simple`runtime.dopanic_m(gp=<unavailable>, pc=<unavailable>, sp=<unavailable>) at panic.go:758
    frame #4: 0x000000000044bead simple`runtime.fatalpanic.func1 at panic.go:657
    frame #5: 0x000000000044d066 simple`runtime.systemstack at <autogenerated>:1
    frame #6: 0x000000000042a980 simple at proc.go:1094
    frame #7: 0x0000000000438ec9 simple`runtime.crash at signal_unix.go:525
    frame #8: 0x00000000004268f5 simple`runtime.dopanic_m(gp=<unavailable>, pc=<unavailable>, sp=<unavailable>) at panic.go:758
    frame #9: 0x000000000044bead simple`runtime.fatalpanic.func1 at panic.go:657
    frame #10: 0x000000000044d066 simple`runtime.systemstack at <autogenerated>:1
    frame #11: 0x000000000042a980 simple at proc.go:1094
    frame #12: 0x00000000004268f5 simple`runtime.dopanic_m(gp=<unavailable>, pc=<unavailable>, sp=<unavailable>) at panic.go:758
    frame #13: 0x000000000044bead simple`runtime.fatalpanic.func1 at panic.go:657
    frame #14: 0x000000000044d066 simple`runtime.systemstack at <autogenerated>:1
    frame #15: 0x000000000042a980 simple at proc.go:1094
    frame #16: 0x000000000044bead simple`runtime.fatalpanic.func1 at panic.go:657
    frame #17: 0x000000000044d066 simple`runtime.systemstack at <autogenerated>:1

After:
(lldb) bt
* thread #7, stop reason = signal SIGABRT
  * frame #0: 0x0000000000450024 simple`runtime.raise at <autogenerated>:1
    frame #1: 0x0000000000438d1b simple`runtime.dieFromSignal(sig=<unavailable>) at signal_unix.go:424
    frame #2: 0x0000000000438ee9 simple`runtime.crash at signal_unix.go:525
    frame #3: 0x00000000004264e3 simple`runtime.fatalpanic(msgs=<unavailable>) at panic.go:664
    frame #4: 0x0000000000425f1b simple`runtime.gopanic(e=<unavailable>) at panic.go:537
    frame #5: 0x0000000000470c62 simple`main.crash at simple.go:11
    frame #6: 0x0000000000470c00 simple`main.main at simple.go:6
    frame #7: 0x0000000000427be7 simple`runtime.main at proc.go:198
    frame #8: 0x000000000044ef91 simple`runtime.goexit at <autogenerated>:1

Updates #22716

Change-Id: Ib5fa35c13662c1dac2f1eac8b59c4a5824b98d92
Reviewed-on: https://go-review.googlesource.com/110065
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-04-30 21:26:00 +00:00
Richard Musiol
e3c684777a all: skip unsupported tests for js/wasm
The general policy for the current state of js/wasm is that it only
has to support tests that are also supported by nacl.

The test nilptr3.go makes assumptions about which nil checks can be
removed. Since WebAssembly does not signal on reading a null pointer,
all nil checks have to be explicit.

Updates #18892

Change-Id: I06a687860b8d22ae26b1c391499c0f5183e4c485
Reviewed-on: https://go-review.googlesource.com/110096
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-30 19:39:18 +00:00
Keith Randall
eef27a8fa2 runtime: fix newosproc darwin+arm/arm64
Missed conversion of newosproc for the parts of darwin that
weren't affected by my previous change.

Update #25181

Change-Id: I81a2935e192b6d0df358c59b7e785eb03c504c23
Reviewed-on: https://go-review.googlesource.com/110123
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-04-30 16:12:24 +00:00
Martin Möhrmann
48bfc8db51 runtime,cmd/compile: adjust and correct path names in comments of map code
Some of the comments relative paths do not exist and
reflect does not define its own hmap structure.

Correct paths and consistently reference paths starting from the
go src directory.

Change-Id: I5204a3a98f77d65f17dcde98b847378cea05ad8a
Reviewed-on: https://go-review.googlesource.com/94758
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-04-30 13:42:26 +00:00
Wei Xiao
bd8a88729c cmd/compile: intrinsify runtime.getcallerpc on arm64
Add a compiler intrinsic for getcallerpc on arm64 for better code generation.

Change-Id: I897e670a2b8ffa1a8c2fdc638f5b2c44bda26318
Reviewed-on: https://go-review.googlesource.com/109276
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-30 13:29:14 +00:00
Keith Randall
b7f1777a70 runtime,cmd/ld: on darwin, create theads using libc
Replace thread creation with calls to the pthread
library in libc.

Update #17490

Change-Id: I1e19965c45255deb849b059231252fc6a7861d6c
Reviewed-on: https://go-review.googlesource.com/108679
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-30 02:41:03 +00:00
ChrisALiles
22ff9521da cmd/compile: pass arguments to convt2E/I integer functions by value
The motivation is avoid generating a pointer to the data being
converted so it can be garbage collected.
The change also slightly reduces binary size by shrinking call sites.

Fixes #24286

Benchmark results:
name                   old time/op  new time/op  delta
ConvT2ESmall-4         2.86ns ± 0%  2.80ns ± 1%  -2.12%  (p=0.000 n=29+28)
ConvT2EUintptr-4       2.88ns ± 1%  2.88ns ± 0%  -0.20%  (p=0.002 n=28+30)
ConvT2ELarge-4         19.6ns ± 0%  20.4ns ± 1%  +4.22%  (p=0.000 n=19+30)
ConvT2ISmall-4         3.01ns ± 0%  2.85ns ± 0%  -5.32%  (p=0.000 n=24+28)
ConvT2IUintptr-4       3.00ns ± 1%  2.87ns ± 0%  -4.44%  (p=0.000 n=29+25)
ConvT2ILarge-4         20.4ns ± 1%  21.3ns ± 1%  +4.41%  (p=0.000 n=30+26)
ConvT2Ezero/zero/16-4  2.84ns ± 1%  2.99ns ± 0%  +5.38%  (p=0.000 n=30+25)
ConvT2Ezero/zero/32-4  2.83ns ± 2%  3.00ns ± 0%  +5.91%  (p=0.004 n=27+3)

Change-Id: I65016ec94c53f97c52113121cab582d0c342b7a8
Reviewed-on: https://go-review.googlesource.com/102636
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-29 15:53:39 +00:00
Josh Bleecher Snyder
5af0b28a73 runtime: iterate over set bits in adjustpointers
There are several things combined in this change.

First, eliminate the gobitvector type in favor
of adding a ptrbit method to bitvector.
In non-performance-critical code, use that method.
In performance critical code, though, load the bitvector data
one byte at a time and iterate only over set bits.
To support that, add and use sys.Ctz8.

name                old time/op  new time/op  delta
StackCopyPtr-8      81.8ms ± 5%  78.9ms ± 3%   -3.58%  (p=0.000 n=97+96)
StackCopy-8         65.9ms ± 3%  62.8ms ± 3%   -4.67%  (p=0.000 n=96+92)
StackCopyNoCache-8   105ms ± 3%   102ms ± 3%   -3.38%  (p=0.000 n=96+95)

Change-Id: I00b80f45612708bd440b1a411a57fa6dfa24aa74
Reviewed-on: https://go-review.googlesource.com/109716
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-04-29 05:24:44 +00:00
Josh Bleecher Snyder
13cd006139 runtime: add fast version of getArgInfo
getArgInfo is called a lot during stack copying.
In the common case it doesn't do much work,
but it cannot be inlined.

This change works around that.

name                old time/op  new time/op  delta
StackCopyPtr-8       108ms ± 5%    96ms ± 4%  -10.40%  (p=0.000 n=20+20)
StackCopy-8         82.6ms ± 3%  78.4ms ± 6%   -5.15%  (p=0.000 n=19+20)
StackCopyNoCache-8   130ms ± 3%   122ms ± 3%   -6.44%  (p=0.000 n=20+20)

Change-Id: If7d8a08c50a4e2e76e4331b399396c5dbe88c2ce
Reviewed-on: https://go-review.googlesource.com/108945
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-04-29 03:33:09 +00:00
Austin Clements
0fd427fda7 runtime: use entry stack map at function entry
Currently, when the runtime looks up the stack map for a frame, it
uses frame.continpc - 1 unless continpc is the function entry PC, in
which case it uses frame.continpc. As a result, if continpc is the
function entry point (which happens for deferred frames), it will
actually look up the stack map *following* the first instruction.

I think, though I am not positive, that this is always okay today
because the first instruction of a function can never change the stack
map. It's usually not a CALL, so it doesn't have PCDATA. Or, if it is
a CALL, it has to have the entry stack map.

But we're about to start emitting stack maps at every instruction that
changes them, which means the first instruction can have PCDATA
(notably, in leaf functions that don't have a prologue).

To prepare for this, tweak how the runtime looks up stack map indexes
so that if continpc is the function entry point, it directly uses the
entry stack map.

For #24543.

Change-Id: I85aa818041cd26aff416f7b1fba186e9c8ca6568
Reviewed-on: https://go-review.googlesource.com/109349
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-04-29 00:03:04 +00:00
Milan Knezevic
2959128dc5 cmd/compile: add softfloat support to mips64{,le}
mips64 softfloat support is based on mips implementation and introduces
new enviroment variable GOMIPS64.

GOMIPS64 is a GOARCH=mips64{,le} specific option, for a choice between
hard-float and soft-float. Valid values are 'hardfloat' (default) and
'softfloat'. It is passed to the assembler as
'GOMIPS64_{hardfloat,softfloat}'.

Change-Id: I7f73078627f7cb37c588a38fb5c997fe09c56134
Reviewed-on: https://go-review.googlesource.com/108475
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-27 14:50:17 +00:00
Cherry Zhang
148a26539b runtime: remove stale comment about getcallerpc/sp
Getcallerpc/sp no longer takes argument.

Change-Id: I80b30020e798990c59c8ffd0a4e078af6a75aea0
Reviewed-on: https://go-review.googlesource.com/109696
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-26 22:06:33 +00:00
Cherry Zhang
22f4280b9a runtime: remove the dummy arg of getcallersp
getcallersp is intrinsified, and so the dummy arg is no longer
needed. Remove it, as well as a few dummy args that are solely
to feed getcallersp.

Change-Id: Ibb6c948ff9c56537042b380ac3be3a91b247aaa6
Reviewed-on: https://go-review.googlesource.com/109596
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-26 18:57:20 +00:00
Yuval Pavel Zholkover
58c231f244 runtime: FreeBSD fast clock_gettime HPET timecounter support
This is a followup for CL 93156.

Fixes #22942.

Change-Id: Ic6e2de44011d041b91454353a6f2e3b0cf590060
Reviewed-on: https://go-review.googlesource.com/108095
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-26 03:54:19 +00:00
Hana Kim
3bf1644428 runtime/trace: add simple benchmarks for user annotation
Also, avoid Region creation when tracing is disabled.
Unfortunate side-effect of this change is that we no longer trace
pre-existing regions in tracing, but we can add the feature in
the future when we find it useful and justifiable. Until then,
let's avoid the overhead from this low-level api use as much as
possible.

goos: linux
goarch: amd64
pkg: runtime/trace

// Trace disabled
BenchmarkStartRegion-12 2000000000	         0.66 ns/op	       0 B/op	       0 allocs/op
BenchmarkNewTask-12    	30000000	        40.4 ns/op	      56 B/op	       2 allocs/op

// Trace enabled, -trace=/dev/null
BenchmarkStartRegion-12  5000000	       287 ns/op	      32 B/op	       1 allocs/op
BenchmarkNewTask-12    	 5000000	       283 ns/op	      56 B/op	       2 allocs/op

Also, skip other tests if tracing is already enabled.

Change-Id: Id3028d60b5642fcab4b09a74fd7d79361a3861e5
Reviewed-on: https://go-review.googlesource.com/109115
Reviewed-by: Peter Weinberger <pjw@google.com>
2018-04-24 17:43:19 +00:00
Hana Kim
c2d1024368 runtime/trace: rename "Span" with "Region"
"Span" is a commonly used term in many distributed tracing systems
(Dapper, OpenCensus, OpenTracing, ...). They use it to refer to a
period of time, not necessarily tied into execution of underlying
processor, thread, or goroutine, unlike the "Span" of runtime/trace
package.

Since distributed tracing and go runtime execution tracing are
already similar enough to cause confusion, this CL attempts to avoid
using the same word if possible.

"Region" is being used in a certain tracing system to refer to a code
region which is pretty close to what runtime/trace.Span currently
refers to. So, replace that.
https://software.intel.com/en-us/itc-user-and-reference-guide-defining-and-recording-functions-or-regions

This CL also tweaks APIs a bit based on jbd and heschi's comments:

  NewContext -> NewTask
    and it now returns a Task object that exports End method.

  StartSpan -> StartRegion
    and it now returns a Region object that exports End method.

Also, changed WithSpan to WithRegion and it now takes func() with no
context. Another thought is to get rid of WithRegion. It is a nice
concept but in practice, it seems problematic (a lot of code churn,
and polluting stack trace). Already, the tracing concept is very low
level, and we hope this API to be used with great care.

Recommended usage will be
   defer trace.StartRegion(ctx, "someRegion").End()

Left old APIs untouched in this CL. Once the usage of them are cleaned
up, they will be removed in a separate CL.

Change-Id: I73880635e437f3aad51314331a035dd1459b9f3a
Reviewed-on: https://go-review.googlesource.com/108296
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: JBD <jbd@google.com>
2018-04-24 16:33:15 +00:00
Hana (Hyang-Ah) Kim
cd037bce09 runtime/pprof: introduce "allocs" profile
The Go's heap profile contains four kinds of samples
(inuse_space, inuse_objects, alloc_space, and alloc_objects).
The pprof tool by default chooses the inuse_space (the bytes
of live, in-use objects). When analyzing the current memory
usage the choice of inuse_space as the default may be useful,
but in some cases, users are more interested in analyzing the
total allocation statistics throughout the program execution.
For example, when we analyze the memory profile from benchmark
or program test run, we are more likely interested in the whole
allocation history than the live heap snapshot at the end of
the test or benchmark.

The pprof tool provides flags to control which sample type
to be used for analysis. However, it is one of the less-known
features of pprof and we believe it's better to choose the
right type of samples as the default when producing the profile.

This CL introduces a new type of profile, "allocs", which is
the same as the "heap" profile but marks the alloc_space
as the default type unlike heap profiles that use inuse_space
as the default type.

'go test -memprofile=...' command is changed to use the new
"allocs" profile type instead of the traditional "heap" profile.

Fixes #24443

Change-Id: I012dd4b6dcacd45644d7345509936b8380b6fbd9
Reviewed-on: https://go-review.googlesource.com/102696
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2018-04-24 16:11:41 +00:00
Wèi Cōngruì
cc8809238b runtime: fix errno sign for epollctl on mips, mips64 and ppc64
The caller of epollctl expects it to return a negative errno value,
but it returns a positive errno value on mips, mips64 and ppc64.
The change fixes this.

Updates #23446

Change-Id: Ie6372eca6c23de21964caaaa433c9a45ef93531e
Reviewed-on: https://go-review.googlesource.com/89235
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-24 14:10:43 +00:00
Ian Lance Taylor
665b9b3476 runtime: change GNU/Linux usleep to use nanosleep
Ever since we added sleep to the runtime back in 2008, we've
implemented it on GNU/Linux with the select (or pselect or pselect6)
system call. But the Linux kernel has a nanosleep system call,
which should be a tiny bit more efficient since it doesn't have to
check to see whether there are any file descriptors. So use it.

Change-Id: Icc3430baca46b082a4d33f97c6c47e25fa91cb9a
Reviewed-on: https://go-review.googlesource.com/108538
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-24 05:01:13 +00:00
Adam Azarchs
dfb1b69665 os/signal: add func Ignored(sig Signal) bool
Ignored reports whether sig is currently ignored.

This implementation only works applies on Unix systems for now.  However, at
the moment that is also the case for Ignore() and several other signal
interaction methods, so that seems fair.

Fixes #22497

Change-Id: I7c1b1a5e12373ca5da44709500ff5acedc6f1316
Reviewed-on: https://go-review.googlesource.com/108376
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-21 04:18:03 +00:00
Josh Bleecher Snyder
37dd7cd040 runtime: use sys.PtrSize in growslice
Minor cleanup.

Change-Id: I4175de392969bb6408081a75cebdaeadcef1e68c
Reviewed-on: https://go-review.googlesource.com/108576
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-21 01:05:32 +00:00
Aditya Mukerjee
3d8940a9ee runtime: specify behavior of SetMutexProfileFraction for negative values
Change-Id: Ie4da1a515d5405140d742bdcd55f54a73a7f71fe
Reviewed-on: https://go-review.googlesource.com/108175
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-19 17:03:45 +00:00
Yuval Pavel Zholkover
744ccbb852 runtime: fast clock_gettime call on FreeBSD
Use AT_TIMEKEEP ELF aux entry to access a kernel mapped ring of timehands structs.
The timehands are updated by the kernel periodically, but for accurate measure the
timecounter still needs to be queried.
Currently the fast path is used only when kern.timecounter.hardware==TSC-low
or kern.timecounter.hardware=='ARM MPCore Timecounter',
other timecounters revert back to regular system call.

TODO: add support for HPET timecounter on 386/amd64.

Change-Id: I321ca4e92be63ba21a2574b758ef5c1e729086ad
Reviewed-on: https://go-review.googlesource.com/93156
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-18 21:54:26 +00:00
Heschi Kreinick
acebfba755 runtime: use saved state in SIGPROF handler for vDSO calls
VDSO calls do manual stack alignment, which doesn't get tracked in the
pcsp table. Without accurate pcsp information, backtracing them is
dangerous, and causes a crash in the SIGPROF handler. Fortunately,
https://golang.org/cl/97315 saves a clean state in m.vdsoPC/SP. Change
to use those if they're present, without attempting a normal backtrace.

Fixes #24925

Change-Id: I4b8501ae73a9d18209e22f839773c4fe6102a509
Reviewed-on: https://go-review.googlesource.com/107778
Run-TryBot: Heschi Kreinick <heschi@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-18 17:35:42 +00:00
Yuval Pavel Zholkover
0a4b962c17 runtime/internal/atomic: don't use Cas in atomic.Load on ARM
Instead issue a memory barrier on ARMv7 after reading the address.

Fixes #23777

Change-Id: I7aff2ab0246af64b437ebe0b31d4b30d351890d8
Reviewed-on: https://go-review.googlesource.com/94275
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-18 15:34:14 +00:00
Cherry Zhang
f83e421268 cmd/internal/obj/arm, runtime: delete old ARM softfloat code
CL 106735 changed to the new softfloat support on GOARM=5.

ARM assembly code that uses FP instructions not guarded on GOARM,
if any, will break. The easiest way to fix is probably to use Go
implementation on GOARM=5, like

	MOVB	runtime·goarm(SB), R11
	CMP	$5, R11
	BEQ	arm5
	... FP instructions ...
	RET
arm5:
	CALL or JMP to Go implementation

Change-Id: I52fc76fac9c854ebe7c6c856c365fba35d3f560a
Reviewed-on: https://go-review.googlesource.com/107475
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-17 18:27:55 +00:00
Tobias Klauser
d0712096b3 runtime: use internal/cpu.X86.HasAVX2 instead of support_avx2
After CL 104636 cpu.X86.HasAVX is set early enough that it can be used
in runtime·memclrNoHeapPointers. Add an offset to use in assembly and
replace the only occurence of support_avx2.

Change-Id: Icada62efeb3e24d71251d55623a8a8602364c9a8
Reviewed-on: https://go-review.googlesource.com/106595
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-04-15 20:33:59 +00:00
Ilya Tocar
e53cc7ae93 runtime: avoid division in growslice
Add a special case for power-of-2 sized elements.
We can replace div/mul with left/right shift and avoid expensive operation.
growslice is hotter for short slices of small elements, such as int16, so
add an int16 version for GrowSlice benchmark.

name                   old time/op  new time/op  delta
GrowSlice/Byte-6       61.3ns ± 3%  60.5ns ± 4%  -1.33%  (p=0.002 n=30+30)
GrowSlice/Int16-6      94.0ns ± 4%  84.7ns ± 2%  -9.82%  (p=0.000 n=30+30)
GrowSlice/Int-6         100ns ± 1%    99ns ± 1%  -0.25%  (p=0.032 n=29+28)
GrowSlice/Ptr-6         197ns ± 2%   195ns ± 2%  -0.94%  (p=0.001 n=30+29)
GrowSlice/Struct/24-6   168ns ± 1%   166ns ± 2%  -1.09%  (p=0.000 n=25+30)
GrowSlice/Struct/32-6   187ns ± 2%   180ns ± 1%  -3.59%  (p=0.000 n=30+30)
GrowSlice/Struct/40-6   241ns ± 2%   238ns ± 2%  -1.41%  (p=0.000 n=30+30)

Change-Id: I31e8388d73fd9356e2dcc091d8d92eef3e3ccdbc
Reviewed-on: https://go-review.googlesource.com/102279
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-04-13 21:33:52 +00:00
Eric Daniels
d9b006a705 runtime/traceback: support tracking goroutine ancestor tracebacks with GODEBUG="tracebackancestors=N"
Currently, collecting a stack trace via runtime.Stack captures the stack for the
immediately running goroutines. This change extends those tracebacks to include
the tracebacks of their ancestors. This is done with a low memory cost and only
utilized when debug option tracebackancestors is set to a value greater than 0.

Resolves #22289

Change-Id: I7edacc62b2ee3bd278600c4a21052c351f313f3a
Reviewed-on: https://go-review.googlesource.com/70993
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-04-13 20:42:38 +00:00
Ian Lance Taylor
eb4f33243e runtime: document that LockOSThread in init locks to thread
This is more or less implied by the spec language on initialization,
but restate it for clarity.

Fixes #23112

Change-Id: Ibe5385acafe4eac38823de98a025cd37f7a77d3b
Reviewed-on: https://go-review.googlesource.com/103399
Reviewed-by: Austin Clements <austin@google.com>
2018-04-13 03:15:02 +00:00
Bryan C. Mills
8e43c37229 runtime: make Playground timestamps change when the stream fd changes
The process reading the output of the binary may read stderr and
stdout separately, and may interleave reads from the two streams
arbitrarily. Because we explicitly serialize writes on the writer
side, we can reuse a timestamp within a single stream without losing
information; however, if we use the same timestamp for write on both
streams, the reader can't tell how to interleave them.

This change ensures that every time we change between the two fds, we
also bump the timestamp. That way, writes within a stream continue to
show the same timestamp, but a sorted merge of the contents of the two
streams always interleaves them in the correct order.

This still requires a corresponding change to the Playground parser to
actually reconstruct the correct interleaving. It currently merges the
two streams without reordering them; it should instead buffer them
separately and perform a sorted merge. (See
https://golang.org/cl/105496.)

Updates golang/go#24615.
Updates golang/go#24659.

Change-Id: Id789dfcc02eb4247906c9ddad38dac50cf829979
Reviewed-on: https://go-review.googlesource.com/105235
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Yury Smolsky <yury@smolsky.by>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-11 18:12:02 +00:00
Tobias Klauser
9446eaa944 go/build, runtime/internal/sys: reserve RISC-V arch names
In #17528 it was discussed (off-topic to the actual issue) to reserve
GOARCH names for the RISC-V architecture. With the first RISC-V
Linux-capable development boards released (e.g. HiFive Unleashed),
Linux distributions being ported to RISC-V (e.g. Debian, Fedora) and
RISC-V support being added to gccgo (CL 96377), it becomes more likely
that Go software (and maybe Go itself) will be ported as well.

Add riscv and riscv64 (which is already used by gccgo), so Go 1.11 will
already recognize "*_riscv{,64}.go" as reserved files.

Change-Id: I042aab19c68751d82ea513e40f7b1d7e1ad924ea
Reviewed-on: https://go-review.googlesource.com/106256
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-11 15:51:18 +00:00
Tobias Klauser
e0ac5f540b runtime: use internal/cpu instead of support_avx
After CL 104636 cpu.X86.HasAVX is set early enough that it can be used
to determine useAVXmemmove. Use it and remove support_avx.

Change-Id: Ib7a627bede2bf96c92362507e742bd833cb42a74
Reviewed-on: https://go-review.googlesource.com/106235
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-11 13:50:04 +00:00
Keith Randall
b3a854c733 runtime: use fixed TLS offsets on darwin/amd64 and darwin/386
Fixes #23617

Note that this CL does not affect darwin/arm and darwin/arm64,
still TBD what, if anything, needs to be done for those.

This is a fix of CL 105975 which was reverted in CL 106155.
Needed to use movl instead of movq for 386.

Change-Id: I0db7f8087173869e60cc22c6c3124fa0a0739b46
Reviewed-on: https://go-review.googlesource.com/106156
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-11 00:04:30 +00:00
Keith Randall
8bb8eaff62 Revert "runtime: use fixed TLS offsets on darwin/amd64 and darwin/386"
This reverts commit 76e92d1c9e.

Reason for revert: Seems to have broken the darwin/386 builder, the toolchain is barfing on the new inline assembly.

Change-Id: Ic83fa3c85148946529c5fd47d1e1669898031ace
Reviewed-on: https://go-review.googlesource.com/106155
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-10 18:40:19 +00:00
Meng Zhuo
6b5236ae53 runtime: use internal/cpu in alginit
After CL 104636 the feature flags in internal/cpu are initialized before
alginit and can now be used for aeshash feature detection. Also remove
now unused runtime variables:
x86:
	support_ssse3
	support_sse42
	support_aes
arm64:
	support_aes

Change-Id: I2f64198d91750eaf3c6cf2aac6e9e17615811ec8
Reviewed-on: https://go-review.googlesource.com/106015
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-10 16:33:26 +00:00
Elias Naur
144fae8ed5 misc/ios,runtime/cgo: remove SIGINT handshake for the iOS exec wrapper
Once upon a time, the iOS exec wrapper needed to change the current
working directory for the binary being tested. To allow that, the
runtime raised a SIGINT signal that the wrapper caught, changed the
working directory and resumed the process.

These days, the current working directory is passed from the wrapper
to the runtime through a special entry in the app metadata and the
SIGINT handshake is not necessary anymore.

Remove the signaling from the runtime and the exec harness.

Change-Id: Ia53bcc9e4724d2ca00207e22b91ce80a05271b55
Reviewed-on: https://go-review.googlesource.com/106096
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2018-04-10 16:30:07 +00:00
Meng Zhuo
954f651ccc internal/cpu,runtime: call cpu.initialize before alginit
runtime.alginit needs runtime/support_{aes,ssse3,sse41} feature flag
to init aeshash function but internal/cpu.init not be called yet.
This CL will call internal/cpu.initialize before runtime.alginit, so
that we can move all cpu features related code to internal/cpu.

Change-Id: I00b8e403ace3553f8c707563d95f27dade0bc853
Reviewed-on: https://go-review.googlesource.com/104636
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-10 16:13:52 +00:00
Tobias Klauser
ace5fa1a60 runtime: remove support_bmi{1,2}
The code reading these variables was removed in CL 41476. They are only
set but never read now, so remove them.

Change-Id: I6b0b8d813e9a3ec2a13586ff92746e00ad1b5bf0
Reviewed-on: https://go-review.googlesource.com/106095
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-10 14:30:10 +00:00
Keith Randall
76e92d1c9e runtime: use fixed TLS offsets on darwin/amd64 and darwin/386
Fixes #23617

Note that this CL does not affect darwin/arm and darwin/arm64,
still TBD what, if anything, needs to be done for those.

Change-Id: Ie1ee02a9f4d4d1fb9cd5dc432d900f926cc157db
Reviewed-on: https://go-review.googlesource.com/105975
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-10 14:28:39 +00:00
Egon Elbre
cd682f3832 runtime: improve Windows gdb tests
This ensures that gdb tests run on Windows by ignoring any line ending.

Works with gdb 7.7, however with gdb 7.9 and 7.12 gets an error
error:

    internal-error: buildsym_init: Assertion `free_pendings == NULL'
    failed.

Updates #21380

Change-Id: I6a6e5b2a1b5efdca4dfce009fcb9c134c87497d6
Reviewed-on: https://go-review.googlesource.com/102419
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
2018-04-07 08:29:15 +00:00
Austin Clements
fcb7488add runtime: factor waiting on mark phase
There are three places where we wait for the GC mark phase to
complete. Factor these all into a single helper function.

Fixes #24362.

Change-Id: I47f6a7147974f5b9a2869c527a024519070ba6f3
Reviewed-on: https://go-review.googlesource.com/102605
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-04-06 19:10:48 +00:00
Austin Clements
e13a213c7f runtime: distinguish semaphore wait from sync.Cond.Wait
Updates #24362.

Change-Id: Ided1ab31792f05d9d7a86f17c1bcbd9e9b80052c
Reviewed-on: https://go-review.googlesource.com/102606
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-06 18:09:38 +00:00
Josh Bleecher Snyder
2e7e57770c runtime: avoid calling adjustpointers unnecessarily
adjustpointers loops over a bitmap.
If the length of that bitmap is zero,
we can skip making the call entirely.
This speeds up stack copying when there are
no pointers present in either args or locals.

name                old time/op  new time/op  delta
StackCopyPtr-8       101ms ± 4%    90ms ± 4%  -10.95%  (p=0.000 n=87+93)
StackCopy-8         80.1ms ± 4%  72.6ms ± 4%   -9.41%  (p=0.000 n=98+100)
StackCopyNoCache-8   121ms ± 3%   113ms ± 3%   -6.57%  (p=0.000 n=98+97)

Change-Id: I7a272e19bc9a14fa3e3318771ebd082dc6247d25
Reviewed-on: https://go-review.googlesource.com/104737
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-04-05 21:43:23 +00:00
Joel Sing
a7bb8d3eb8 runtime: fix/improve exitThread on openbsd
OpenBSD's __threxit syscall takes a pointer to a 32-bit value that will be
zeroed immediately before the thread exits. Make use of this instead of
zeroing freeWait from the exitThread assembly and using hacks like switching
to a static stack, so this works on 386.

Change-Id: I3ec5ead82b6496404834d148f713794d5d9da723
Reviewed-on: https://go-review.googlesource.com/105055
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-05 19:29:35 +00:00
Robert Griesemer
542ea5ad91 go/printer, gofmt: tuned table alignment for better results
The go/printer (and thus gofmt) uses a heuristic to determine
whether to break alignment between elements of an expression
list which is spread across multiple lines. The heuristic only
kicked in if the entry sizes (character length) was above a
certain threshold (20) and the ratio between the previous and
current entry size was above a certain value (4).

This heuristic worked reasonably most of the time, but also
led to unfortunate breaks in many cases where a single entry
was suddenly much smaller (or larger) then the previous one.

The behavior of gofmt was sufficiently mysterious in some of
these situations that many issues were filed against it.

The simplest solution to address this problem is to remove
the heuristic altogether and have a programmer introduce
empty lines to force different alignments if it improves
readability. The problem with that approach is that the
places where it really matters, very long tables with many
(hundreds, or more) entries, may be machine-generated and
not "post-processed" by a human (e.g., unicode/utf8/tables.go).

If a single one of those entries is overlong, the result
would be that the alignment would force all comments or
values in key:value pairs to be adjusted to that overlong
value, making the table hard to read (e.g., that entry may
not even be visible on screen and all other entries seem
spaced out too wide).

Instead, we opted for a slightly improved heuristic that
behaves much better for "normal", human-written code.

1) The threshold is increased from 20 to 40. This disables
the heuristic for many common cases yet even if the alignment
is not "ideal", 40 is not that many characters per line with
todays screens, making it very likely that the entire line
remains "visible" in an editor.

2) Changed the heuristic to not simply look at the size ratio
between current and previous line, but instead considering the
geometric mean of the sizes of the previous (aligned) lines.
This emphasizes the "overall picture" of the previous lines,
rather than a single one (which might be an outlier).

3) Changed the ratio from 4 to 2.5. Now that we ignore sizes
below 40, a ratio of 4 would mean that a new entry would have
to be 4 times bigger (160) or smaller (10) before alignment
would be broken. A ratio of 2.5 seems more sensible.

Applied updated gofmt to all of src and misc. Also tested
against several former issues that complained about this
and verified that the output for the given examples is
satisfactory (added respective test cases).

Some of the files changed because they were not gofmt-ed
in the first place.

For #644.
For #7335.
For #10392.
(and probably more related issues)

Fixes #22852.

Change-Id: I5e48b3d3b157a5cf2d649833b7297b33f43a6f6e
2018-04-04 13:39:34 -07:00
Austin Clements
4946d9e87b runtime: stop when we run out of hints in race mode
Currently, the runtime falls back to asking for any address the OS can
offer for the heap when it runs out of hint addresses. However, the
race detector assumes the heap lives in [0x00c000000000,
0x00e000000000), and will fail in a non-obvious way if we go outside
this region.

Fix this by actively throwing a useful error if we run out of heap
hints in race mode.

This problem is currently being triggered by TestArenaCollision, which
intentionally triggers this fallback behavior. Fix the test to look
for the new panic message in race mode.

Fixes #24670.
Updates #24133.

Change-Id: I57de6d17a3495dc1f1f84afc382cd18a6efc2bf7
Reviewed-on: https://go-review.googlesource.com/104717
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-04 18:08:04 +00:00
Meng Zhuo
73e0c302b2 runtime: implement aeshash for arm64 platform
Fix #10109

name                  old time/op    new time/op    delta
Hash5                   72.3ns ± 0%    51.5ns ± 0%   -28.71%  (p=0.000 n=4+5)
Hash16                  78.8ns ± 0%    48.7ns ± 0%      ~     (p=0.079 n=4+5)
Hash64                   196ns ±25%      73ns ±16%   -62.68%  (p=0.008 n=5+5)
Hash1024                1.57µs ± 0%    0.27µs ± 1%   -82.90%  (p=0.000 n=5+4)
Hash65536               96.5µs ± 0%    14.3µs ± 0%   -85.14%  (p=0.016 n=5+4)
HashStringSpeed          156ns ± 6%     129ns ± 3%   -17.56%  (p=0.008 n=5+5)
HashBytesSpeed           227ns ± 1%     200ns ± 1%   -11.98%  (p=0.008 n=5+5)
HashInt32Speed           116ns ± 2%     102ns ± 0%   -11.92%  (p=0.016 n=5+4)
HashInt64Speed           120ns ± 3%     101ns ± 2%   -15.55%  (p=0.008 n=5+5)
HashStringArraySpeed     342ns ± 0%     306ns ± 2%   -10.58%  (p=0.008 n=5+5)
FastrandHashiter         217ns ± 1%     217ns ± 1%      ~     (p=1.000 n=5+5)

name                  old speed      new speed      delta
Hash5                 69.1MB/s ± 0%  97.0MB/s ± 0%   +40.32%  (p=0.008 n=5+5)
Hash16                 203MB/s ± 0%   329MB/s ± 0%   +61.76%  (p=0.016 n=4+5)
Hash64                 332MB/s ±21%   881MB/s ±14%  +165.66%  (p=0.008 n=5+5)
Hash1024               651MB/s ± 0%  3652MB/s ±17%  +460.73%  (p=0.008 n=5+5)
Hash65536              679MB/s ± 0%  4570MB/s ± 0%  +572.85%  (p=0.016 n=5+4)

Change-Id: I573028979f84cf2e0e087951271d5de8865dbf04
Reviewed-on: https://go-review.googlesource.com/89755
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-04 16:25:39 +00:00
Josh Bleecher Snyder
00fab20582 runtime: improve StackCopy benchmarks
Make the StackCopyNoCache test easier to read.

Add a StackCopyPtr test that actually has some pointers
that need adjusting.

Change-Id: I5b07c26f40cb485c9de97ed63fac89a9e6f36650
Reviewed-on: https://go-review.googlesource.com/104195
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-04 04:45:12 +00:00
Tobias Klauser
01237b1362 runtime: parse auxv for page size and executable name on Solaris
Decode AT_PAGESZ to determine physPageSize and AT_SUN_EXECNAME for
os.Executable.

Change-Id: I6ff774ad9d76c68fc61eb307df58217c17fd578d
Reviewed-on: https://go-review.googlesource.com/104375
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-03 15:49:45 +00:00
Richard Musiol
80e69220c8 go/build, go/types, cmd/dist: add js/wasm architecture
This is the first commit of a series that will add WebAssembly
as an architecture target. The design document can be found at
https://docs.google.com/document/d/131vjr4DH6JFnb-blm_uRdaC0_Nv3OUwjEY5qVCxCup4.

The GOARCH name "wasm" is the official abbreviation of WebAssembly.
The GOOS name "js" got chosen because initially the host environment
that executes WebAssembly bytecode will be web browsers and Node.js,
which both use JavaScript to embed WebAssembly. Other GOOS values
may be possible later, see:
https://github.com/WebAssembly/design/blob/master/NonWeb.md

Updates #18892

Change-Id: Ia25b4fa26bba8029c25923f48ad009fd3681933a
Reviewed-on: https://go-review.googlesource.com/102835
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-30 21:34:18 +00:00
Tobias Klauser
9364c13d09 runtime: parse auxv for page size on dragonfly
Decode AT_PAGESZ to determine physPageSize on dragonfly.

Change-Id: I7236d7cbe43433f16dffddad19c1655bc0c7f31d
Reviewed-on: https://go-review.googlesource.com/103257
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-29 14:19:13 +00:00
Yuval Pavel Zholkover
0a5be12f5c cmd/internal/obj/arm: add DMB instruction
Change-Id: Ib67a61d5b37af210ff15d60d72bd5238b9c2d0ca
Reviewed-on: https://go-review.googlesource.com/94815
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-27 19:54:44 +00:00
Tobias Klauser
4ff4e50725 runtime: parse auxv for page size on netbsd
Decode AT_PAGESZ to determine physPageSize on netbsd.

Also rename vdso_none.go to auxv_none.go which matches its purpose more
closely.

Akin to CL 99780 which did the same for freebsd.

Change-Id: Iea4322f861ff0f3515e9051585dbb442f024326b
Reviewed-on: https://go-review.googlesource.com/102677
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-27 15:21:51 +00:00
Meng Zhuo
ea59ebd338 runtime: use vDSO for clock_gettime on linux/arm64
Use the __vdso_clock_gettime fast path via the vDSO on linux/arm64 to
speed up nanotime and walltime. This results in the following
performance improvement for time.Now on Cavium ThunderX:

name     old time/op  new time/op  delta
TimeNow   442ns ± 0%   163ns ± 0%  -63.16%  (p=0.000 n=10+10)

And benchmarks on VDSO

BenchmarkClockVDSOAndFallbackPaths/vDSO         10000000 166 ns/op
BenchmarkClockVDSOAndFallbackPaths/Fallback     3000000 456 ns/op

Change-Id: I326118c6dff865eaa0569fc45d1fc1ff95cb74f6
Reviewed-on: https://go-review.googlesource.com/99855
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-27 13:21:27 +00:00
Tim Wright
131901e80d cmd/go, cmd/link, runtime: enable PIE build mode, cgo race tests on FreeBSD
Fixes #24546

Change-Id: I99ebd5bc18e5c5e42eee4689644a7a8b02405f31
Reviewed-on: https://go-review.googlesource.com/102616
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-27 02:50:29 +00:00
Zhou Peng
3412baaa02 runtime: fix comment typo
This was a typo mistake according to if cond and runtime/mheap.go:323

Change-Id: Id046d4afbfe0ea43cb29e1a9f400e1f130de221d
Reviewed-on: https://go-review.googlesource.com/102575
Reviewed-by: Austin Clements <austin@google.com>
2018-03-26 17:40:46 +00:00
Daniel Martí
8da180f6ca all: remove some unused return parameters
As found by unparam. Picked the low-hanging fruit, consisting only of
errors that were always nil and results that were never used. Left out
those that were useful for consistency with other func signatures.

Change-Id: I06b52bbd3541f8a5d66659c909bd93cb3e172018
Reviewed-on: https://go-review.googlesource.com/102418
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-24 19:44:47 +00:00
Tobias Klauser
786899a72a runtime: adjust GOARM floating point compatibility error message
As pointed out by Josh Bleecher Snyder in CL 99780.

The check is for GOARM > 6, so suggest to recompile with either GOARM=5
or GOARM=6.

Change-Id: I6a97e87bdc17aa3932f5c8cb598bba85c3cf4be9
Reviewed-on: https://go-review.googlesource.com/101936
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-03-24 04:47:27 +00:00
Yuval Pavel Zholkover
6f47fa2d6c runtime: fix AT_HWCAP auxv parsing on freebsd
AT_HWCAP is not available on FreeBSD-11.1-RELEASE or earlier and the wrong const was used.
Use the correct value, and initialize hwcap with ^uint32(0) inorder not to fail the VFP tests.

Fixes #24507.

Change-Id: I5c3eed57bb53bf992b7de0eec88ea959806306b9
Reviewed-on: https://go-review.googlesource.com/102355
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-23 19:51:25 +00:00
Carlos Eduardo Seo
6633bb2aa7 cmd/compile/internal/ppc64, runtime internal/atomic, sync/atomic: implement faster atomics for ppc64x
This change implements faster atomics for ppc64x based on the ISA 2.07B,
Appendix B.2 recommendations, replacing SYNC/ISYNC by LWSYNC in some
cases.

Updates #21348

name                                           old time/op new time/op    delta
Cond1-16                                           955ns     856ns      -10.33%
Cond2-16                                          2.38µs    2.03µs      -14.59%
Cond4-16                                          5.90µs    5.44µs       -7.88%
Cond8-16                                          12.1µs    11.1µs       -8.42%
Cond16-16                                         27.0µs    25.1µs       -7.04%
Cond32-16                                         59.1µs    55.5µs       -6.14%
LoadMostlyHits/*sync_test.DeepCopyMap-16          22.1ns    24.1ns       +9.02%
LoadMostlyHits/*sync_test.RWMutexMap-16            252ns     249ns       -1.20%
LoadMostlyHits/*sync.Map-16                       16.2ns    16.3ns         ~
LoadMostlyMisses/*sync_test.DeepCopyMap-16        22.3ns    22.6ns         ~
LoadMostlyMisses/*sync_test.RWMutexMap-16          249ns     247ns       -0.51%
LoadMostlyMisses/*sync.Map-16                     12.7ns    12.7ns         ~
LoadOrStoreBalanced/*sync_test.RWMutexMap-16      1.27µs    1.17µs       -7.54%
LoadOrStoreBalanced/*sync.Map-16                  1.12µs    1.10µs       -2.35%
LoadOrStoreUnique/*sync_test.RWMutexMap-16        1.75µs    1.68µs       -3.84%
LoadOrStoreUnique/*sync.Map-16                    2.07µs    1.97µs       -5.13%
LoadOrStoreCollision/*sync_test.DeepCopyMap-16    15.8ns    15.9ns         ~
LoadOrStoreCollision/*sync_test.RWMutexMap-16      496ns     424ns      -14.48%
LoadOrStoreCollision/*sync.Map-16                 6.07ns    6.07ns         ~
Range/*sync_test.DeepCopyMap-16                   1.65µs    1.64µs         ~
Range/*sync_test.RWMutexMap-16                     278µs     288µs       +3.75%
Range/*sync.Map-16                                2.00µs    2.01µs         ~
AdversarialAlloc/*sync_test.DeepCopyMap-16        3.45µs    3.44µs         ~
AdversarialAlloc/*sync_test.RWMutexMap-16          226ns     227ns         ~
AdversarialAlloc/*sync.Map-16                     1.09µs    1.07µs       -2.36%
AdversarialDelete/*sync_test.DeepCopyMap-16        553ns     550ns       -0.57%
AdversarialDelete/*sync_test.RWMutexMap-16         273ns     274ns         ~
AdversarialDelete/*sync.Map-16                     247ns     249ns         ~
UncontendedSemaphore-16                           79.0ns    65.5ns      -17.11%
ContendedSemaphore-16                              112ns      97ns      -13.77%
MutexUncontended-16                               3.34ns    2.51ns      -24.69%
Mutex-16                                           266ns     191ns      -28.26%
MutexSlack-16                                      226ns     159ns      -29.55%
MutexWork-16                                       377ns     338ns      -10.14%
MutexWorkSlack-16                                  335ns     308ns       -8.20%
MutexNoSpin-16                                     196ns     184ns       -5.91%
MutexSpin-16                                       710ns     666ns       -6.21%
Once-16                                           1.29ns    1.29ns         ~
Pool-16                                           8.64ns    8.71ns         ~
PoolOverflow-16                                   1.60µs    1.44µs      -10.25%
SemaUncontended-16                                5.39ns    4.42ns      -17.96%
SemaSyntNonblock-16                                539ns     483ns      -10.42%
SemaSyntBlock-16                                   413ns     354ns      -14.20%
SemaWorkNonblock-16                                305ns     258ns      -15.36%
SemaWorkBlock-16                                   266ns     229ns      -14.06%
RWMutexUncontended-16                             12.9ns     9.7ns      -24.80%
RWMutexWrite100-16                                 203ns     147ns      -27.47%
RWMutexWrite10-16                                  177ns     119ns      -32.74%
RWMutexWorkWrite100-16                             435ns     403ns       -7.39%
RWMutexWorkWrite10-16                              642ns     611ns       -4.79%
WaitGroupUncontended-16                           4.67ns    3.70ns      -20.92%
WaitGroupAddDone-16                                402ns     355ns      -11.54%
WaitGroupAddDoneWork-16                            208ns     250ns      +20.09%
WaitGroupWait-16                                  1.21ns    1.21ns         ~
WaitGroupWaitWork-16                              5.91ns    5.87ns       -0.81%
WaitGroupActuallyWait-16                          92.2ns    85.8ns       -6.91%

Updates #21348

Change-Id: Ibb9b271d11b308264103829e176c6d9fe8f867d3
Reviewed-on: https://go-review.googlesource.com/95175
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2018-03-22 14:13:01 +00:00
Tim Wright
88129f0cb2 all: enable c-shared/c-archive support for freebsd/amd64
Fixes #14327
Much of the code is based on the linux/amd64 code that implements these
build modes, and code is shared where possible.

Change-Id: Ia510f2023768c0edbc863aebc585929ec593b332
Reviewed-on: https://go-review.googlesource.com/93875
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-21 21:56:20 +00:00
isharipo
ff5cf43df5 runtime,sync/atomic: replace asm BYTEs with insts for x86
For each replacement, test case is added to new 386enc.s file
with exception of EMMS, SYSENTER, MFENCE and LFENCE as they
are already covered in amd64enc.s (same on amd64 and 386).

The replacement became less obvious after go vet suggested changes
Before:
	BYTE $0x0f; BYTE $0x7f; BYTE $0x44; BYTE $0x24; BYTE $0x08
Changed to MOVQ (this form is being tested):
	MOVQ M0, 8(SP)
Refactored to FP-relative access (go vet advice):
	MOVQ M0, val+4(FP)

Change-Id: I56b87cf3371b6ad81ad0cd9db2033aee407b5818
Reviewed-on: https://go-review.googlesource.com/101475
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-03-21 20:51:04 +00:00
Tobias Klauser
2e84dc2596 runtime: parse auxv on freebsd
Decode AT_PAGESZ to determine physPageSize on freebsd/{386,amd64,arm}
and AT_HWCAP for hwcap and hardDiv on freebsd/arm. Also use hwcap to
perform the FP checks in checkgoarm akin to the linux/arm
implementation.

Change-Id: I532810a1581efe66277e4305cb234acdc79ee91e
Reviewed-on: https://go-review.googlesource.com/99780
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-21 15:40:01 +00:00
Vladimir Kuzmin
c12b185a6e cmd/compile: avoid mapaccess at m[k]=append(m[k]..
Currently rvalue m[k] is transformed during walk into:

        tmp1 := *mapaccess(m, k)
        tmp2 := append(tmp1, ...)
        *mapassign(m, k) = tmp2

However, this is suboptimal, as we could instead produce just:
        tmp := mapassign(m, k)
        *tmp := append(*tmp, ...)

Optimization is possible only if during Order it may tell that m[k] is
exactly the same at left and right part of assignment. It doesn't work:
1) m[f(k)] = append(m[f(k)], ...)
2) sink, m[k] = sink, append(m[k]...)
3) m[k] = append(..., m[k],...)

Benchmark:
name                           old time/op    new time/op    delta
MapAppendAssign/Int32/256-8      33.5ns ± 3%    22.4ns ±10%  -33.24%  (p=0.000 n=16+18)
MapAppendAssign/Int32/65536-8    68.2ns ± 6%    48.5ns ±29%  -28.90%  (p=0.000 n=20+20)
MapAppendAssign/Int64/256-8      34.3ns ± 4%    23.3ns ± 5%  -32.23%  (p=0.000 n=17+18)
MapAppendAssign/Int64/65536-8    65.9ns ± 7%    61.2ns ±19%   -7.06%  (p=0.002 n=18+20)
MapAppendAssign/Str/256-8         116ns ±12%      79ns ±16%  -31.70%  (p=0.000 n=20+19)
MapAppendAssign/Str/65536-8       134ns ±15%     111ns ±45%  -16.95%  (p=0.000 n=19+20)

name                           old alloc/op   new alloc/op   delta
MapAppendAssign/Int32/256-8       47.0B ± 0%     46.0B ± 0%   -2.13%  (p=0.000 n=19+18)
MapAppendAssign/Int32/65536-8     27.0B ± 0%     20.7B ±30%  -23.33%  (p=0.000 n=20+20)
MapAppendAssign/Int64/256-8       47.0B ± 0%     46.0B ± 0%   -2.13%  (p=0.000 n=20+17)
MapAppendAssign/Int64/65536-8     27.0B ± 0%     27.0B ± 0%     ~     (all equal)
MapAppendAssign/Str/256-8         94.0B ± 0%     78.0B ± 0%  -17.02%  (p=0.000 n=20+16)
MapAppendAssign/Str/65536-8       54.0B ± 0%     54.0B ± 0%     ~     (all equal)

Fixes #24364
Updates #5147

Change-Id: Id257d052b75b9a445b4885dc571bf06ce6f6b409
Reviewed-on: https://go-review.googlesource.com/100838
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-20 01:47:07 +00:00
Matthew Dempsky
86a338960d reflect: sort exported methods first
By moving exported methods to the front of method lists, filtering
down to only the exported methods just needs a count of how many
exported methods exist, which the compiler can statically
provide. This allows getting rid of the exported method cache.

For #22075.

Change-Id: I8eeb274563a2940e1347c34d673f843ae2569064
Reviewed-on: https://go-review.googlesource.com/100846
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-15 21:56:08 +00:00
David Chase
1c24ffbf93 cmd/compile: turn on DWARF locations lists for ssa vars
This changes the default setting for -dwarflocationlists
from false to true, removes the flag from ssa/debug_test.go,
and updates runtime/runtime-gdb_test.go to match a change
in debugging output for composite variables.

Current benchmarks (perflock, -count 10)

benchstat -geomean before.log after.log
name        old time/op     new time/op     delta
Template        175ms ± 0%      182ms ± 1%   +3.68%  (p=0.000 n=8+9)
Unicode        82.0ms ± 2%     82.8ms ± 1%   +0.96%  (p=0.019 n=9+9)
GoTypes         590ms ± 1%      611ms ± 1%   +3.42%  (p=0.000 n=9+10)
Compiler        2.85s ± 0%      2.95s ± 1%   +3.60%  (p=0.000 n=9+10)
SSA             6.42s ± 1%      6.70s ± 1%   +4.31%  (p=0.000 n=10+9)
Flate           113ms ± 2%      117ms ± 1%   +3.11%  (p=0.000 n=10+9)
GoParser        140ms ± 1%      145ms ± 1%   +3.47%  (p=0.000 n=10+9)
Reflect         384ms ± 0%      398ms ± 1%   +3.56%  (p=0.000 n=8+9)
Tar             165ms ± 1%      171ms ± 1%   +3.33%  (p=0.000 n=9+9)
XML             207ms ± 2%      214ms ± 1%   +3.41%  (p=0.000 n=9+9)
StdCmd          11.8s ± 2%      12.4s ± 2%   +4.41%  (p=0.000 n=10+9)
[Geo mean]      489ms           506ms        +3.38%

name        old user-ns/op  new user-ns/op  delta
Template         247M ± 4%       254M ± 4%   +2.76%  (p=0.040 n=10+10)
Unicode          118M ±16%       121M ±11%     ~     (p=0.364 n=10+10)
GoTypes          805M ± 2%       824M ± 2%   +2.37%  (p=0.003 n=9+8)
Compiler        3.92G ± 2%      4.01G ± 2%   +2.20%  (p=0.001 n=9+9)
SSA             9.63G ± 4%     10.00G ± 2%   +3.81%  (p=0.000 n=10+9)
Flate            155M ±10%       154M ± 7%     ~     (p=0.718 n=9+10)
GoParser         184M ±11%       190M ± 7%     ~     (p=0.220 n=10+9)
Reflect          506M ± 4%       528M ± 2%   +4.27%  (p=0.000 n=10+10)
Tar              224M ± 4%       227M ± 5%     ~     (p=0.207 n=10+9)
XML              272M ± 7%       286M ± 4%   +5.23%  (p=0.010 n=10+9)
[Geo mean]       489M            502M        +2.76%

name        old text-bytes  new text-bytes  delta
HelloSize        672k ± 0%       672k ± 0%     ~     (all equal)
CmdGoSize       7.21M ± 0%      7.21M ± 0%     ~     (all equal)
[Geo mean]      2.20M           2.20M        +0.00%

name        old data-bytes  new data-bytes  delta
HelloSize       9.88k ± 0%      9.88k ± 0%     ~     (all equal)
CmdGoSize        248k ± 0%       248k ± 0%     ~     (all equal)
[Geo mean]      49.5k           49.5k        +0.00%

name        old bss-bytes   new bss-bytes   delta
HelloSize        125k ± 0%       125k ± 0%     ~     (all equal)
CmdGoSize        144k ± 0%       144k ± 0%     ~     (all equal)
[Geo mean]       135k            135k        +0.00%

name        old exe-bytes   new exe-bytes   delta
HelloSize       1.10M ± 0%      1.30M ± 0%  +17.82%  (p=0.000 n=10+10)
CmdGoSize       11.6M ± 0%      13.5M ± 0%  +16.90%  (p=0.000 n=10+10)
[Geo mean]      3.57M           4.19M       +17.36%

Change-Id: I250055813cadd25cebee8da1f9a7f995a6eae432
Reviewed-on: https://go-review.googlesource.com/100738
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-03-15 21:34:17 +00:00
Keith Randall
9d4215311b runtime: identify special functions by flag instead of address
When there are plugins, there may not be a unique copy of runtime
functions like goexit, mcall, etc.  So identifying them by entry
address is problematic.  Instead, keep track of each special function
using a field in the symbol table.  That way, multiple copies of
the same runtime function will be treated identically.

Fixes #24351
Fixes #23133

Change-Id: Iea3232df8a6af68509769d9ca618f530cc0f84fd
Reviewed-on: https://go-review.googlesource.com/100739
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-15 17:31:57 +00:00
Tobias Klauser
5fcfe6b6ae runtime: use Android O friendly faccessat syscall on linux/amd64
The Android O seccomp policy disallows the access syscall on amd64, see
https://android.googlesource.com/platform/bionic/+/android-4.2.2_r1.2/libc/SYSCALLS.TXT

Use the faccessat syscall with AT_FDCWD instead to achieve the same
behavior.

Updates #24403

Change-Id: I9db847c1c0f33987a3479b3f96e721fb9588cde2
Reviewed-on: https://go-review.googlesource.com/100877
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-15 10:07:19 +00:00
Josh Bleecher Snyder
c830e05a20 runtime: fix another typo in runtime-gdb.py
tuple, touple,
gdb, gdv,
let's call the whole thing off.

Change-Id: I72d12f6c75061777474e7dec2c90d2a8a3715da6
Reviewed-on: https://go-review.googlesource.com/100836
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-15 02:04:45 +00:00
Josh Bleecher Snyder
183fd6f19b runtime: print goid when throwing for split stack overflow
Change-Id: I66515156c2fc6886312c0eccb86d7ceaf7947042
Reviewed-on: https://go-review.googlesource.com/100465
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-15 00:18:23 +00:00
Josh Bleecher Snyder
ef400ed20a runtime: refactor gdb PC parsing
Change-Id: I91607edaf9c256e6723eb3d6e18c8210eb86b704
Reviewed-on: https://go-review.googlesource.com/100464
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-15 00:18:19 +00:00
Balaram Makam
b46d398887 runtime: improve arm64 memclr implementation
Improve runtime memclr_arm64.s using ZVA feature to zero out memory when n
is at least 64 bytes.

Also add DCZID_EL0 system register to use in MRS instruction.

    Benchmark results of runtime/Memclr on Amberwing:
name          old time/op    new time/op    delta
Memclr/5        12.7ns ± 0%    12.7ns ± 0%      ~     (all equal)
Memclr/16       12.7ns ± 0%    12.2ns ± 1%    -4.13%  (p=0.000 n=7+8)
Memclr/64       14.0ns ± 0%    14.6ns ± 1%    +4.29%  (p=0.000 n=7+8)
Memclr/256      23.7ns ± 0%    25.7ns ± 0%    +8.44%  (p=0.000 n=8+7)
Memclr/4096      204ns ± 0%      74ns ± 0%   -63.71%  (p=0.000 n=8+8)
Memclr/65536    2.89µs ± 0%    0.84µs ± 0%   -70.91%  (p=0.000 n=8+8)
Memclr/1M       45.9µs ± 0%    17.0µs ± 0%   -62.88%  (p=0.000 n=8+8)
Memclr/4M        184µs ± 0%      77µs ± 4%   -57.94%  (p=0.001 n=6+8)
Memclr/8M        367µs ± 0%     144µs ± 1%   -60.72%  (p=0.000 n=7+8)
Memclr/16M       734µs ± 0%     293µs ± 1%   -60.09%  (p=0.000 n=8+8)
Memclr/64M      2.94ms ± 0%    1.23ms ± 0%   -58.06%  (p=0.000 n=7+8)
GoMemclr/5      8.00ns ± 0%    8.79ns ± 0%    +9.83%  (p=0.000 n=8+8)
GoMemclr/16     8.00ns ± 0%    7.60ns ± 0%    -5.00%  (p=0.000 n=8+8)
GoMemclr/64     10.8ns ± 0%    10.4ns ± 0%    -3.70%  (p=0.000 n=8+8)
GoMemclr/256    20.4ns ± 0%    21.2ns ± 0%    +3.92%  (p=0.000 n=8+8)

name          old speed      new speed      delta
Memclr/5       394MB/s ± 0%   393MB/s ± 0%    -0.28%  (p=0.006 n=8+8)
Memclr/16     1.26GB/s ± 0%  1.31GB/s ± 1%    +4.07%  (p=0.000 n=7+8)
Memclr/64     4.57GB/s ± 0%  4.39GB/s ± 2%    -3.91%  (p=0.000 n=7+8)
Memclr/256    10.8GB/s ± 0%  10.0GB/s ± 0%    -7.95%  (p=0.001 n=7+6)
Memclr/4096   20.1GB/s ± 0%  55.3GB/s ± 0%  +175.46%  (p=0.000 n=8+8)
Memclr/65536  22.6GB/s ± 0%  77.8GB/s ± 0%  +243.63%  (p=0.000 n=7+8)
Memclr/1M     22.8GB/s ± 0%  61.5GB/s ± 0%  +169.38%  (p=0.000 n=8+8)
Memclr/4M     22.8GB/s ± 0%  54.3GB/s ± 4%  +137.85%  (p=0.001 n=6+8)
Memclr/8M     22.8GB/s ± 0%  58.1GB/s ± 1%  +154.56%  (p=0.000 n=7+8)
Memclr/16M    22.8GB/s ± 0%  57.2GB/s ± 1%  +150.54%  (p=0.000 n=8+8)
Memclr/64M    22.8GB/s ± 0%  54.4GB/s ± 0%  +138.42%  (p=0.000 n=7+8)
GoMemclr/5     625MB/s ± 0%   569MB/s ± 0%    -8.90%  (p=0.000 n=7+8)
GoMemclr/16   2.00GB/s ± 0%  2.10GB/s ± 0%    +5.26%  (p=0.000 n=8+8)
GoMemclr/64   5.92GB/s ± 0%  6.15GB/s ± 0%    +3.83%  (p=0.000 n=7+8)
GoMemclr/256  12.5GB/s ± 0%  12.1GB/s ± 0%    -3.77%  (p=0.000 n=8+7)

    Benchmark results of runtime/Memclr on Amberwing without ZVA:
name          old time/op    new time/op    delta
Memclr/5        12.7ns ± 0%    12.8ns ± 0%   +0.79%  (p=0.008 n=5+5)
Memclr/16       12.7ns ± 0%    12.7ns ± 0%     ~     (p=0.444 n=5+5)
Memclr/64       14.0ns ± 0%    14.4ns ± 0%   +2.86%  (p=0.008 n=5+5)
Memclr/256      23.7ns ± 1%    19.2ns ± 0%  -19.06%  (p=0.008 n=5+5)
Memclr/4096      203ns ± 0%     119ns ± 0%  -41.38%  (p=0.008 n=5+5)
Memclr/65536    2.89µs ± 0%    1.66µs ± 0%  -42.76%  (p=0.008 n=5+5)
Memclr/1M       45.9µs ± 0%    26.2µs ± 0%  -42.82%  (p=0.008 n=5+5)
Memclr/4M        184µs ± 0%     105µs ± 0%  -42.81%  (p=0.008 n=5+5)
Memclr/8M        367µs ± 0%     210µs ± 0%  -42.76%  (p=0.008 n=5+5)
Memclr/16M       734µs ± 0%     420µs ± 0%  -42.74%  (p=0.008 n=5+5)
Memclr/64M      2.94ms ± 0%    1.69ms ± 0%  -42.46%  (p=0.008 n=5+5)
GoMemclr/5      8.00ns ± 0%    8.40ns ± 0%   +5.00%  (p=0.008 n=5+5)
GoMemclr/16     8.00ns ± 0%    8.40ns ± 0%   +5.00%  (p=0.008 n=5+5)
GoMemclr/64     10.8ns ± 0%     9.6ns ± 0%  -11.02%  (p=0.008 n=5+5)
GoMemclr/256    20.4ns ± 0%    17.2ns ± 0%  -15.69%  (p=0.008 n=5+5)

name          old speed      new speed      delta
Memclr/5       393MB/s ± 0%   391MB/s ± 0%   -0.64%  (p=0.008 n=5+5)
Memclr/16     1.26GB/s ± 0%  1.26GB/s ± 0%   -0.55%  (p=0.008 n=5+5)
Memclr/64     4.57GB/s ± 0%  4.44GB/s ± 0%   -2.79%  (p=0.008 n=5+5)
Memclr/256    10.8GB/s ± 0%  13.3GB/s ± 0%  +23.07%  (p=0.016 n=4+5)
Memclr/4096   20.1GB/s ± 0%  34.3GB/s ± 0%  +70.91%  (p=0.008 n=5+5)
Memclr/65536  22.7GB/s ± 0%  39.6GB/s ± 0%  +74.65%  (p=0.008 n=5+5)
Memclr/1M     22.8GB/s ± 0%  40.0GB/s ± 0%  +74.88%  (p=0.008 n=5+5)
Memclr/4M     22.8GB/s ± 0%  39.9GB/s ± 0%  +74.84%  (p=0.008 n=5+5)
Memclr/8M     22.9GB/s ± 0%  39.9GB/s ± 0%  +74.71%  (p=0.008 n=5+5)
Memclr/16M    22.9GB/s ± 0%  39.9GB/s ± 0%  +74.64%  (p=0.008 n=5+5)
Memclr/64M    22.8GB/s ± 0%  39.7GB/s ± 0%  +73.79%  (p=0.008 n=5+5)
GoMemclr/5     625MB/s ± 0%   595MB/s ± 0%   -4.77%  (p=0.000 n=4+5)
GoMemclr/16   2.00GB/s ± 0%  1.90GB/s ± 0%   -4.77%  (p=0.008 n=5+5)
GoMemclr/64   5.92GB/s ± 0%  6.66GB/s ± 0%  +12.48%  (p=0.016 n=4+5)
GoMemclr/256  12.5GB/s ± 0%  14.9GB/s ± 0%  +18.95%  (p=0.008 n=5+5)

Fixes #22948

Change-Id: Iaae4e22391e25b54d299821bb7f8a81ac3986b93
Reviewed-on: https://go-review.googlesource.com/82055
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-14 18:20:40 +00:00
Tobias Klauser
f0939ba5b1 runtime, syscall: add RawSyscall6 on Solaris and make it panic
The syscall package currently declares RawSyscall6 for every GOOS, but
does not define it on Solaris. This leads to code using said function
to compile but it will not link. Fix it by adding RawSyscall6 and make
it panic.

Also remove the obsolete comment above runtime.syscall_syscall as
pointed out by Aram.

Updates #24357

Change-Id: I1b1423121d1c99de2ecc61cd9a935dba9b39e3a4
Reviewed-on: https://go-review.googlesource.com/100655
Reviewed-by: Aram Hăvărneanu <aram@mgk.ro>
2018-03-14 16:01:39 +00:00
David du Colombier
523f2ea77b runtime: don't use floating point in findnull on Plan 9
In CL 98015, findnull was rewritten so it uses bytes.IndexByte.

This broke the build on plan9/amd64 because the implementation
of bytes.IndexByte on AMD64 relies on SSE instructions while
floating point instructions are not allowed in the note handler.

This change fixes findnull by using the former implementation
on Plan 9, so it doesn't use bytes.IndexByte.

Fixes #24387.

Change-Id: I084d1a44d38d9f77a6c1ad492773f0a98226be16
Reviewed-on: https://go-review.googlesource.com/100577
Run-TryBot: David du Colombier <0intro@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
2018-03-14 13:15:52 +00:00
Josh Bleecher Snyder
4d38d3ae33 runtime: fix typo in gdb script
Change-Id: I9d4b3e25b00724f0e4870c6082671b4f14cc18fc
Reviewed-on: https://go-review.googlesource.com/100463
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-14 05:49:00 +00:00
Josh Bleecher Snyder
6cb064c9c4 Revert "runtime: convert g.waitreason from string to uint8"
This reverts commit 4eea887fd4.

Reason for revert: broke s390x build

Change-Id: Id6c2b6a7319273c4d21f613d4cdd38b00d49f847
Reviewed-on: https://go-review.googlesource.com/100375
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-03-13 15:21:21 +00:00
Josh Bleecher Snyder
4eea887fd4 runtime: convert g.waitreason from string to uint8
Every time I poke at #14921, the g.waitreason string
pointer writes show up.

They're not particularly important performance-wise,
but it'd be nice to clear the noise away.

And it does open up a few extra bytes in the g struct
for some future use.

Change-Id: I7ffbd52fbc2a286931a2218038fda52ed6473cc9
Reviewed-on: https://go-review.googlesource.com/99078
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-12 21:56:50 +00:00
Tobias Klauser
025134b0d1 runtime: simplify range expressions in tests
Generated by running gofmt -s on the files in question.

Change-Id: If6578b150e1bfced8657196d2af01f5d36879f93
Reviewed-on: https://go-review.googlesource.com/100135
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-12 19:58:42 +00:00
Vladimir Kuzmin
7395083136 cmd/compile: avoid extra mapaccess in "m[k] op= r"
Currently, order desugars map assignment operations like

    m[k] op= r

into

    m[k] = m[k] op r

which in turn is transformed during walk into:

    tmp := *mapaccess(m, k)
    tmp = tmp op r
    *mapassign(m, k) = tmp

However, this is suboptimal, as we could instead produce just:

    *mapassign(m, k) op= r

One complication though is if "r == 0", then "m[k] /= r" and "m[k] %=
r" will panic, and they need to do so *before* calling mapassign,
otherwise we may insert a new zero-value element into the map.

It would be spec compliant to just emit the "r != 0" check before
calling mapassign (see #23735), but currently these checks aren't
generated until SSA construction. For now, it's simpler to continue
desugaring /= and %= into two map indexing operations.

Fixes #23661.

Change-Id: I46e3739d9adef10e92b46fdd78b88d5aabe68952
Reviewed-on: https://go-review.googlesource.com/91557
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-12 19:27:44 +00:00
Austin Clements
0def0f2e99 runtime: fix abort handling on arm64
The implementation of runtime.abort on arm64 currently branches to
address 0, which results in a signal from PC 0, rather than from
runtime.abort, so the runtime fails to recognize it as an abort.

Fix runtime.abort on arm64 to read from address 0 like what other
architectures do and recognize this in the signal handler.

Should fix the linux/arm64 build.

Change-Id: I960ab630daaeadc9190287604d4d8337b1ea3853
Reviewed-on: https://go-review.googlesource.com/99895
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-09 22:17:04 +00:00
Ilya Tocar
91102bf723 runtime: use bytes.IndexByte in findnull
bytes.IndexByte is heavily optimized. Use it in findnull.
This is second attempt, similar to CL97523.
In this version we never call IndexByte on region of memory,
that crosses page boundary. A bit slower than CL97523,
but still fast:

name        old time/op  new time/op  delta
GoString-6   164ns ± 2%   118ns ± 0%  -28.00%  (p=0.000 n=10+6)

findnull is also used in gostringnocopy,
which is used in many hot spots in the runtime.

Fixes #23830

Change-Id: Id843dd4f65a34309d92bdd8df229e484d26b0cb2
Reviewed-on: https://go-review.googlesource.com/98015
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-09 19:37:39 +00:00
Ian Lance Taylor
4902778607 runtime: set libcall values for Solaris system calls
This lets SIGPROF signals get a useful traceback.
Without it we just see sysvicallN calling asmcgocall.

Updates #24142

Change-Id: I5dfe3add51f0c3a4cb1c98acb7738be6396214bc
Reviewed-on: https://go-review.googlesource.com/99617
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-09 19:27:20 +00:00
Josh Bleecher Snyder
031f71efdf runtime: add TestSizeof
Borrowed from cmd/compile, TestSizeof ensures
that the size of important types doesn't change unexpectedly.
It also helps reviewers see the impact of intended changes.

Change-Id: If57955f0c3e66054de3f40c6bba585b88694c7be
Reviewed-on: https://go-review.googlesource.com/99837
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-09 17:03:25 +00:00
Tobias Klauser
91f74069ef runtime: fix comment for hwcap on linux/arm
hwcap is set in archauxv, setup_auxv no longer exists.

Change-Id: I0fc9393e0c1c45192e0eff4715e9bdd69fab2653
Reviewed-on: https://go-review.googlesource.com/99779
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-09 16:02:38 +00:00
Austin Clements
5d22cebb12 runtime: explain and enforce that _panic values live on the stack
It's a bit mysterious that _defer.sp is a uintptr that gets
stack-adjusted explicitly while _panic.argp is an unsafe.Pointer that
doesn't, but turns out to be critically important when a deferred
function grows the stack before doing a recover.

Add a comment explaining that this works because _panic values live on
the stack. Enforce this by marking _panic go:notinheap.

Change-Id: I9ca49e84ee1f86d881552c55dccd0662b530836b
Reviewed-on: https://go-review.googlesource.com/99735
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-03-08 23:35:46 +00:00
Austin Clements
60a9e5d613 runtime: ensure abort actually crashes the process
On all non-x86 arches, runtime.abort simply reads from nil.
Unfortunately, if this happens on a user stack, the signal handler
will dutifully turn this into a panicmem, which lets user defers run
and which user code can even recover from.

To fix this, add an explicit check to the signal handler that turns
faults in abort into hard crashes directly in the signal handler. This
has the added benefit of giving a register dump at the abort point.

Change-Id: If26a7f13790745ee3867db7f53b72d8281176d70
Reviewed-on: https://go-review.googlesource.com/93661
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-08 22:55:55 +00:00
Austin Clements
c950a90d72 runtime: call abort instead of raw INT $3 or bad MOV
Everything except for amd64, amd64p32, and 386 currently defines and
uses an abort function. This CL makes these match. The next CL will
recognize the abort function to make this more useful.

Change-Id: I7c155871ea48919a9220417df0630005b444f488
Reviewed-on: https://go-review.googlesource.com/93660
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-08 22:55:54 +00:00
Austin Clements
7f1b2738bb runtime: make throw safer to call
Currently, throw may grow the stack, which means whenever we call it
from a context where it's not safe to grow the stack, we first have to
switch to the system stack. This is pretty easy to get wrong.

Fix this by making throw switch to the system stack so it doesn't grow
the stack and is hence safe to call without a system stack switch at
the call site.

The only thing this complicates is badsystemstack itself, which would
now go into an infinite loop before printing anything (previously it
would also go into an infinite loop, but would at least print the
error first). Fix this by making badsystemstack do a direct write and
then crash hard.

Change-Id: Ic5b4a610df265e47962dcfa341cabac03c31c049
Reviewed-on: https://go-review.googlesource.com/93659
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-08 22:55:52 +00:00
Austin Clements
9d59234cbe runtime: move unrecoverable panic handling to the system stack
Currently parts of unrecoverable panic handling (notably, printing
panic messages) can happen on the user stack. This may grow the stack,
which is generally fine, but if we're handling a runtime panic, it's
better to do as little as possible in case the runtime is in an
inconsistent state.

Hence, this commit rearranges the handling of unrecoverable panics so
that it's done entirely on the system stack.

This is mostly a matter of shuffling code a bit so everything can move
into a systemstack block. The one slight subtlety is in the "panic
during panic" case, where we now depend on startpanic_m's caller to
print the stack rather than startpanic_m itself. To make this work,
startpanic_m now returns a boolean indicating that the caller should
avoid trying to print any panic messages and get right to the stack
trace. Since the caller is already in a position to do this, this
actually simplifies things a little.

Change-Id: Id72febe8c0a9fb31d9369b600a1816d65a49bfed
Reviewed-on: https://go-review.googlesource.com/93658
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-08 22:55:51 +00:00
Ian Lance Taylor
3d69ef37b8 runtime: use systemstack around throw in sysSigaction
Try to fix the build on ppc64-linux and ppc64le-linux, avoiding:

--- FAIL: TestInlinedRoutineRecords (2.12s)
	dwarf_test.go:97: build: # command-line-arguments
		runtime.systemstack: nosplit stack overflow
			752	assumed on entry to runtime.sigtrampgo (nosplit)
			480	after runtime.sigtrampgo (nosplit) uses 272
			400	after runtime.sigfwdgo (nosplit) uses 80
			264	after runtime.setsig (nosplit) uses 136
			208	after runtime.sigaction (nosplit) uses 56
			136	after runtime.sysSigaction (nosplit) uses 72
			88	after runtime.throw (nosplit) uses 48
			16	after runtime.dopanic (nosplit) uses 72
			-16	after runtime.systemstack (nosplit) uses 32

	dwarf_test.go:98: build error: exit status 2
--- FAIL: TestAbstractOriginSanity (10.22s)
	dwarf_test.go:97: build: # command-line-arguments
		runtime.systemstack: nosplit stack overflow
			752	assumed on entry to runtime.sigtrampgo (nosplit)
			480	after runtime.sigtrampgo (nosplit) uses 272
			400	after runtime.sigfwdgo (nosplit) uses 80
			264	after runtime.setsig (nosplit) uses 136
			208	after runtime.sigaction (nosplit) uses 56
			136	after runtime.sysSigaction (nosplit) uses 72
			88	after runtime.throw (nosplit) uses 48
			16	after runtime.dopanic (nosplit) uses 72
			-16	after runtime.systemstack (nosplit) uses 32

	dwarf_test.go:98: build error: exit status 2
FAIL
FAIL	cmd/link/internal/ld	13.404s

Change-Id: I4840604adb0e9f68a8d8e24f2f2a1a17d1634a58
Reviewed-on: https://go-review.googlesource.com/99415
Reviewed-by: Austin Clements <austin@google.com>
2018-03-08 16:35:53 +00:00
Ian Lance Taylor
419c06455a runtime: get traceback from VDSO code
Currently if a profiling signal arrives while executing within a VDSO
the profiler will report _ExternalCode, which is needlessly confusing
for a pure Go program. Change the VDSO calling code to record the
caller's PC/SP, so that we can do a traceback from that point. If that
fails for some reason, report _VDSO rather than _ExternalCode, which
should at least point in the right direction.

This adds some instructions to the code that calls the VDSO, but the
slowdown is reasonably negligible:

name                                  old time/op  new time/op  delta
ClockVDSOAndFallbackPaths/vDSO-8      40.5ns ± 2%  41.3ns ± 1%  +1.85%  (p=0.002 n=10+10)
ClockVDSOAndFallbackPaths/Fallback-8  41.9ns ± 1%  43.5ns ± 1%  +3.84%  (p=0.000 n=9+9)
TimeNow-8                             41.5ns ± 3%  41.5ns ± 2%    ~     (p=0.723 n=10+10)

Fixes #24142

Change-Id: Iacd935db3c4c782150b3809aaa675a71799b1c9c
Reviewed-on: https://go-review.googlesource.com/97315
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-07 23:35:25 +00:00
Ian Lance Taylor
c2f28de732 runtime: change from rt_sigaction to sigaction
This normalizes the Linux code to act like other targets. The size
argument to the rt_sigaction system call is pushed to a single
function, sysSigaction.

This is intended as a simplification step for CL 93875 for #14327.

Change-Id: I594788e235f0da20e16e8a028e27ac8c883907c4
Reviewed-on: https://go-review.googlesource.com/99077
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-07 23:30:02 +00:00
Elias Naur
7a2a96d6ad runtime/cgo: make sure nil is undefined before defining it
While working on standalone builds of gomobile bindings, I ran into
errors on the form:

gcc_darwin_arm.c:30:31: error: ambiguous expansion of macro 'nil' [-Werror,-Wambiguous-macro]
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS11.2.sdk/usr/include/MacTypes.h:94:15: note: expanding this definition of 'nil'

Fix it by undefining nil before defining it in libcgo.h.

Change-Id: I8e9660a68c6c351e592684d03d529f0d182c0493
Reviewed-on: https://go-review.googlesource.com/99215
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-07 21:08:19 +00:00
Yuval Pavel Zholkover
083f3957b8 runtime: add missing build constraints to os_linux_{be64,noauxv,novdso,ppc64x}.go files
They do not match the file name patterns of
  *_GOOS
  *_GOARCH
  *_GOOS_GOARCH
therefore the implicit linux constraint was not being added.

Change-Id: Ie506c51cee6818db445516f96fffaa351df62cf5
Reviewed-on: https://go-review.googlesource.com/99116
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-07 14:26:19 +00:00
Matthew Dempsky
2c0c68d621 cmd/compile: fix miscompilation of "defer delete(m, k)"
Previously, for slow map key types (i.e., any type other than a 32-bit
or 64-bit plain memory type), we would rewrite

    defer delete(m, k)

into

    ktmp := k
    defer delete(m, &ktmp)

However, if the defer statement was inside a loop, we would end up
reusing the same ktmp value for all of the deferred deletes.

We already rewrite

    defer print(x, y, z)

into

    defer func(a1, a2, a3) {
        print(a1, a2, a3)
    }(x, y, z)

This CL generalizes this rewrite to also apply for slow map deletes.

This could be extended to apply even more generally to other builtins,
but as discussed on #24259, there are cases where we must *not* do
this (e.g., "defer recover()"). However, if we elect to do this more
generally, this CL should still make that easier.

Lastly, while here, fix a few isues in wrapCall (nee walkprintfunc):

1) lookupN appends the generation number to the symbol anyway, so "%d"
was being literally included in the generated function names.

2) walkstmt will be called when the function is compiled later anyway,
so no need to do it now.

Fixes #24259.

Change-Id: I70286867c64c69c18e9552f69e3f4154a0fc8b04
Reviewed-on: https://go-review.googlesource.com/99017
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-06 23:33:28 +00:00
Josh Bleecher Snyder
f7739c07c8 runtime: skip pointless writes in freedefer
Change-Id: I501a0e5c87ec88616c7dcdf1b723758b6df6c088
Reviewed-on: https://go-review.googlesource.com/98758
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-06 18:58:57 +00:00
Tobias Klauser
9745397e1d runtime: fix stack switch check in walltime/nanotime on linux/arm
CL 98095 got the check wrong. We should be testing
'getg() == getg().m.curg', not 'getg().m == getg().m.curg'.

Change-Id: I32f6238b00409b67afa8efe732513d542aec5bc7
Reviewed-on: https://go-review.googlesource.com/98855
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-06 14:24:19 +00:00
Meng Zhuo
8916773a3d runtime, cmd/compile: use ldp for DUFFCOPY on ARM64
name         old time/op  new time/op  delta
CopyFat8     2.15ns ± 1%  2.19ns ± 6%     ~     (p=0.171 n=8+9)
CopyFat12    2.15ns ± 0%  2.17ns ± 2%     ~     (p=0.137 n=8+10)
CopyFat16    2.17ns ± 3%  2.15ns ± 0%     ~     (p=0.211 n=10+10)
CopyFat24    2.16ns ± 1%  2.15ns ± 0%     ~     (p=0.087 n=10+10)
CopyFat32    11.5ns ± 0%  12.8ns ± 2%  +10.87%  (p=0.000 n=8+10)
CopyFat64    20.2ns ± 2%  12.9ns ± 0%  -36.11%  (p=0.000 n=10+10)
CopyFat128   37.2ns ± 0%  21.5ns ± 0%  -42.20%  (p=0.000 n=10+10)
CopyFat256   71.6ns ± 0%  38.7ns ± 0%  -45.95%  (p=0.000 n=10+10)
CopyFat512    140ns ± 0%    73ns ± 0%  -47.86%  (p=0.000 n=10+9)
CopyFat520    142ns ± 0%    74ns ± 0%  -47.54%  (p=0.000 n=10+10)
CopyFat1024   277ns ± 0%   141ns ± 0%  -49.10%  (p=0.000 n=10+10)

Change-Id: If54bc571add5db674d5e081579c87e80153d0a5a
Reviewed-on: https://go-review.googlesource.com/97395
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-06 04:14:59 +00:00
Hana Kim
d3946f75d3 internal/trace: remove backlinks from span/task end to start
This is an updated version of golang.org/cl/96395, with the fix to
TestUserSpan.

This reverts commit 7b6f6267e90a8e4eab37a3f2164ba882e6222adb.

Change-Id: I31eec8ba0997f9178dffef8dac608e731ab70872
Reviewed-on: https://go-review.googlesource.com/98236
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-03-05 20:10:22 +00:00
Ian Lance Taylor
7178267b59 runtime: rename vdso symbols to use camel case
This was originally C code using names with underscores, which were
retained when the code was rewritten into Go. Change the code to use
Go-like camel case names.

The names that come from the ELF ABI are left unchanged.

Change-Id: I181bc5dd81284c07bc67b7df4635f4734b41d646
Reviewed-on: https://go-review.googlesource.com/98520
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-05 19:12:32 +00:00
Tobias Klauser
5f80e70912 runtime: remove unused SYS_* definitions on Linux
Also fix the indentation of the SYS_* definitions in sys_linux_mipsx.s
and order them numerically.

Change-Id: I0c454301c329a163e7db09dcb25d4e825149858c
Reviewed-on: https://go-review.googlesource.com/98448
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-05 18:32:08 +00:00
Keith Randall
ee58eccc56 internal/bytealg: move short string Index implementations into bytealg
Also move the arm64 CountByte implementation while we're here.

Fixes #19792

Change-Id: I1e0fdf1e03e3135af84150a2703b58dad1b0d57e
Reviewed-on: https://go-review.googlesource.com/98518
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-04 19:49:44 +00:00
Keith Randall
f6332bb84a internal/bytealg: move compare functions to bytealg
Move bytes.Compare and runtime·cmpstring to bytealg.

Update #19792

Change-Id: I139e6d7c59686bef7a3017e3dec99eba5fd10447
Reviewed-on: https://go-review.googlesource.com/98515
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-04 17:49:39 +00:00
Keith Randall
45964e4f9c internal/bytealg: move Count to bytealg
Move bytes.Count and strings.Count to bytealg.

Update #19792

Change-Id: I3e4e14b504a0b71758885bb131e5656e342cf8cb
Reviewed-on: https://go-review.googlesource.com/98495
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-04 17:49:25 +00:00
Tobias Klauser
51b027116c runtime: use vDSO for clock_gettime on linux/arm
Use the __vdso_clock_gettime fast path via the vDSO on linux/arm to
speed up nanotime and walltime. This results in the following
performance improvement for time.Now on a RaspberryPi 3 (running
32bit Raspbian, i.e. GOOS=linux/GOARCH=arm):

name     old time/op  new time/op  delta
TimeNow  0.99µs ± 0%  0.39µs ± 1%  -60.74%  (p=0.000 n=12+20)

Change-Id: I3598278a6c88d7f6a6ce66c56b9d25f9dd2f4c9a
Reviewed-on: https://go-review.googlesource.com/98095
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-03 12:12:58 +00:00
Tobias Klauser
c69f60d071 runtime: remove unused __vdso_time_sym
It's unused since https://golang.org/cl/99320043

Change-Id: I74d69ff894aa2fb556f1c2083406c118c559d91b
Reviewed-on: https://go-review.googlesource.com/98195
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-03 12:11:38 +00:00
Keith Randall
1dfa380e3d internal/bytealg: move equal functions to bytealg
Move bytes.Equal, runtime.memequal, and runtime.memequal_varlen
to the bytealg package.

Update #19792

Change-Id: Ic4175e952936016ea0bda6c7c3dbb33afdc8e4ac
Reviewed-on: https://go-review.googlesource.com/98355
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-03 04:18:27 +00:00
Keith Randall
403ab0f221 internal/bytealg: move IndexByte asssembly to the new bytealg package
Move the IndexByte function from the runtime to a new bytealg package.
The new package will eventually hold all the optimized assembly for
groveling through byte slices and strings. It seems a better home for
this code than randomly keeping it in runtime.

Once this is in, the next step is to move the other functions
(Compare, Equal, ...).

Update #19792

This change seems complicated enough that we might just declare
"not worth it" and abandon.  Opinions welcome.

The core assembly is all unchanged, except minor modifications where
the code reads cpu feature bits.

The wrapper functions have been cleaned up as they are now actually
checked by vet.

Change-Id: I9fa75bee5d85db3a65b3fd3b7997e60367523796
Reviewed-on: https://go-review.googlesource.com/98016
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-02 22:46:15 +00:00
Zhou Peng
b77aad0891 runtime: fix typo, func comments should start with function name
Change-Id: I289af4884583537639800e37928c22814d38cba9
Reviewed-on: https://go-review.googlesource.com/98115
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
2018-03-02 12:03:30 +00:00
Brad Fitzpatrick
1fadbc1a76 Revert "runtime: use bytes.IndexByte in findnull"
This reverts commit 7365fac2db.

Reason for revert: breaks the build on some architectures, reading unmapped pages?

Change-Id: I3a8c02dc0b649269faacea79ecd8213defa97c54
Reviewed-on: https://go-review.googlesource.com/97995
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-01 22:22:51 +00:00
Balaram Makam
213a75171d runtime: improve arm64 memmove implementation
Improve runtime memmove_arm64.s specializing for small copies and
processing 32 bytes per iteration for 32 bytes or more.

Benchmark results of runtime/Memmove on Amberwing:
name                      old time/op    new time/op     delta
Memmove/0                   7.61ns ± 0%     7.20ns ± 0%     ~     (p=0.053 n=5+7)
Memmove/1                   9.28ns ± 0%     8.80ns ± 0%   -5.17%  (p=0.000 n=4+8)
Memmove/2                   9.65ns ± 0%     9.20ns ± 0%   -4.68%  (p=0.000 n=5+8)
Memmove/3                   10.0ns ± 0%      9.2ns ± 0%   -7.83%  (p=0.000 n=5+8)
Memmove/4                   10.6ns ± 0%      9.2ns ± 0%  -13.21%  (p=0.000 n=5+8)
Memmove/5                   11.0ns ± 0%      9.2ns ± 0%  -16.36%  (p=0.000 n=5+8)
Memmove/6                   12.4ns ± 0%      9.2ns ± 0%  -25.81%  (p=0.000 n=5+8)
Memmove/7                   13.1ns ± 0%      9.2ns ± 0%  -29.56%  (p=0.000 n=5+8)
Memmove/8                   9.10ns ± 1%     9.20ns ± 0%   +1.08%  (p=0.002 n=5+8)
Memmove/9                   9.67ns ± 0%     9.20ns ± 0%   -4.88%  (p=0.000 n=5+8)
Memmove/10                  10.4ns ± 0%      9.2ns ± 0%  -11.54%  (p=0.000 n=5+8)
Memmove/11                  10.9ns ± 0%      9.2ns ± 0%  -15.60%  (p=0.000 n=5+8)
Memmove/12                  11.5ns ± 0%      9.2ns ± 0%  -20.00%  (p=0.000 n=5+8)
Memmove/13                  12.4ns ± 0%      9.2ns ± 0%  -25.81%  (p=0.000 n=5+8)
Memmove/14                  13.1ns ± 0%      9.2ns ± 0%  -29.77%  (p=0.000 n=5+8)
Memmove/15                  13.8ns ± 0%      9.2ns ± 0%  -33.33%  (p=0.000 n=5+8)
Memmove/16                  9.70ns ± 0%     9.20ns ± 0%   -5.19%  (p=0.000 n=5+8)
Memmove/32                  10.6ns ± 0%      9.2ns ± 0%  -13.21%  (p=0.000 n=4+8)
Memmove/64                  13.4ns ± 0%     10.2ns ± 0%  -23.88%  (p=0.000 n=4+8)
Memmove/128                 18.1ns ± 1%     13.2ns ± 0%  -26.99%  (p=0.000 n=5+8)
Memmove/256                 25.2ns ± 0%     16.4ns ± 0%  -34.92%  (p=0.000 n=5+8)
Memmove/512                 36.4ns ± 0%     22.8ns ± 0%  -37.36%  (p=0.000 n=5+8)
Memmove/1024                70.1ns ± 0%     36.8ns ±11%  -47.49%  (p=0.002 n=5+8)
Memmove/2048                 121ns ± 0%       61ns ± 0%     ~     (p=0.053 n=5+7)
Memmove/4096                 224ns ± 0%      120ns ± 0%  -46.43%  (p=0.000 n=5+8)
MemmoveUnalignedDst/0       8.40ns ± 0%     8.00ns ± 0%   -4.76%  (p=0.000 n=5+8)
MemmoveUnalignedDst/1       9.87ns ± 1%    10.00ns ± 0%     ~     (p=0.070 n=5+8)
MemmoveUnalignedDst/2       10.6ns ± 0%     10.4ns ± 0%   -1.89%  (p=0.000 n=5+8)
MemmoveUnalignedDst/3       10.8ns ± 0%     10.4ns ± 0%   -3.70%  (p=0.000 n=5+8)
MemmoveUnalignedDst/4       10.9ns ± 0%     10.3ns ± 0%     ~     (p=0.053 n=5+7)
MemmoveUnalignedDst/5       11.5ns ± 0%     10.3ns ± 1%  -10.22%  (p=0.000 n=4+8)
MemmoveUnalignedDst/6       13.2ns ± 0%     10.4ns ± 1%  -21.50%  (p=0.000 n=5+8)
MemmoveUnalignedDst/7       13.7ns ± 0%     10.3ns ± 1%  -24.64%  (p=0.000 n=4+8)
MemmoveUnalignedDst/8       10.1ns ± 0%     10.4ns ± 0%   +2.97%  (p=0.002 n=5+8)
MemmoveUnalignedDst/9       10.7ns ± 0%     10.4ns ± 0%   -2.80%  (p=0.000 n=5+8)
MemmoveUnalignedDst/10      11.2ns ± 1%     10.4ns ± 0%   -6.81%  (p=0.000 n=5+8)
MemmoveUnalignedDst/11      11.6ns ± 0%     10.4ns ± 0%  -10.34%  (p=0.000 n=5+8)
MemmoveUnalignedDst/12      12.5ns ± 2%     10.4ns ± 0%  -16.53%  (p=0.000 n=5+8)
MemmoveUnalignedDst/13      13.7ns ± 0%     10.4ns ± 0%  -24.09%  (p=0.000 n=5+8)
MemmoveUnalignedDst/14      14.0ns ± 0%     10.4ns ± 0%  -25.71%  (p=0.000 n=5+8)
MemmoveUnalignedDst/15      14.6ns ± 0%     10.4ns ± 0%  -28.77%  (p=0.000 n=5+8)
MemmoveUnalignedDst/16      10.5ns ± 0%     10.4ns ± 0%   -0.95%  (p=0.000 n=5+8)
MemmoveUnalignedDst/32      12.4ns ± 0%     11.6ns ± 0%   -6.05%  (p=0.000 n=5+8)
MemmoveUnalignedDst/64      15.2ns ± 0%     12.3ns ± 0%  -19.08%  (p=0.000 n=5+8)
MemmoveUnalignedDst/128     18.7ns ± 0%     15.2ns ± 0%  -18.72%  (p=0.000 n=5+8)
MemmoveUnalignedDst/256     25.1ns ± 0%     18.6ns ± 0%  -25.90%  (p=0.000 n=5+8)
MemmoveUnalignedDst/512     37.8ns ± 0%     24.4ns ± 0%  -35.45%  (p=0.000 n=5+8)
MemmoveUnalignedDst/1024    74.6ns ± 0%     40.4ns ± 0%     ~     (p=0.053 n=5+7)
MemmoveUnalignedDst/2048     133ns ± 0%       75ns ± 0%  -43.91%  (p=0.000 n=5+8)
MemmoveUnalignedDst/4096     247ns ± 0%      141ns ± 0%  -42.91%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/0       8.40ns ± 0%     8.00ns ± 0%   -4.76%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/1       9.81ns ± 0%    10.00ns ± 0%   +1.98%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/2       10.5ns ± 0%     10.0ns ± 0%   -4.76%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/3       10.7ns ± 1%     10.0ns ± 0%   -6.89%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/4       11.3ns ± 0%     10.0ns ± 0%  -11.50%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/5       11.6ns ± 0%     10.0ns ± 0%  -13.79%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/6       13.6ns ± 0%     10.0ns ± 0%  -26.47%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/7       14.4ns ± 0%     10.0ns ± 0%  -30.75%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/8       9.87ns ± 1%    10.00ns ± 0%     ~     (p=0.070 n=5+8)
MemmoveUnalignedSrc/9       10.4ns ± 0%     10.0ns ± 0%   -3.85%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/10      11.2ns ± 0%     10.0ns ± 0%  -10.71%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/11      11.8ns ± 0%     10.0ns ± 0%  -15.25%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/12      12.1ns ± 0%     10.0ns ± 0%  -17.36%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/13      13.6ns ± 0%     10.0ns ± 0%  -26.47%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/14      14.7ns ± 0%     10.0ns ± 0%  -31.79%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/15      14.4ns ± 0%     10.0ns ± 0%  -30.56%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/16      11.0ns ± 0%     10.0ns ± 0%   -9.09%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/32      11.5ns ± 0%     10.0ns ± 0%  -13.04%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/64      14.9ns ± 0%     11.2ns ± 0%  -24.83%  (p=0.000 n=4+8)
MemmoveUnalignedSrc/128     19.5ns ± 0%     15.2ns ± 0%  -22.05%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/256     27.3ns ± 2%     19.2ns ± 0%  -29.62%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/512     40.4ns ± 0%     27.2ns ± 0%  -32.67%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/1024    75.4ns ± 0%     44.4ns ± 0%  -41.15%  (p=0.000 n=5+8)
MemmoveUnalignedSrc/2048     131ns ± 0%       77ns ± 3%  -41.56%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/4096     248ns ± 0%      145ns ± 0%  -41.53%  (p=0.000 n=5+8)

name                      old speed      new speed       delta
Memmove/1                  108MB/s ± 0%    114MB/s ± 0%   +5.37%  (p=0.004 n=4+8)
Memmove/2                  207MB/s ± 0%    217MB/s ± 0%   +4.85%  (p=0.002 n=5+8)
Memmove/3                  301MB/s ± 0%    326MB/s ± 0%   +8.45%  (p=0.002 n=5+8)
Memmove/4                  377MB/s ± 0%    435MB/s ± 0%  +15.31%  (p=0.004 n=4+8)
Memmove/5                  455MB/s ± 0%    543MB/s ± 0%  +19.46%  (p=0.002 n=5+8)
Memmove/6                  483MB/s ± 0%    652MB/s ± 0%  +34.88%  (p=0.003 n=5+7)
Memmove/7                  537MB/s ± 0%    761MB/s ± 0%  +41.71%  (p=0.002 n=5+8)
Memmove/8                  879MB/s ± 1%    869MB/s ± 0%   -1.15%  (p=0.000 n=5+7)
Memmove/9                  931MB/s ± 0%    978MB/s ± 0%   +5.05%  (p=0.002 n=5+8)
Memmove/10                 960MB/s ± 0%   1086MB/s ± 0%  +13.13%  (p=0.002 n=5+8)
Memmove/11                1.00GB/s ± 0%   1.20GB/s ± 0%  +18.92%  (p=0.003 n=5+7)
Memmove/12                1.04GB/s ± 0%   1.30GB/s ± 0%  +25.40%  (p=0.002 n=5+8)
Memmove/13                1.05GB/s ± 0%   1.41GB/s ± 0%  +34.87%  (p=0.002 n=5+8)
Memmove/14                1.07GB/s ± 0%   1.52GB/s ± 0%  +42.14%  (p=0.002 n=5+8)
Memmove/15                1.09GB/s ± 0%   1.63GB/s ± 0%  +49.91%  (p=0.002 n=5+8)
Memmove/16                1.65GB/s ± 0%   1.74GB/s ± 0%   +5.40%  (p=0.003 n=5+7)
Memmove/32                3.01GB/s ± 0%   3.48GB/s ± 0%  +15.58%  (p=0.003 n=5+7)
Memmove/64                4.76GB/s ± 0%   6.27GB/s ± 0%  +31.75%  (p=0.003 n=5+7)
Memmove/128               7.08GB/s ± 1%   9.69GB/s ± 0%  +36.96%  (p=0.002 n=5+8)
Memmove/256               10.2GB/s ± 0%   15.6GB/s ± 0%  +53.58%  (p=0.002 n=5+8)
Memmove/512               14.1GB/s ± 0%   22.4GB/s ± 0%  +59.57%  (p=0.003 n=5+7)
Memmove/1024              14.6GB/s ± 0%   27.9GB/s ±10%  +91.00%  (p=0.002 n=5+8)
Memmove/2048              16.9GB/s ± 0%   33.4GB/s ± 0%  +98.32%  (p=0.003 n=5+7)
Memmove/4096              18.3GB/s ± 0%   33.9GB/s ± 0%  +85.80%  (p=0.002 n=5+8)
MemmoveUnalignedDst/1      101MB/s ± 1%    100MB/s ± 0%     ~     (p=0.586 n=5+8)
MemmoveUnalignedDst/2      189MB/s ± 0%    192MB/s ± 0%   +1.82%  (p=0.002 n=5+8)
MemmoveUnalignedDst/3      278MB/s ± 0%    288MB/s ± 0%   +3.88%  (p=0.003 n=5+7)
MemmoveUnalignedDst/4      368MB/s ± 0%    387MB/s ± 0%   +5.41%  (p=0.003 n=5+7)
MemmoveUnalignedDst/5      434MB/s ± 0%    484MB/s ± 0%  +11.52%  (p=0.002 n=5+8)
MemmoveUnalignedDst/6      454MB/s ± 0%    580MB/s ± 0%  +27.62%  (p=0.002 n=5+8)
MemmoveUnalignedDst/7      509MB/s ± 0%    677MB/s ± 0%  +33.01%  (p=0.002 n=5+8)
MemmoveUnalignedDst/8      792MB/s ± 0%    770MB/s ± 0%   -2.77%  (p=0.002 n=5+8)
MemmoveUnalignedDst/9      841MB/s ± 0%    866MB/s ± 0%   +2.92%  (p=0.002 n=5+8)
MemmoveUnalignedDst/10     896MB/s ± 0%    962MB/s ± 0%   +7.35%  (p=0.003 n=5+7)
MemmoveUnalignedDst/11     947MB/s ± 0%   1058MB/s ± 0%  +11.80%  (p=0.002 n=5+8)
MemmoveUnalignedDst/12     962MB/s ± 2%   1154MB/s ± 0%  +19.97%  (p=0.002 n=5+8)
MemmoveUnalignedDst/13     947MB/s ± 0%   1251MB/s ± 0%  +32.08%  (p=0.002 n=5+8)
MemmoveUnalignedDst/14    1.00GB/s ± 0%   1.35GB/s ± 0%  +34.55%  (p=0.002 n=5+8)
MemmoveUnalignedDst/15    1.03GB/s ± 0%   1.44GB/s ± 0%  +40.50%  (p=0.002 n=5+8)
MemmoveUnalignedDst/16    1.53GB/s ± 0%   1.54GB/s ± 0%   +0.77%  (p=0.002 n=5+8)
MemmoveUnalignedDst/32    2.58GB/s ± 0%   2.75GB/s ± 0%   +6.52%  (p=0.003 n=5+7)
MemmoveUnalignedDst/64    4.21GB/s ± 0%   5.19GB/s ± 0%  +23.40%  (p=0.004 n=5+6)
MemmoveUnalignedDst/128   6.86GB/s ± 0%   8.42GB/s ± 0%  +22.78%  (p=0.003 n=5+7)
MemmoveUnalignedDst/256   10.2GB/s ± 0%   13.8GB/s ± 0%  +35.15%  (p=0.002 n=5+8)
MemmoveUnalignedDst/512   13.5GB/s ± 0%   21.0GB/s ± 0%  +54.90%  (p=0.002 n=5+8)
MemmoveUnalignedDst/1024  13.7GB/s ± 0%   25.3GB/s ± 0%  +84.61%  (p=0.003 n=5+7)
MemmoveUnalignedDst/2048  15.3GB/s ± 0%   27.5GB/s ± 0%  +79.52%  (p=0.002 n=5+8)
MemmoveUnalignedDst/4096  16.5GB/s ± 0%   28.9GB/s ± 0%  +74.74%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/1      102MB/s ± 0%    100MB/s ± 0%   -2.02%  (p=0.000 n=5+7)
MemmoveUnalignedSrc/2      191MB/s ± 0%    200MB/s ± 0%   +4.78%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/3      279MB/s ± 0%    300MB/s ± 0%   +7.45%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/4      354MB/s ± 0%    400MB/s ± 0%  +13.10%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/5      431MB/s ± 0%    500MB/s ± 0%  +16.02%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/6      441MB/s ± 0%    600MB/s ± 0%  +36.03%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/7      485MB/s ± 0%    700MB/s ± 0%  +44.29%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/8      811MB/s ± 1%    800MB/s ± 0%   -1.36%  (p=0.016 n=5+8)
MemmoveUnalignedSrc/9      864MB/s ± 0%    900MB/s ± 0%   +4.07%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/10     893MB/s ± 0%    999MB/s ± 0%  +11.97%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/11     932MB/s ± 0%   1099MB/s ± 0%  +18.01%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/12     988MB/s ± 0%   1199MB/s ± 0%  +21.35%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/13     955MB/s ± 0%   1299MB/s ± 0%  +36.02%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/14     955MB/s ± 0%   1399MB/s ± 0%  +46.52%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/15    1.04GB/s ± 0%   1.50GB/s ± 0%  +44.18%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/16    1.45GB/s ± 0%   1.60GB/s ± 0%  +10.14%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/32    2.78GB/s ± 0%   3.20GB/s ± 0%  +15.16%  (p=0.003 n=5+7)
MemmoveUnalignedSrc/64    4.30GB/s ± 0%   5.72GB/s ± 0%  +32.90%  (p=0.003 n=5+7)
MemmoveUnalignedSrc/128   6.57GB/s ± 0%   8.42GB/s ± 0%  +28.06%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/256   9.39GB/s ± 1%  13.33GB/s ± 0%  +41.96%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/512   12.7GB/s ± 0%   18.8GB/s ± 0%  +48.53%  (p=0.003 n=5+7)
MemmoveUnalignedSrc/1024  13.6GB/s ± 0%   23.0GB/s ± 0%  +69.82%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/2048  15.6GB/s ± 0%   26.8GB/s ± 3%  +71.37%  (p=0.002 n=5+8)
MemmoveUnalignedSrc/4096  16.5GB/s ± 0%   28.2GB/s ± 0%  +71.40%  (p=0.002 n=5+8)

Fixes #22925

Change-Id: I38c1a9ad5c6e3f4f95fc521c4b7e3140b58b4737
Reviewed-on: https://go-review.googlesource.com/83799
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-01 20:34:11 +00:00
Josh Bleecher Snyder
7365fac2db runtime: use bytes.IndexByte in findnull
bytes.IndexByte is heavily optimized.
Use it in findnull.

name        old time/op  new time/op  delta
GoString-8  65.5ns ± 1%  40.2ns ± 1%  -38.62%  (p=0.000 n=19+19)

findnull is also used in gostringnocopy,
which is used in many hot spots in the runtime.

Fixes #23830

Change-Id: I2e6cb279c7d8078f8844065de684cc3567fe89d7
Reviewed-on: https://go-review.googlesource.com/97523
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-01 20:34:07 +00:00
Hana Kim
e75f805e6f runtime/trace: skip TestUserTaskSpan upon timestamp error
Change-Id: I030baaa0a0abf1e43449faaf676d389a28a868a3
Reviewed-on: https://go-review.googlesource.com/97857
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Peter Weinberger <pjw@google.com>
2018-03-01 18:38:49 +00:00
Josh Bleecher Snyder
9372e3f5ef runtime: don't allocate to build strings of length 1
Use staticbytes instead.
Instrumenting make.bash shows approx 0.5%
of all slicebytetostrings have a buffer of length 1.

name                     old time/op  new time/op  delta
SliceByteToString/1-8    14.1ns ± 1%   4.1ns ± 1%  -71.13%  (p=0.000 n=17+20)
SliceByteToString/2-8    15.5ns ± 2%  15.5ns ± 1%     ~     (p=0.061 n=20+18)
SliceByteToString/4-8    14.9ns ± 1%  15.0ns ± 2%   +1.25%  (p=0.000 n=20+20)
SliceByteToString/8-8    17.1ns ± 1%  17.5ns ± 1%   +2.16%  (p=0.000 n=19+19)
SliceByteToString/16-8   23.6ns ± 1%  23.9ns ± 1%   +1.41%  (p=0.000 n=20+18)
SliceByteToString/32-8   26.0ns ± 1%  25.8ns ± 0%   -1.05%  (p=0.000 n=19+16)
SliceByteToString/64-8   30.0ns ± 0%  30.2ns ± 0%   +0.56%  (p=0.000 n=16+18)
SliceByteToString/128-8  38.9ns ± 0%  39.0ns ± 0%   +0.23%  (p=0.019 n=19+15)

Fixes #24172

Change-Id: I3dfa14eefbf9fb4387114e20c9cb40e186abe962
Reviewed-on: https://go-review.googlesource.com/97717
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-01 17:38:06 +00:00
Josh Bleecher Snyder
aa9c1a8f80 runtime: fix amd64p32 indexbytes in presence of overflow
When the slice/string length is very large,
probably artifically large as in CL 97523,
adding BX (length) to R11 (pointer) overflows.
As a result, checking DI < R11 yields the wrong result.
Since they will be equal when the loop is done,
just check DI != R11 instead.
Yes, the pointer itself could overflow, but if that happens,
something else has gone pretty wrong; not our concern here.

Fixes #24187

Change-Id: I2f60fc6ccae739345d01bc80528560726ad4f8c6
Reviewed-on: https://go-review.googlesource.com/97802
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-01 16:53:33 +00:00
Tobias Klauser
c7c01efd96 runtime: clean up libc_* definitions on Solaris
All functions defined in syscall2_solaris.go have the respective libc_*
var in syscall_solaris.go, except for libc_close. Move it from
os3_solaris.go

Remove unused libc_fstat.

Order go:cgo_import_dynamic and go:linkname lists in
syscall2_solaris.go alphabetically.

Change-Id: I9f12fa473cf1ae351448ac45597c82a67d799c31
Reviewed-on: https://go-review.googlesource.com/97736
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-01 07:31:53 +00:00
Richard Miller
c2cdfbd1a7 runtime: don't try to shrink address space with brk in Plan 9
Plan 9 won't let brk shrink the data segment if it's shared with
other processes (which it is in the go runtime).  So we keep track
of the notional end of the segment as it moves up and down, and
call brk only when it grows.

Corrects CL 94776.

Updates #23860.
Fixes #24013.

Change-Id: I754232decab81dfd71d690f77ee6097a17d9be11
Reviewed-on: https://go-review.googlesource.com/97595
Reviewed-by: David du Colombier <0intro@gmail.com>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: David du Colombier <0intro@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-28 15:57:10 +00:00
Keith Randall
2413b54888 cmd/compile: mark the first word of an interface as a uintptr
The first word of an interface is a pointer, but for the purposes
of GC we don't need to treat it as such.
 1. If it is a non-empty interface, the pointer points to an itab
    which is always in persistentalloc space.
 2. If it is an empty interface, the pointer points to a _type.
   a. If it is a compile-time-allocated type, it points into
      the read-only data section.
   b. If it is a reflect-allocated type, it points into the Go heap.
      Reflect is responsible for keeping a reference to
      the underlying type so it won't be GCd.

If we ever have a moving GC, we need to change this for 2b (as
well as scan itabs to update their itab._type fields).

Write barriers on the first word of interfaces have already been removed.

Change-Id: I643e91d7ac4de980ac2717436eff94097c65d959
Reviewed-on: https://go-review.googlesource.com/97518
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-02-27 22:58:32 +00:00
Tobias Klauser
5b21bf6f81 runtime: simplify walltime/nanotime on linux/{386,amd64}
Avoid an unnecessary MOVL/MOVQ.

Follow CL 97377

Change-Id: Ic43976d6b0cece3ed455496d18aedd67e0337d3f
Reviewed-on: https://go-review.googlesource.com/97358
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-02-27 18:42:41 +00:00
Josh Bleecher Snyder
c5d6c42d35 runtime: improve 386/amd64 systemstack
Minor improvements, noticed while investigating other things.

Shorten the prologue.

Make branch direction better for static branch prediction;
the most common case by far is switching stacks (g==curg).

Change-Id: Ib2211d3efecb60446355cda56194221ccb78057d
Reviewed-on: https://go-review.googlesource.com/97377
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-02-27 18:10:38 +00:00
Josh Bleecher Snyder
486caa26d7 runtime: short-circuit typedmemmove when dst==src
Change-Id: I855268a4c0d07ad602ec90f5da66422d3d87c5f2
Reviewed-on: https://go-review.googlesource.com/94595
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-02-27 00:56:18 +00:00
Ian Lance Taylor
804e3e565e runtime: don't check for String/Error methods in printany
They have either already been called by preprintpanics, or they can
not be called safely because of the various conditions checked at the
start of gopanic.

Fixes #24059

Change-Id: I4a6233d12c9f7aaaee72f343257ea108bae79241
Reviewed-on: https://go-review.googlesource.com/96755
Reviewed-by: Austin Clements <austin@google.com>
2018-02-23 22:39:46 +00:00
Austin Clements
788464724c runtime: reduce arena size to 4MB on 64-bit Windows
Currently, we use 64MB heap arenas on 64-bit platforms. This works
well on UNIX-like OSes because they treat untouched pages as
essentially free. However, on Windows, committed memory is charged
against a process whether or not it has demand-faulted physical pages
in. Hence, on Windows, even a process with a tiny heap will commit
64MB for one heap arena, plus another 32MB for the arena map. Things
are much worse under the race detector, which increases the heap
commitment by a factor of 5.5X, leading to 384MB of committed memory
at runtime init.

Fix this by reducing the heap arena size to 4MB on Windows.

To counterbalance the effect of increasing the arena map size by a
factor of 16, and to further reduce the impact of the commitment for
the arena map, we switch from a single entry L1 arena map to a 64
entry L1 arena map.

Compared to the original arena design, this slows down the
x/benchmarks garbage benchmark by 0.49% (the slow down of this commit
alone is 1.59%, but the previous commit bought us a 1% speed-up):

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.28ms ± 1%  2.29ms ± 1%  +0.49%  (p=0.000 n=17+18)

(https://perf.golang.org/search?q=upload:20180223.1)

(This was measured on linux/amd64 by modifying its arena configuration
as above.)

Fixes #23900.

Change-Id: I6b7fa5ecebee2947bf20cfeb78c248809469c6b1
Reviewed-on: https://go-review.googlesource.com/96780
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-02-23 21:59:51 +00:00
Austin Clements
ec25210564 runtime: support a two-level arena map
Currently, the heap arena map is a single, large array that covers
every possible arena frame in the entire address space. This is
practical up to about 48 bits of address space with 64 MB arenas.

However, there are two problems with this:

1. mips64, ppc64, and s390x support full 64-bit address spaces (though
   on Linux only s390x has kernel support for 64-bit address spaces).
   On these platforms, it would be good to support these larger
   address spaces.

2. On Windows, processes are charged for untouched memory, so for
   processes with small heaps, the mostly-untouched 32 MB arena map
   plus a 64 MB arena are significant overhead. Hence, it would be
   good to reduce both the arena map size and the arena size, but with
   a single-level arena, these are inversely proportional.

This CL adds support for a two-level arena map. Arena frame numbers
are now divided into arenaL1Bits of L1 index and arenaL2Bits of L2
index.

At the moment, arenaL1Bits is always 0, so we effectively have a
single level map. We do a few things so that this has no cost beyond
the current single-level map:

1. We embed the L2 array directly in mheap, so if there's a single
   entry in the L2 array, the representation is identical to the
   current representation and there's no extra level of indirection.

2. Hot code that accesses the arena map is structured so that it
   optimizes to nearly the same machine code as it does currently.

3. We make some small tweaks to hot code paths and to the inliner
   itself to keep some important functions inlined despite their
   now-larger ASTs. In particular, this is necessary for
   heapBitsForAddr and heapBits.next.

Possibly as a result of some of the tweaks, this actually slightly
improves the performance of the x/benchmarks garbage benchmark:

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.28ms ± 1%  2.26ms ± 1%  -1.07%  (p=0.000 n=17+19)

(https://perf.golang.org/search?q=upload:20180223.2)

For #23900.

Change-Id: If5164e0961754f97eb9eca58f837f36d759505ff
Reviewed-on: https://go-review.googlesource.com/96779
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-02-23 21:59:50 +00:00
Austin Clements
33b76920ec runtime: rename "arena index" to "arena map"
There are too many places where I want to talk about "indexing into
the arena index". Make this less awkward and ambiguous by calling it
the "arena map" instead.

Change-Id: I726b0667bb2139dbc006175a0ec09a871cdf73f9
Reviewed-on: https://go-review.googlesource.com/96777
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-02-23 21:59:48 +00:00
Austin Clements
9680980efe runtime: don't assume arena is in address order
On amd64, the arena is no longer in address space order, but currently
the heap dumper assumes that it is. Fix this assumption.

Change-Id: Iab1953cd36b359d0fb78ed49e5eb813116a18855
Reviewed-on: https://go-review.googlesource.com/96776
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-02-23 21:59:47 +00:00
mingrammer
fceaa2e242 runtime: rename the TestGcHashmapIndirection to TestGcMapIndirection
There was still the word 'Hashmap' in gc_test.go, so I renamed it to just 'Map'

Previous renaming commit: https://golang.org/cl/90336

Change-Id: I5b0e5c2229d1c30937c7216247f4533effb81ce7
Reviewed-on: https://go-review.googlesource.com/96675
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-02-23 16:48:01 +00:00
Jerrin Shaji George
5b3cd56038 runtime: fix a few typos in comments
Change-Id: I07a1eb02ffc621c5696b49491181300bf411f822
Reviewed-on: https://go-review.googlesource.com/96475
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-02-23 00:17:20 +00:00
Austin Clements
ea8d7a370d runtime: clarify address space limit constants and comments
Now that we support the full non-contiguous virtual address space of
amd64 hardware, some of the comments and constants related to this are
out of date.

This renames memLimitBits to heapAddrBits because 1<<memLimitBits is
no longer the limit of the address space and rewrites the comment to
focus first on hardware limits (which span OSes) and then discuss
kernel limits.

Second, this eliminates the memLimit constant because there's no
longer a meaningful "highest possible heap pointer value" on amd64.

Updates #23862.

Change-Id: I44b32033d2deb6b69248fb8dda14fc0e65c47f11
Reviewed-on: https://go-review.googlesource.com/95498
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-02-21 20:32:36 +00:00
Austin Clements
ed1959c6e6 runtime: offset the heap arena index by 2^47 on amd64
On amd64, the virtual address space, when interpreted as signed
values, is [-2^47, 2^47). Currently, we only support heap addresses in
the "positive" half of this, [0, 2^47). This suffices for linux/amd64
and windows/amd64, but solaris/amd64 can map user addresses in the
negative part of this range. Specifically, addresses
0xFFFF8000'00000000 to 0xFFFFFD80'00000000 are part of user space.
This leads to "memory allocated by OS not in usable address space"
panic, since we don't map heap arena index space for these addresses.

Fix this by offsetting addresses when computing arena indexes so that
arena entry 0 corresponds to address -2^47 on amd64. We already map
enough arena space for 2^48 heap addresses on 64-bit (because arm64's
virtual address space is [0, 2^48)), so we don't need to grow any
structures to support this.

A different approach would be to simply mask out the top 16 bits.
However, there are two advantages to the offset approach: 1) invalid
heap addresses continue to naturally map to invalid arena indexes so
we don't need extra checks and 2) it perturbs the mapping of addresses
to arena indexes more, which helps check that we don't accidentally
compute incorrect arena indexes somewhere that happen to be right most
of the time.

Several comments and constant names are now somewhat misleading. We'll
fix that in the next CL. This CL is the core change the arena
indexing.

Fixes #23862.

Change-Id: Idb8e299fded04593a286b01a9582da6ddbac2f9a
Reviewed-on: https://go-review.googlesource.com/95497
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-02-21 20:32:35 +00:00
Austin Clements
e9db7b9dd1 runtime: abstract indexing of arena index
Accessing the arena index is about to get slightly more complicated.
Abstract this away into a set of functions for going back and forth
between addresses and arena slice indexes.

For #23862.

Change-Id: I0b20e74ef47a07b78ed0cf0a6128afe6f6e40f4b
Reviewed-on: https://go-review.googlesource.com/95496
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-02-21 20:32:34 +00:00
Austin Clements
3e214e5693 runtime: simplify bulkBarrierPreWrite
Currently, bulkBarrierPreWrite uses inheap to decide whether the
destination is in the heap or whether to check for stack or global
data. However, this isn't the best question to ask.

Instead, get the span directly and query its state. This lets us
directly determine whether this might be a global, or is stack memory,
or is heap memory.

At this point, inheap is no longer used in the hot path, so drop it
from the must-be-inlined list and substitute spanOf.

This will help in a circuitous way with #23862, since fixing that is
going to push inheap very slightly over the inline-able threshold on a
few platforms.

Change-Id: I5360fc1181183598502409f12979899e1e4d45f7
Reviewed-on: https://go-review.googlesource.com/95495
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-02-21 20:32:33 +00:00