1
0
mirror of https://github.com/golang/go synced 2024-11-19 16:44:43 -07:00
Commit Graph

61 Commits

Author SHA1 Message Date
Shenghou Ma
3583a44ed2 runtime: check that masks and shifts are correct aligned
We need a runtime check because the original issue is encountered
when running cross compiled windows program from linux. It's better
to give a meaningful crash message earlier than to segfault later.

The added test should not impose any measurable overhead to Go
programs.

For #12415.

Change-Id: Ib4a24ef560c09c0585b351d62eefd157b6b7f04c
Reviewed-on: https://go-review.googlesource.com/14207
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-11-25 00:46:57 +00:00
Michael Matloob
67faca7d9c runtime: break atomics out into package runtime/internal/atomic
This change breaks out most of the atomics functions in the runtime
into package runtime/internal/atomic. It adds some basic support
in the toolchain for runtime packages, and also modifies linux/arm
atomics to remove the dependency on the runtime's mutex. The mutexes
have been replaced with spinlocks.

all trybots are happy!
In addition to the trybots, I've tested on the darwin/arm64 builder,
on the darwin/arm builder, and on a ppc64le machine.

Change-Id: I6698c8e3cf3834f55ce5824059f44d00dc8e3c2f
Reviewed-on: https://go-review.googlesource.com/14204
Run-TryBot: Michael Matloob <matloob@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-10 17:38:04 +00:00
Michael Hudson-Doyle
1b4d28f8cf cmd/link, runtime: arm implementation of addmoduledata
Change-Id: I3975e10c2445e23c2798a7203a877ff2de3427c7
Reviewed-on: https://go-review.googlesource.com/14189
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-11-08 21:46:17 +00:00
David Crawshaw
21f35b33c2 runtime: use a 64kb system stack on arm
I went looking for an arm system whose stacks are by default smaller
than 64KB. In fact the smallest common linux target I could find was
Android, which like iOS uses 1MB stacks.

Fixes #11873

Change-Id: Ieeb66ad095b3da18d47ba21360ea75152a4107c6
Reviewed-on: https://go-review.googlesource.com/14602
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Minux Ma <minux@golang.org>
2015-10-26 15:10:34 +00:00
Michael Hudson-Doyle
31322996fd runtime: add stub sigreturn on arm
When building a shared library, all functions that are declared must actually
be defined.

Change-Id: I1488690cecfb66e62d9fdb3b8d257a4dc31d202a
Reviewed-on: https://go-review.googlesource.com/14187
Reviewed-by: Dave Cheney <dave@cheney.net>
2015-09-07 07:49:09 +00:00
Dave Cheney
1135b9d671 runtime: check pointer equality in arm cmpbody
Updates #11336

Follow the lead of amd64 do a pointer equality check
before comparing string/byte contents on arm.

BenchmarkCompareBytesEqual-4               208             211             +1.44%
BenchmarkCompareBytesToNil-4               83.6            81.8            -2.15%
BenchmarkCompareBytesEmpty-4               80.2            75.2            -6.23%
BenchmarkCompareBytesIdentical-4           208             75.2            -63.85%
BenchmarkCompareBytesSameLength-4          126             128             +1.59%
BenchmarkCompareBytesDifferentLength-4     128             130             +1.56%
BenchmarkCompareBytesBigUnaligned-4        14192804        14060971        -0.93%
BenchmarkCompareBytesBig-4                 12277313        12128193        -1.21%
BenchmarkCompareBytesBigIdentical-4        9385046         78.5            -100.00%

Change-Id: I5b24620018688c5fe04b6ff6743a24c4ce225788
Reviewed-on: https://go-review.googlesource.com/13881
Reviewed-by: Keith Randall <khr@golang.org>
2015-08-24 21:18:33 +00:00
Russ Cox
4a19081358 runtime: run on GOARM=5 and GOARM=6 uniprocessor freebsd/arm systems
Also, crash early on non-Linux SMP ARM systems when GOARM < 7;
without the proper synchronization, SMP cannot work.

Linux is okay because we call kernel-provided routines for
synchronization and barriers, and the kernel takes care of
providing the right routines for the current system.
On non-Linux systems we are left to fend for ourselves.

It is possible to use different synchronization on GOARM=6,
but it's too late to do that in the Go 1.5 cycle.
We don't believe there are any non-Linux SMP GOARM=6 systems anyway.

Fixes #12067.

Change-Id: I771a556e47893ed540ec2cd33d23c06720157ea3
Reviewed-on: https://go-review.googlesource.com/13363
Reviewed-by: Austin Clements <austin@google.com>
2015-08-07 17:39:07 +00:00
Russ Cox
108ec5f75a runtime: fix systemstack tracebacks on nacl/arm
For #11956.

Change-Id: Ic9b57cafa197953cc7f435941e44d42b60b3ddf0
Reviewed-on: https://go-review.googlesource.com/13011
Reviewed-by: Dave Cheney <dave@cheney.net>
2015-07-31 04:35:38 +00:00
Russ Cox
4bd8040d47 runtime, sync/atomic: add memory barriers in arm cas routines
This only triggers on ARMv7+.
If there are important SMP ARMv6 machines we can reconsider.

Makes TestLFStress tests pass and sync/atomic tests not time out
on Apple iPad Mini 3.

Fixes #7977.
Fixes #10189.

Change-Id: Ie424dea3765176a377d39746be9aa8265d11bec4
Reviewed-on: https://go-review.googlesource.com/12950
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2015-07-30 20:11:11 +00:00
Russ Cox
c9d2c7f0d2 runtime: replace divide with multiply in runtime.usleep on arm
We want to adjust the DIV calling convention to use m,
and usleep can be called without an m, so switch to a
multiplication by the reciprocal (and test).

Step toward a fix for #6699 and #10486.

Change-Id: Iccf76a18432d835e48ec64a2fa34a0e4d6d4b955
Reviewed-on: https://go-review.googlesource.com/12898
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-07-30 15:48:29 +00:00
Austin Clements
f5d494bbdf runtime: ensure GC sees type-safe memory on weak machines
Currently its possible for the garbage collector to observe
uninitialized memory or stale heap bitmap bits on weakly ordered
architectures such as ARM and PPC. On such architectures, the stores
that zero newly allocated memory and initialize its heap bitmap may
move after a store in user code that makes the allocated object
observable by the garbage collector.

To fix this, add a "publication barrier" (also known as an "export
barrier") before returning from mallocgc. This is a store/store
barrier that ensures any write done by user code that makes the
returned object observable to the garbage collector will be ordered
after the initialization performed by mallocgc. No barrier is
necessary on the reading side because of the data dependency between
loading the pointer and loading the contents of the object.

Fixes one of the issues raised in #9984.

Change-Id: Ia3d96ad9c5fc7f4d342f5e05ec0ceae700cd17c8
Reviewed-on: https://go-review.googlesource.com/11083
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Martin Capitanio <capnm9@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-06-19 15:29:50 +00:00
Alex Brainman
9d968cb47b runtime: rename cgocall_errno and asmcgocall_errno into cgocall and asmcgocall
Change-Id: I5917bea8bb35b0e725dcc56a68f3a70137cfc180
Reviewed-on: https://go-review.googlesource.com/9387
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-06-19 01:47:11 +00:00
Alex Brainman
2858b73843 runtime: remove cgocall and asmcgocall
In preparation for rename of cgocall_errno into cgocall and
asmcgocall_errno into asmcgocall in the fllowinng CL.
rsc requested CL 9387 to be split into two parts. This is first part.

Change-Id: I7434f0e4b44dd37017540695834bfcb1eebf0b2f
Reviewed-on: https://go-review.googlesource.com/11166
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-06-18 04:42:53 +00:00
Josh Bleecher Snyder
5353cde080 runtime, cmd/internal/obj/arm: improve arm function prologue
When stack growth is not needed, as it usually is not,
execute only a single conditional branch
rather than three conditional instructions.
This adds 4 bytes to every function,
but might speed up execution in the common case.

Sample disassembly for

func f() {
	_ = [128]byte{}
}

Before:

TEXT main.f(SB) x.go
	x.go:3	0x2000	e59a1008	MOVW 0x8(R10), R1
	x.go:3	0x2004	e59fb028	MOVW 0x28(R15), R11
	x.go:3	0x2008	e08d200b	ADD R11, R13, R2
	x.go:3	0x200c	e1520001	CMP R1, R2
	x.go:3	0x2010	91a0300e	MOVW.LS R14, R3
	x.go:3	0x2014	9b0118a9	BL.LS runtime.morestack_noctxt(SB)
	x.go:3	0x2018	9afffff8	B.LS main.f(SB)
	x.go:3	0x201c	e52de084	MOVW.W R14, -0x84(R13)
	x.go:4	0x2020	e28d1004	ADD $4, R13, R1
	x.go:4	0x2024	e3a00000	MOVW $0, R0
	x.go:4	0x2028	eb012255	BL 0x4a984
	x.go:5	0x202c	e49df084	RET #132
	x.go:5	0x2030	eafffffe	B 0x2030
	x.go:5	0x2034	ffffff7c	?

After:

TEXT main.f(SB) x.go
	x.go:3	0x2000	e59a1008	MOVW 0x8(R10), R1
	x.go:3	0x2004	e59fb02c	MOVW 0x2c(R15), R11
	x.go:3	0x2008	e08d200b	ADD R11, R13, R2
	x.go:3	0x200c	e1520001	CMP R1, R2
	x.go:3	0x2010	9a000004	B.LS 0x2028
	x.go:3	0x2014	e52de084	MOVW.W R14, -0x84(R13)
	x.go:4	0x2018	e28d1004	ADD $4, R13, R1
	x.go:4	0x201c	e3a00000	MOVW $0, R0
	x.go:4	0x2020	eb0124dc	BL 0x4b398
	x.go:5	0x2024	e49df084	RET #132
	x.go:5	0x2028	e1a0300e	MOVW R14, R3
	x.go:5	0x202c	eb011b0d	BL runtime.morestack_noctxt(SB)
	x.go:5	0x2030	eafffff2	B main.f(SB)
	x.go:5	0x2034	eafffffe	B 0x2034
	x.go:5	0x2038	ffffff7c	?

Updates #10587.

package sort benchmarks on an iPhone 6:

name            old time/op  new time/op  delta
SortString1K     569µs ± 0%   565µs ± 1%  -0.75%  (p=0.000 n=23+24)
StableString1K   872µs ± 1%   870µs ± 1%  -0.16%  (p=0.009 n=23+24)
SortInt1K        317µs ± 2%   316µs ± 2%    ~     (p=0.410 n=26+26)
StableInt1K      343µs ± 1%   339µs ± 1%  -1.07%  (p=0.000 n=22+23)
SortInt64K      30.0ms ± 1%  30.0ms ± 1%    ~     (p=0.091 n=25+24)
StableInt64K    30.2ms ± 0%  30.0ms ± 0%  -0.69%  (p=0.000 n=22+22)
Sort1e2          147µs ± 1%   146µs ± 0%  -0.48%  (p=0.000 n=25+24)
Stable1e2        290µs ± 1%   286µs ± 1%  -1.30%  (p=0.000 n=23+24)
Sort1e4         29.5ms ± 2%  29.7ms ± 1%  +0.71%  (p=0.000 n=23+23)
Stable1e4       88.7ms ± 4%  88.6ms ± 8%  -0.07%  (p=0.022 n=26+26)
Sort1e6          4.81s ± 7%   4.83s ± 7%    ~     (p=0.192 n=26+26)
Stable1e6        18.3s ± 1%   18.1s ± 1%  -0.76%  (p=0.000 n=25+23)
SearchWrappers   318ns ± 1%   344ns ± 1%  +8.14%  (p=0.000 n=23+26)

package sort benchmarks on a first generation rpi:

name            old time/op  new time/op  delta
SearchWrappers  4.13µs ± 0%  3.95µs ± 0%   -4.42%  (p=0.000 n=15+13)
SortString1K    5.81ms ± 1%  5.82ms ± 2%     ~     (p=0.400 n=14+15)
StableString1K  9.69ms ± 1%  9.73ms ± 0%     ~     (p=0.121 n=15+11)
SortInt1K       3.30ms ± 2%  3.66ms ±19%  +10.82%  (p=0.000 n=15+14)
StableInt1K     5.97ms ±15%  4.17ms ± 8%  -30.05%  (p=0.000 n=15+15)
SortInt64K       319ms ± 1%   295ms ± 1%   -7.65%  (p=0.000 n=15+15)
StableInt64K     343ms ± 0%   332ms ± 0%   -3.26%  (p=0.000 n=12+13)
Sort1e2         3.36ms ± 2%  3.22ms ± 4%   -4.10%  (p=0.000 n=15+15)
Stable1e2       6.74ms ± 1%  6.43ms ± 2%   -4.67%  (p=0.000 n=15+15)
Sort1e4          247ms ± 1%   247ms ± 1%     ~     (p=0.331 n=15+14)
Stable1e4        864ms ± 0%   820ms ± 0%   -5.15%  (p=0.000 n=14+15)
Sort1e6          41.2s ± 0%   41.2s ± 0%   +0.15%  (p=0.000 n=13+14)
Stable1e6         192s ± 0%    182s ± 0%   -5.07%  (p=0.000 n=14+14)

Change-Id: I8a9db77e1d4ea1956575895893bc9d04bd81204b
Reviewed-on: https://go-review.googlesource.com/10497
Reviewed-by: Russ Cox <rsc@golang.org>
2015-06-04 16:35:12 +00:00
Austin Clements
faa7a7e8ae runtime: implement GC stack barriers
This commit implements stack barriers to minimize the amount of
stack re-scanning that must be done during mark termination.

Currently the GC scans stacks of active goroutines twice during every
GC cycle: once at the beginning during root discovery and once at the
end during mark termination. The second scan happens while the world
is stopped and guarantees that we've seen all of the roots (since
there are no write barriers on writes to local stack
variables). However, this means pause time is proportional to stack
size. In particularly recursive programs, this can drive pause time up
past our 10ms goal (e.g., it takes about 150ms to scan a 50MB heap).

Re-scanning the entire stack is rarely necessary, especially for large
stacks, because usually most of the frames on the stack were not
active between the first and second scans and hence any changes to
these frames (via non-escaping pointers passed down the stack) were
tracked by write barriers.

To efficiently track how far a stack has been unwound since the first
scan (and, hence, how much needs to be re-scanned), this commit
introduces stack barriers. During the first scan, at exponentially
spaced points in each stack, the scan overwrites return PCs with the
PC of the stack barrier function. When "returned" to, the stack
barrier function records how far the stack has unwound and jumps to
the original return PC for that point in the stack. Then the second
scan only needs to proceed as far as the lowest barrier that hasn't
been hit.

For deeply recursive programs, this substantially reduces mark
termination time (and hence pause time). For the goscheme example
linked in issue #10898, prior to this change, mark termination times
were typically between 100 and 500ms; with this change, mark
termination times are typically between 10 and 20ms. As a result of
the reduced stack scanning work, this reduces overall execution time
of the goscheme example by 20%.

Fixes #10898.

The effect of this on programs that are not deeply recursive is
minimal:

name                   old time/op    new time/op    delta
BinaryTree17              3.16s ± 2%     3.26s ± 1%  +3.31%  (p=0.000 n=19+19)
Fannkuch11                2.42s ± 1%     2.48s ± 1%  +2.24%  (p=0.000 n=17+19)
FmtFprintfEmpty          50.0ns ± 3%    49.8ns ± 1%    ~     (p=0.534 n=20+19)
FmtFprintfString          173ns ± 0%     175ns ± 0%  +1.49%  (p=0.000 n=16+19)
FmtFprintfInt             170ns ± 1%     175ns ± 1%  +2.97%  (p=0.000 n=20+19)
FmtFprintfIntInt          288ns ± 0%     295ns ± 0%  +2.73%  (p=0.000 n=16+19)
FmtFprintfPrefixedInt     242ns ± 1%     252ns ± 1%  +4.13%  (p=0.000 n=18+18)
FmtFprintfFloat           324ns ± 0%     323ns ± 0%  -0.36%  (p=0.000 n=20+19)
FmtManyArgs              1.14µs ± 0%    1.12µs ± 1%  -1.01%  (p=0.000 n=18+19)
GobDecode                8.88ms ± 1%    8.87ms ± 0%    ~     (p=0.480 n=19+18)
GobEncode                6.80ms ± 1%    6.85ms ± 0%  +0.82%  (p=0.000 n=20+18)
Gzip                      363ms ± 1%     363ms ± 1%    ~     (p=0.077 n=18+20)
Gunzip                   90.6ms ± 0%    90.0ms ± 1%  -0.71%  (p=0.000 n=17+18)
HTTPClientServer         51.5µs ± 1%    50.8µs ± 1%  -1.32%  (p=0.000 n=18+18)
JSONEncode               17.0ms ± 0%    17.1ms ± 0%  +0.40%  (p=0.000 n=18+17)
JSONDecode               61.8ms ± 0%    63.8ms ± 1%  +3.11%  (p=0.000 n=18+17)
Mandelbrot200            3.84ms ± 0%    3.84ms ± 1%    ~     (p=0.583 n=19+19)
GoParse                  3.71ms ± 1%    3.72ms ± 1%    ~     (p=0.159 n=18+19)
RegexpMatchEasy0_32       100ns ± 0%     100ns ± 1%  -0.19%  (p=0.033 n=17+19)
RegexpMatchEasy0_1K       342ns ± 1%     331ns ± 0%  -3.41%  (p=0.000 n=19+19)
RegexpMatchEasy1_32      82.5ns ± 0%    81.7ns ± 0%  -0.98%  (p=0.000 n=18+18)
RegexpMatchEasy1_1K       505ns ± 0%     494ns ± 1%  -2.16%  (p=0.000 n=18+18)
RegexpMatchMedium_32      137ns ± 1%     137ns ± 1%  -0.24%  (p=0.048 n=20+18)
RegexpMatchMedium_1K     41.6µs ± 0%    41.3µs ± 1%  -0.57%  (p=0.004 n=18+20)
RegexpMatchHard_32       2.11µs ± 0%    2.11µs ± 1%  +0.20%  (p=0.037 n=17+19)
RegexpMatchHard_1K       63.9µs ± 2%    63.3µs ± 0%  -0.99%  (p=0.000 n=20+17)
Revcomp                   560ms ± 1%     522ms ± 0%  -6.87%  (p=0.000 n=18+16)
Template                 75.0ms ± 0%    75.1ms ± 1%  +0.18%  (p=0.013 n=18+19)
TimeParse                 358ns ± 1%     364ns ± 0%  +1.74%  (p=0.000 n=20+15)
TimeFormat                360ns ± 0%     372ns ± 0%  +3.55%  (p=0.000 n=20+18)

Change-Id: If8a9bfae6c128d15a4f405e02bcfa50129df82a2
Reviewed-on: https://go-review.googlesource.com/10314
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-06-02 20:00:57 +00:00
Keith Randall
c526f3ac10 runtime: tail call into memeq/cmp body implementations
There's no need to call/ret to the body implementation.
It can write the result to the right place.  Just jump to
it and have it return to our caller.

Old:
  call body implementation
  compute result
  put result in a register
  return
  write register to result location
  return

New:
  load address of result location into a register
  jump to body implementation
  compute result
  write result to passed-in address
  return

It's a bit tricky on 386 because there is no free register
with which to pass the result location.  Free up a register
by keeping around blen-alen instead of both alen and blen.

Change-Id: If2cf0682a5bf1cc592bdda7c126ed4eee8944fba
Reviewed-on: https://go-review.googlesource.com/9202
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2015-04-29 04:46:25 +00:00
Russ Cox
92c826b1b2 cmd/internal/gc: inline runtime.getg
This more closely restores what the old C runtime did.
(In C, g was an 'extern register' with the same effective
implementation as in this CL.)

On a late 2012 MacBookPro10,2, best of 5 old vs best of 5 new:

benchmark                          old ns/op      new ns/op      delta
BenchmarkBinaryTree17              4981312777     4463426605     -10.40%
BenchmarkFannkuch11                3046495712     3006819428     -1.30%
BenchmarkFmtFprintfEmpty           89.3           79.8           -10.64%
BenchmarkFmtFprintfString          284            262            -7.75%
BenchmarkFmtFprintfInt             282            262            -7.09%
BenchmarkFmtFprintfIntInt          480            448            -6.67%
BenchmarkFmtFprintfPrefixedInt     382            358            -6.28%
BenchmarkFmtFprintfFloat           529            486            -8.13%
BenchmarkFmtManyArgs               1849           1773           -4.11%
BenchmarkGobDecode                 12835963       11794385       -8.11%
BenchmarkGobEncode                 10527170       10288422       -2.27%
BenchmarkGzip                      436109569      438422516      +0.53%
BenchmarkGunzip                    110121663      109843648      -0.25%
BenchmarkHTTPClientServer          81930          85446          +4.29%
BenchmarkJSONEncode                24638574       24280603       -1.45%
BenchmarkJSONDecode                93022423       85753546       -7.81%
BenchmarkMandelbrot200             4703899        4735407        +0.67%
BenchmarkGoParse                   5319853        5086843        -4.38%
BenchmarkRegexpMatchEasy0_32       151            151            +0.00%
BenchmarkRegexpMatchEasy0_1K       452            453            +0.22%
BenchmarkRegexpMatchEasy1_32       131            132            +0.76%
BenchmarkRegexpMatchEasy1_1K       761            722            -5.12%
BenchmarkRegexpMatchMedium_32      228            224            -1.75%
BenchmarkRegexpMatchMedium_1K      63751          64296          +0.85%
BenchmarkRegexpMatchHard_32        3188           3238           +1.57%
BenchmarkRegexpMatchHard_1K        95396          96756          +1.43%
BenchmarkRevcomp                   661587262      687107364      +3.86%
BenchmarkTemplate                  108312598      104008540      -3.97%
BenchmarkTimeParse                 453            459            +1.32%
BenchmarkTimeFormat                475            441            -7.16%

The garbage benchmark from the benchmarks subrepo gets 2.6% faster as well.

Change-Id: I320aeda332db81012688b26ffab23f6581c59cfa
Reviewed-on: https://go-review.googlesource.com/8460
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Rick Hudson <rlh@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2015-04-07 14:26:47 +00:00
Josh Bleecher Snyder
ad3600945a runtime: auto-generate duff routines
This makes it easier to experiment with alternative implementations.

While we're here, update the comments.

No functional changes. Passes toolstash -cmp.

Change-Id: I428535754908f0fdd7cc36c214ddb6e1e60f376e
Reviewed-on: https://go-review.googlesource.com/8310
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-04-02 02:37:59 +00:00
Michael Hudson-Doyle
f78dc1dac1 runtime: rename ·main·f to ·mainPC to avoid duplicate symbol
runtime·main·f is normalized by the linker to runtime.main.f, as is
the compiler-generated symbol runtime.main·f.  Change the former to
runtime·mainPC instead.

Fixes issue #9934

Change-Id: I656a6fa6422d45385fa2cc55bd036c6affa1abfe
Reviewed-on: https://go-review.googlesource.com/8234
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-03-30 18:52:14 +00:00
Dave Cheney
e2543ef62c runtime: add runtime.cmpstring and bytes.Compare
Update #10007

Implement runtime.cmpstring and bytes.Compare in asm for arm.

benchmark                                old ns/op     new ns/op     delta
BenchmarkCompareBytesEqual               254           91.4          -64.02%
BenchmarkCompareBytesToNil               41.5          37.6          -9.40%
BenchmarkCompareBytesEmpty               40.7          37.6          -7.62%
BenchmarkCompareBytesIdentical           255           96.3          -62.24%
BenchmarkCompareBytesSameLength          125           60.9          -51.28%
BenchmarkCompareBytesDifferentLength     133           60.9          -54.21%
BenchmarkCompareBytesBigUnaligned        17985879      5669706       -68.48%
BenchmarkCompareBytesBig                 17097634      4926798       -71.18%
BenchmarkCompareBytesBigIdentical        16861941      4389206       -73.97%

benchmark                             old MB/s     new MB/s     speedup
BenchmarkCompareBytesBigUnaligned     58.30        184.95       3.17x
BenchmarkCompareBytesBig              61.33        212.83       3.47x
BenchmarkCompareBytesBigIdentical     62.19        238.90       3.84x

This is a collaboration between Josh Bleecher Snyder and myself.

Change-Id: Ib3944b8c410d0e12135c2ba9459bfe131df48edd
Reviewed-on: https://go-review.googlesource.com/8010
Reviewed-by: Keith Randall <khr@golang.org>
2015-03-25 22:46:39 +00:00
Shenghou Ma
3b00197017 runtime: add argument sizes for asm functions for bytes, strings
Also fixed a stack corruption bug for nacl/amd64p32.

Change-Id: I64b821b16999c296a159137d971af3870053c621
Signed-off-by: Shenghou Ma <minux@golang.org>
Reviewed-on: https://go-review.googlesource.com/7073
Reviewed-by: Dave Cheney <dave@cheney.net>
2015-03-07 06:02:40 +00:00
Dmitry Vyukov
894024f478 runtime: fix traceback from goexit1
We used to not call traceback from goexit1.
But now tracer does it and crashes on amd64p32:

runtime: unexpected return pc for runtime.getg called from 0x108a4240
goroutine 18 [runnable, locked to thread]:
runtime.traceGoEnd()
    src/runtime/trace.go:758 fp=0x10818fe0 sp=0x10818fdc
runtime.goexit1()
    src/runtime/proc1.go:1540 +0x20 fp=0x10818fe8 sp=0x10818fe0
runtime.getg(0x0)
    src/runtime/asm_386.s:2414 fp=0x10818fec sp=0x10818fe8
created by runtime/pprof_test.TestTraceStress
    src/runtime/pprof/trace_test.go:123 +0x500

Return PC from goexit1 points right after goexit (+0x6).
It happens to work most of the time somehow.

This change fixes traceback from goexit1 by adding an additional NOP to goexit.

Fixes #9931

Change-Id: Ied25240a181b0a2d7bc98127b3ed9068e9a1a13e
Reviewed-on: https://go-review.googlesource.com/5460
Reviewed-by: Russ Cox <rsc@golang.org>
2015-02-28 23:19:57 +00:00
Matthew Dempsky
7abdc90fe3 runtime: remove gogetcallerpc and gogetcallersp functions
Package runtime's Go code was converted to directly call getcallerpc
and getcallersp in https://golang.org/cl/138740043, but the assembly
implementations were not removed.

Change-Id: Ib2eaee674d594cbbe799925aae648af782a01c83
Reviewed-on: https://go-review.googlesource.com/5901
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2015-02-25 09:34:58 +00:00
Rob Pike
c21f1d5ef3 [dev.cc] runtime,syscall: quiet some more vet errors
Fix many incorrect FP references and a few other details.

Some errors remain, especially in vlop, but fixing them requires semantics. For another day.

Change-Id: Ib769fb519b465e79fc08d004a51acc5644e8b259
Reviewed-on: https://go-review.googlesource.com/5288
Reviewed-by: Russ Cox <rsc@golang.org>
2015-02-20 00:20:54 +00:00
Rob Pike
345350bf07 [dev.cc] cmd/asm: make 4(SP) illegal except on 386
Require a name to be specified when referencing the pseudo-stack.
If you want a real stack offset, use the hardware stack pointer (e.g.,
R13 on arm), not SP.

Fix affected assembly files.

Change-Id: If3545f187a43cdda4acc892000038ec25901132a
Reviewed-on: https://go-review.googlesource.com/5120
Run-TryBot: Rob Pike <r@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
2015-02-18 03:41:29 +00:00
Rob Pike
69ddb7a408 [dev.cc] all: edit assembly source for ARM to be more regular
Several .s files for ARM had several properties the new assembler will not support.
These include:

- mentioning SP or PC as a hardware register
	These are always pseudo-registers except that in some contexts
	they're not, and it's confusing because the context should not affect
	which register you mean. Change the references to the hardware
	registers to be explicit: R13 for SP, R15 for PC.
- constant creation using assignment
	The files say a=b when they could instead say #define a b.
	There is no reason to have both mechanisms.
- R(0) to refer to R0.
	Some macros use this to a great extent. Again, it's easy just to
	use a #define to rename a register.

Change-Id: I002335ace8e876c5b63c71c2560533eb835346d2
Reviewed-on: https://go-review.googlesource.com/4822
Reviewed-by: Dave Cheney <dave@cheney.net>
2015-02-13 23:08:51 +00:00
Shenghou Ma
0661143447 liblink, runtime: move all references to runtime.tlsg to tls_arm.s
CL 2118 makes the assumption that all references to runtime.tlsg
should be accompanied by a declaration of runtime.tlsg if its type
should be a normal variable, instead of a placeholder for TLS
relocation.

Because if runtime.tlsg is not declared by the runtime package,
the type of runtime.tlsg will be zero, so fix the check in liblink
to look for 0 instead of STLSBSS (the type will be initialized by
cmd/ld, but cmd/ld doesn't run during assembly).

Change-Id: I691ac5c3faea902f8b9a0b963e781b22e7b269a7
Reviewed-on: https://go-review.googlesource.com/4030
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2015-02-09 22:14:06 +00:00
Josh Bleecher Snyder
135ef49fde runtime: speed up eqstring
eqstring does not need to check the length of the strings.

6g

benchmark                              old ns/op     new ns/op     delta
BenchmarkCompareStringEqual            7.03          6.14          -12.66%
BenchmarkCompareStringIdentical        3.36          3.04          -9.52%

5g

benchmark                                 old ns/op     new ns/op     delta
BenchmarkCompareStringEqual               238           232           -2.52%
BenchmarkCompareStringIdentical           90.8          80.7          -11.12%

The equivalent PPC changes are in a separate commit
because I don't have the hardware to test them.

Change-Id: I292874324b9bbd9d24f57a390cfff8b550cdd53c
Reviewed-on: https://go-review.googlesource.com/3955
Reviewed-by: Keith Randall <khr@golang.org>
2015-02-06 18:51:00 +00:00
Dave Cheney
f15c675fb4 runtime: use runtime.sysargs to parse auxv on linux/arm
Make auxv parsing in linux/arm less of a special case.

* rename setup_auxv to sysargs
* exclude linux/arm from vdso_none.go
* move runtime.checkarm after runtime.sysargs so arm specific
  values are properly initialised

Change-Id: I1ca7f5844ad5a162337ff061a83933fc9a2b5ff6
Reviewed-on: https://go-review.googlesource.com/2681
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-01-14 05:08:06 +00:00
Keith Randall
d5e4c4061b runtime: remove size argument from hash and equal algorithms
The equal algorithm used to take the size
   equal(p, q *T, size uintptr) bool
With this change, it does not
   equal(p, q *T) bool
Similarly for the hash algorithm.

The size is rarely used, as most equal functions know the size
of the thing they are comparing.  For instance f32equal already
knows its inputs are 4 bytes in size.

For cases where the size is not known, we allocate a closure
(one for each size needed) that points to an assembly stub that
reads the size out of the closure and calls generic code that
has a size argument.

Reduces the size of the go binary by 0.07%.  Performance impact
is not measurable.

Change-Id: I6e00adf3dde7ad2974adbcff0ee91e86d2194fec
Reviewed-on: https://go-review.googlesource.com/2392
Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-07 21:57:01 +00:00
Shenghou Ma
a6a30fefd9 runtime: fix build for ARM
Change-Id: Ia18b8411bebc47ea71ac1acd9ff9dc570ec15dea
Reviewed-on: https://go-review.googlesource.com/2341
Reviewed-by: Dave Cheney <dave@cheney.net>
2015-01-06 01:29:42 +00:00
Russ Cox
df027aceb9 reflect: add write barriers
Use typedmemmove, typedslicecopy, and adjust reflect.call
to execute the necessary write barriers.

Found with GODEBUG=wbshadow=2 mode.
Eventually that will run automatically, but right now
it still detects other missing write barriers.

Change-Id: Iec5b5b0c1be5589295e28e5228e37f1a92e07742
Reviewed-on: https://go-review.googlesource.com/2312
Reviewed-by: Keith Randall <khr@golang.org>
2015-01-06 00:28:31 +00:00
Russ Cox
e6d3511264 Revert "liblink, cmd/ld, runtime: remove stackguard1"
This reverts commit ab0535ae3f.

I think it will remain useful to distinguish code that must
run on a system stack from code that can run on either stack,
even if that distinction is no
longer based on the implementation language.

That is, I expect to add a //go:systemstack comment that,
in terms of the old implementation, tells the compiler,
to pretend this function was written in C.

Change-Id: I33d2ebb2f99ae12496484c6ec8ed07233d693275
Reviewed-on: https://go-review.googlesource.com/2275
Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-05 16:29:56 +00:00
Shenghou Ma
ab0535ae3f liblink, cmd/ld, runtime: remove stackguard1
Now that we've removed all the C code in runtime and the C compilers,
there is no need to have a separate stackguard field to check for C
code on Go stack.

Remove field g.stackguard1 and rename g.stackguard0 to g.stackguard.
Adjust liblink and cmd/ld as necessary.

Change-Id: I54e75db5a93d783e86af5ff1a6cd497d669d8d33
Reviewed-on: https://go-review.googlesource.com/2144
Reviewed-by: Keith Randall <khr@golang.org>
2014-12-29 07:36:07 +00:00
Russ Cox
7a524a1036 runtime: remove thunk.s
Replace with uses of //go:linkname in Go files, direct use of name in .s files.
The only one that really truly needs a jump is reflect.call; the jump is now
next to the runtime.reflectcall assembly implementations.

Change-Id: Ie7ff3020a8f60a8e4c8645fe236e7883a3f23f46
Reviewed-on: https://go-review.googlesource.com/1962
Reviewed-by: Austin Clements <austin@google.com>
2014-12-23 03:17:22 +00:00
Russ Cox
8c3f64022a [dev.garbage] runtime: add prefetcht0, prefetcht1, prefetcht2, prefetcht3, prefetchnta for GC
We don't know what we need yet, so add them all.
Add them even on x86 architectures (as no-ops) so that
the GC can refer to them unconditionally.

Eventually we'll know what we want and probably
have just one 'prefetch' with an appropriate meaning
on each architecture.

LGTM=rlh
R=rlh
CC=golang-codereviews
https://golang.org/cl/179160043
2014-11-21 15:57:10 -05:00
Russ Cox
3e804631d9 [dev.cc] all: merge dev.power64 (7667e41f3ced) into dev.cc
This is to reduce the delta between dev.cc and dev.garbage to just garbage collector changes.

These are the files that had merge conflicts and have been edited by hand:
        malloc.go
        mem_linux.go
        mgc.go
        os1_linux.go
        proc1.go
        panic1.go
        runtime1.go

LGTM=austin
R=austin
CC=golang-codereviews
https://golang.org/cl/174180043
2014-11-14 12:10:52 -05:00
Russ Cox
656be317d0 [dev.cc] runtime: delete scalararg, ptrarg; rename onM to systemstack
Scalararg and ptrarg are not "signal safe".
Go code filling them out can be interrupted by a signal,
and then the signal handler runs, and if it also ends up
in Go code that uses scalararg or ptrarg, now the old
values have been smashed.
For the pieces of code that do need to run in a signal handler,
we introduced onM_signalok, which is really just onM
except that the _signalok is meant to convey that the caller
asserts that scalarg and ptrarg will be restored to their old
values after the call (instead of the usual behavior, zeroing them).

Scalararg and ptrarg are also untyped and therefore error-prone.

Go code can always pass a closure instead of using scalararg
and ptrarg; they were only really necessary for C code.
And there's no more C code.

For all these reasons, delete scalararg and ptrarg, converting
the few remaining references to use closures.

Once those are gone, there is no need for a distinction between
onM and onM_signalok, so replace both with a single function
equivalent to the current onM_signalok (that is, it can be called
on any of the curg, g0, and gsignal stacks).

The name onM and the phrase 'm stack' are misnomers,
because on most system an M has two system stacks:
the main thread stack and the signal handling stack.

Correct the misnomer by naming the replacement function systemstack.

Fix a few references to "M stack" in code.

The main motivation for this change is to eliminate scalararg/ptrarg.
Rick and I have already seen them cause problems because
the calling sequence m.ptrarg[0] = p is a heap pointer assignment,
so it gets a write barrier. The write barrier also uses onM, so it has
all the same problems as if it were being invoked by a signal handler.
We worked around this by saving and restoring the old values
and by calling onM_signalok, but there's no point in keeping this nice
home for bugs around any longer.

This CL also changes funcline to return the file name as a result
instead of filling in a passed-in *string. (The *string signature is
left over from when the code was written in and called from C.)
That's arguably an unrelated change, except that once I had done
the ptrarg/scalararg/onM cleanup I started getting false positives
about the *string argument escaping (not allowed in package runtime).
The compiler is wrong, but the easiest fix is to write the code like
Go code instead of like C code. I am a bit worried that the compiler
is wrong because of some use of uninitialized memory in the escape
analysis. If that's the reason, it will go away when we convert the
compiler to Go. (And if not, we'll debug it the next time.)

LGTM=khr
R=r, khr
CC=austin, golang-codereviews, iant, rlh
https://golang.org/cl/174950043
2014-11-12 14:54:31 -05:00
Russ Cox
c81d248eca [dev.cc] runtime: convert softfloat_arm.c to Go + build fixes
Also include onM_signalok fix from issue 8995.

Fixes linux/arm build.
Fixes #8995.

LGTM=r
R=r, dave
CC=golang-codereviews
https://golang.org/cl/168580043
2014-11-11 22:30:02 -05:00
Russ Cox
15ced2d008 [dev.cc] runtime: convert assembly files for C to Go transition
The main change is that #include "zasm_GOOS_GOARCH.h"
is now #include "go_asm.h" and/or #include "go_tls.h".

Also, because C StackGuard is now Go _StackGuard,
the assembly name changes from const_StackGuard to
const__StackGuard.

In asm_$GOARCH.s, add new function getg, formerly
implemented in C.

The renamed atomics now have Go wrappers, to get
escape analysis annotations right. Those wrappers
are in CL 174860043.

LGTM=r, aram
R=r, aram
CC=austin, dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/168510043
2014-11-11 17:06:22 -05:00
Austin Clements
31b1207fde [dev.power64] all: merge default into dev.power64
Trivial merge except for src/runtime/asm_power64x.s and
src/runtime/signal_power64x.c

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/168950044
2014-11-03 10:53:11 -05:00
Russ Cox
a5a0733144 runtime: change top-most return PC from goexit to goexit+PCQuantum
If you get a stack of PCs from Callers, it would be expected
that every PC is immediately after a call instruction, so to find
the line of the call, you look up the line for PC-1.
CL 163550043 now explicitly documents that.

The most common exception to this is the top-most return PC
on the stack, which is the entry address of the runtime.goexit
function. Subtracting 1 from that PC will end up in a different
function entirely.

To remove this special case, make the top-most return PC
goexit+PCQuantum and then implement goexit in assembly
so that the first instruction can be skipped.

Fixes #7690.

LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/170720043
2014-10-29 20:37:44 -04:00
Russ Cox
599199fd9f [dev.power64] all: merge default (dd5014ed9b01) into dev.power64
Still passes on amd64.

LGTM=austin
R=austin
CC=golang-codereviews
https://golang.org/cl/165110043
2014-10-29 11:45:01 -04:00
Russ Cox
c4efaac15d runtime: fix unrecovered panic on external thread
Fixes #8588.

LGTM=austin
R=austin
CC=golang-codereviews, khr
https://golang.org/cl/159700044
2014-10-28 21:53:09 -04:00
Russ Cox
b55791e200 [dev.power64] cmd/5a, cmd/6a, cmd/8a, cmd/9a: make labels function-scoped
I removed support for jumping between functions years ago,
as part of doing the instruction layout for each function separately.

Given that, it makes sense to treat labels as function-scoped.
This lets each function have its own 'loop' label, for example.

Makes the assembly much cleaner and removes the last
reason anyone would reach for the 123(PC) form instead.

Note that this is on the dev.power64 branch, but it changes all
the assemblers. The change will ship in Go 1.5 (perhaps after
being ported into the new assembler).

Came up as part of CL 167730043.

LGTM=r
R=r
CC=austin, dave, golang-codereviews, minux
https://golang.org/cl/159670043
2014-10-28 21:50:16 -04:00
Keith Randall
b60d5e12e9 runtime: warn that cputicks() might not be monotonic.
Get rid of gocputicks(), it is no longer used.

LGTM=bradfitz, dave
R=golang-codereviews, bradfitz, dave, minux
CC=golang-codereviews
https://golang.org/cl/161110044
2014-10-21 14:46:07 -07:00
Russ Cox
cb6f5ac0b0 runtime: remove hand-generated ptr bitmaps for reflectcall
A Go prototype can be used instead now, and the compiler
will do a better job than we will doing it by hand.
(We got it wrong in amd64p32, causing the current build
breakage.)

The auto-prototype-matching only applies to functions
without an explicit package path, so the TEXT lines for
reflectcall and callXX are s/runtime·/·/.

LGTM=khr
R=khr
CC=golang-codereviews, iant, r
https://golang.org/cl/153600043
2014-10-15 13:12:16 -04:00
Russ Cox
2b1659b57d runtime: change Windows M.thread from void* to uintptr
It appears to be an opaque bit pattern more than a pointer.
The Go garbage collector has discovered that for m0
it is set to 0x4c.

Should fix Windows build.

TBR=brainman
CC=golang-codereviews
https://golang.org/cl/149640043
2014-10-07 23:27:25 -04:00
Keith Randall
e1364a6d0e runtime: fix cgo_topofstack to save clobbered registers
Fixes #8816

At least, I hope it does.

TBR=rsc
CC=golang-codereviews
https://golang.org/cl/153730043
2014-09-28 23:52:08 -07:00
Keith Randall
1aa65fe8d4 runtime: add load_g call in arm callback.
Need to restore the g register.  Somehow this line vaporized from
CL 144130043.  Also cgo_topofstack -> _cgo_topofstack, that vaporized also.

TBR=rsc
CC=golang-codereviews
https://golang.org/cl/150940044
2014-09-25 08:37:04 -07:00