1
0
mirror of https://github.com/golang/go synced 2024-10-04 14:21:21 -06:00
Commit Graph

58 Commits

Author SHA1 Message Date
Russ Cox
72c5d5e756 reflect, runtime: fix crash in GC due to reflect.call + precise GC
Given
        type Outer struct {
                *Inner
                ...
        }
the compiler generates the implementation of (*Outer).M dispatching to
the embedded Inner. The implementation is logically:
        func (p *Outer) M() {
                (p.Inner).M()
        }
but since the only change here is the replacement of one pointer
receiver with another, the actual generated code overwrites the
original receiver with the p.Inner pointer and then jumps to the M
method expecting the *Inner receiver.

During reflect.Value.Call, we create an argument frame and the
associated data structures to describe it to the garbage collector,
populate the frame, call reflect.call to run a function call using
that frame, and then copy the results back out of the frame. The
reflect.call function does a memmove of the frame structure onto the
stack (to set up the inputs), runs the call, and the memmoves the
stack back to the frame structure (to preserve the outputs).

Originally reflect.call did not distinguish inputs from outputs: both
memmoves were for the full stack frame. However, in the case where the
called function was one of these wrappers, the rewritten receiver is
almost certainly a different type than the original receiver. This is
not a problem on the stack, where we use the program counter to
determine the type information and understand that during (*Outer).M
the receiver is an *Outer while during (*Inner).M the receiver in the
same memory word is now an *Inner. But in the statically typed
argument frame created by reflect, the receiver is always an *Outer.
Copying the modified receiver pointer off the stack into the frame
will store an *Inner there, and then if a garbage collection happens
to scan that argument frame before it is discarded, it will scan the
*Inner memory as if it were an *Outer. If the two have different
memory layouts, the collection will intepret the memory incorrectly.

Fix by only copying back the results.

Fixes #7725.

LGTM=khr
R=khr
CC=dave, golang-codereviews
https://golang.org/cl/85180043
2014-04-08 11:11:35 -04:00
Keith Randall
6c7cbf086c runtime: get rid of most uses of REP for copying/zeroing.
REP MOVSQ and REP STOSQ have a really high startup overhead.
Use a Duff's device to do the repetition instead.

benchmark                 old ns/op     new ns/op     delta
BenchmarkClearFat32       7.20          1.60          -77.78%
BenchmarkCopyFat32        6.88          2.38          -65.41%
BenchmarkClearFat64       7.15          3.20          -55.24%
BenchmarkCopyFat64        6.88          3.44          -50.00%
BenchmarkClearFat128      9.53          5.34          -43.97%
BenchmarkCopyFat128       9.27          5.56          -40.02%
BenchmarkClearFat256      13.8          9.53          -30.94%
BenchmarkCopyFat256       13.5          10.3          -23.70%
BenchmarkClearFat512      22.3          18.0          -19.28%
BenchmarkCopyFat512       22.0          19.7          -10.45%
BenchmarkCopyFat1024      36.5          38.4          +5.21%
BenchmarkClearFat1024     35.1          35.0          -0.28%

TODO: use for stack frame zeroing
TODO: REP prefixes are still used for "reverse" copying when src/dst
regions overlap.  Might be worth fixing.

LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews, r
https://golang.org/cl/81370046
2014-04-01 12:51:02 -07:00
Russ Cox
c2dd33a46f cmd/ld: clear unused ctxt before morestack
For non-closure functions, the context register is uninitialized
on entry and will not be used, but morestack saves it and then the
garbage collector treats it as live. This can be a source of memory
leaks if the context register points at otherwise dead memory.
Avoid this by introducing a parallel set of morestack functions
that clear the context register, and use those for the non-closure functions.

I hope this will help with some of the finalizer flakiness, but it probably won't.

Fixes #7244.

LGTM=dvyukov
R=khr, dvyukov
CC=golang-codereviews
https://golang.org/cl/71030044
2014-03-04 13:53:08 -05:00
Keith Randall
da7cf0ba5d runtime: faster memclr on x86.
Use explicit SSE writes instead of REP STOSQ.

benchmark               old ns/op    new ns/op    delta
BenchmarkMemclr5               22            5  -73.62%
BenchmarkMemclr16              27            5  -78.49%
BenchmarkMemclr64              28            6  -76.43%
BenchmarkMemclr256             34            8  -74.94%
BenchmarkMemclr4096           112           84  -24.73%
BenchmarkMemclr65536         1902         1920   +0.95%

LGTM=dvyukov
R=golang-codereviews, dvyukov
CC=golang-codereviews
https://golang.org/cl/60090044
2014-02-06 17:43:22 -08:00
Dmitriy Vyukov
9cbd2fb1aa runtime: remove locks from netpoll hotpaths
Introduces two-phase goroutine parking mechanism -- prepare to park, commit park.
This mechanism does not require backing mutex to protect wait predicate.
Use it in netpoll. See comment in netpoll.goc for details.
This slightly reduces contention between reader, writer and read/write io notifications;
and just eliminates a bunch of mutex operations from hotpaths, thus making then faster.

benchmark                             old ns/op    new ns/op    delta
BenchmarkTCP4ConcurrentReadWrite           2109         1945   -7.78%
BenchmarkTCP4ConcurrentReadWrite-2         1162         1113   -4.22%
BenchmarkTCP4ConcurrentReadWrite-4          798          755   -5.39%
BenchmarkTCP4ConcurrentReadWrite-8          803          748   -6.85%
BenchmarkTCP4Persistent                    9411         9240   -1.82%
BenchmarkTCP4Persistent-2                  5888         5813   -1.27%
BenchmarkTCP4Persistent-4                  4016         3968   -1.20%
BenchmarkTCP4Persistent-8                  3943         3857   -2.18%

R=golang-codereviews, mikioh.mikioh, gobot, iant, rsc
CC=golang-codereviews, khr
https://golang.org/cl/45700043
2014-01-22 11:27:16 +04:00
Aram Hăvărneanu
a46b434931 runtime: add support for GOOS=solaris
R=alex.brainman, dave, jsing, gobot, rsc
CC=golang-codereviews
https://golang.org/cl/35990043
2014-01-17 17:58:10 +13:00
Russ Cox
7276c02b41 runtime, cmd/gc, cmd/ld: ignore method wrappers in recover
Bug #1:

Issue 5406 identified an interesting case:
        defer iface.M()
may end up calling a wrapper that copies an indirect receiver
from the iface value and then calls the real M method. That's
two calls down, not just one, and so recover() == nil always
in the real M method, even during a panic.

[For the purposes of this entire discussion, a wrapper's
implementation is a function containing an ordinary call, not
the optimized tail call form that is somtimes possible. The
tail call does not create a second frame, so it is already
handled correctly.]

Fix this bug by introducing g->panicwrap, which counts the
number of bytes on current stack segment that are due to
wrapper calls that should not count against the recover
check. All wrapper functions must now adjust g->panicwrap up
on entry and back down on exit. This adds slightly to their
expense; on the x86 it is a single instruction at entry and
exit; on the ARM it is three. However, the alternative is to
make a call to recover depend on being able to walk the stack,
which I very much want to avoid. We have enough problems
walking the stack for garbage collection and profiling.
Also, if performance is critical in a specific case, it is already
faster to use a pointer receiver and avoid this kind of wrapper
entirely.

Bug #2:

The old code, which did not consider the possibility of two
calls, already contained a check to see if the call had split
its stack and so the panic-created segment was one behind the
current segment. In the wrapper case, both of the two calls
might split their stacks, so the panic-created segment can be
two behind the current segment.

Fix this by propagating the Stktop.panic flag forward during
stack splits instead of looking backward during recover.

Fixes #5406.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/13367052
2013-09-12 14:00:16 -04:00
Keith Randall
32b770b2c0 runtime: jump to badmcall instead of calling it.
This replaces the mcall frame with the badmcall frame instead of
leaving the mcall frame on the stack and adding the badmcall frame.
Because mcall is no longer on the stack, traceback will now report what
called mcall, which is what we would like to see in this situation.

R=golang-dev, cshapiro
CC=golang-dev
https://golang.org/cl/13012044
2013-08-29 15:53:34 -07:00
Keith Randall
a97a91de06 runtime: Record jmpdefer's argument size.
Fixes bug 6055.

R=golang-dev, bradfitz, dvyukov, khr
CC=golang-dev
https://golang.org/cl/12536045
2013-08-07 14:03:50 -07:00
Keith Randall
5a54696d78 cmd/ld: Put the textflag constants in a separate file.
We can then include this file in assembly to replace
cryptic constants like "7" with meaningful constants
like "(NOPROF|DUPOK|NOSPLIT)".

Converting just pkg/runtime/asm*.s for now.  Dropping NOPROF
and DUPOK from lots of places where they aren't needed.
More .s files to come in a subsequent changelist.

A nonzero number in the textflag field now means
"has not been converted yet".

R=golang-dev, daniel.morsing, rsc, khr
CC=golang-dev
https://golang.org/cl/12568043
2013-08-07 10:23:24 -07:00
Keith Randall
12e46e42ec runtime: don't mark the new call trampolines as NOSPLIT.
They may call other NOSPLIT routines, and that might
overflow the stack.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12563043
2013-08-06 14:33:55 -07:00
Brad Fitzpatrick
598c78967f strings: use runtime assembly for IndexByte
Fixes #3751

R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/12483043
2013-08-05 15:04:05 -07:00
Keith Randall
9cd570680b runtime: reimplement reflect.call to not use stack splitting.
R=golang-dev, r, khr, rsc
CC=golang-dev
https://golang.org/cl/12053043
2013-08-02 13:03:14 -07:00
Brad Fitzpatrick
e2a1bd68b3 bytes: move IndexByte assembly to pkg runtime
Per suggestion from Russ in February. Then strings.IndexByte
can be implemented in terms of the shared code in pkg runtime.

Update #3751

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/12289043
2013-08-01 16:11:19 -07:00
Dmitriy Vyukov
e84d9e1fb3 runtime: do not split stacks in syscall status
Split stack checks (morestack) corrupt g->sched,
but g->sched must be preserved consistent for GC/traceback.
The change implements runtime.notetsleepg function,
which does entersyscall/exitsyscall and is carefully arranged
to not call any split functions in between.

R=rsc
CC=golang-dev
https://golang.org/cl/11575044
2013-07-29 22:22:34 +04:00
Russ Cox
f011282578 runtime: more cgocallback_gofunc work
Debugging the Windows breakage I noticed that SEH
only exists on 386, so we can balance the two stacks
a little more on amd64 and reclaim another word.

Now we're down to just one word consumed by
cgocallback_gofunc, having reclaimed 25% of the
overall budget (4 words out of 16).

Separately, fix windows/386 - the SEH must be on the
m0 stack, as must the saved SP, so we are forced to have
a three-word frame for 386. It matters much less for
386, because there 128 bytes gives 32 words to use.

R=dvyukov, alex.brainman
CC=golang-dev
https://golang.org/cl/11551044
2013-07-24 09:01:57 -04:00
Russ Cox
cefdb9c286 runtime: fix windows build
TBR=golang-dev
CC=golang-dev
https://golang.org/cl/11595045
2013-07-23 22:59:32 -04:00
Russ Cox
dba623b1c7 runtime: reduce frame size for runtime.cgocallback_gofunc
Tying preemption to stack splits means that we have to able to
complete the call to exitsyscall (inside cgocallbackg at least for now)
without any stack split checks, meaning that the whole sequence
has to work within 128 bytes of stack, unless we increase the size
of the red zone. This CL frees up 24 bytes along that critical path
on amd64. (The 32-bit systems have plenty of space because all
their words are smaller.)

R=dvyukov
CC=golang-dev
https://golang.org/cl/11676043
2013-07-23 18:40:02 -04:00
Russ Cox
58f12ffd79 runtime: handle morestack/lessstack in stack trace
If we start a garbage collection on g0 during a
stack split or unsplit, we'll see morestack or lessstack
at the top of the stack. Record an argument frame size
for those, and record that they terminate the stack.

R=golang-dev, dvyukov
CC=golang-dev
https://golang.org/cl/11533043
2013-07-18 16:53:45 -04:00
Russ Cox
9ddfb64365 runtime: record argument size in assembly functions
I have not done the system call stubs in sys_*.s.
I hope to avoid that, because those do not block, so those
frames will not appear in stack traces during garbage
collection.

R=golang-dev, dvyukov, khr
CC=golang-dev
https://golang.org/cl/11360043
2013-07-16 16:24:09 -04:00
Russ Cox
5d363c6357 cmd/ld, runtime: new in-memory symbol table format
Design at http://golang.org/s/go12symtab.

This enables some cleanup of the garbage collector metadata
that will be done in future CLs.

This CL does not move the old symtab and pclntab back into
an unmapped section of the file. That's a bit tricky and will be
done separately.

Fixes #4020.

R=golang-dev, dave, cshapiro, iant, r
CC=golang-dev, nigeltao
https://golang.org/cl/11085043
2013-07-16 09:41:38 -04:00
Russ Cox
fb63e4fefb runtime: make cas64 like cas32 and casp
The current cas64 definition hard-codes the x86 behavior
of updating *old with the new value when the cas fails.
This is inconsistent with cas32 and casp.
Make it consistent.

This means that the cas64 uses will be epsilon less efficient
than they might be, because they have to do an unnecessary
memory load on x86. But so be it. Code clarity and consistency
is more important.

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/10909045
2013-07-12 00:03:32 -04:00
Russ Cox
f0d73fbc7c runtime: use gp->sched.sp for stack overflow check
On x86 it is a few words lower on the stack than m->morebuf.sp
so it is a more precise check. Enabling the check requires recording
a valid gp->sched in reflect.call too. This is a good thing in general,
since it will make stack traces during reflect.call work better, and it
may be useful for preemption too.

R=dvyukov
CC=golang-dev
https://golang.org/cl/10709043
2013-06-27 16:51:06 -04:00
Russ Cox
6fa3c89b77 runtime: record proper goroutine state during stack split
Until now, the goroutine state has been scattered during the
execution of newstack and oldstack. It's all there, and those routines
know how to get back to a working goroutine, but other pieces of
the system, like stack traces, do not. If something does interrupt
the newstack or oldstack execution, the rest of the system can't
understand the goroutine. For example, if newstack decides there
is an overflow and calls throw, the stack tracer wouldn't dump the
goroutine correctly.

For newstack to save a useful state snapshot, it needs to be able
to rewind the PC in the function that triggered the split back to
the beginning of the function. (The PC is a few instructions in, just
after the call to morestack.) To make that possible, we change the
prologues to insert a jmp back to the beginning of the function
after the call to morestack. That is, the prologue used to be roughly:

        TEXT myfunc
                check for split
                jmpcond nosplit
                call morestack
        nosplit:
                sub $xxx, sp

Now an extra instruction is inserted after the call:

        TEXT myfunc
        start:
                check for split
                jmpcond nosplit
                call morestack
                jmp start
        nosplit:
                sub $xxx, sp

The jmp is not executed directly. It is decoded and simulated by
runtime.rewindmorestack to discover the beginning of the function,
and then the call to morestack returns directly to the start label
instead of to the jump instruction. So logically the jmp is still
executed, just not by the cpu.

The prologue thus repeats in the case of a function that needs a
stack split, but against the cost of the split itself, the extra few
instructions are noise. The repeated prologue has the nice effect of
making a stack split double-check that the new stack is big enough:
if morestack happens to return on a too-small stack, we'll now notice
before corruption happens.

The ability for newstack to rewind to the beginning of the function
should help preemption too. If newstack decides that it was called
for preemption instead of a stack split, it now has the goroutine state
correctly paused if rescheduling is needed, and when the goroutine
can run again, it can return to the start label on its original stack
and re-execute the split check.

Here is an example of a split stack overflow showing the full
trace, without any special cases in the stack printer.
(This one was triggered by making the split check incorrect.)

runtime: newstack framesize=0x0 argsize=0x18 sp=0x6aebd0 stack=[0x6b0000, 0x6b0fa0]
        morebuf={pc:0x69f5b sp:0x6aebd8 lr:0x0}
        sched={pc:0x68880 sp:0x6aebd0 lr:0x0 ctxt:0x34e700}
runtime: split stack overflow: 0x6aebd0 < 0x6b0000
fatal error: runtime: split stack overflow

goroutine 1 [stack split]:
runtime.mallocgc(0x290, 0x100000000, 0x1)
        /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:21 fp=0x6aebd8
runtime.new()
        /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:682 +0x5b fp=0x6aec08
go/build.(*Context).Import(0x5ae340, 0xc210030c71, 0xa, 0xc2100b4380, 0x1b, ...)
        /Users/rsc/g/go/src/pkg/go/build/build.go:424 +0x3a fp=0x6b00a0
main.loadImport(0xc210030c71, 0xa, 0xc2100b4380, 0x1b, 0xc2100b42c0, ...)
        /Users/rsc/g/go/src/cmd/go/pkg.go:249 +0x371 fp=0x6b01a8
main.(*Package).load(0xc21017c800, 0xc2100b42c0, 0xc2101828c0, 0x0, 0x0, ...)
        /Users/rsc/g/go/src/cmd/go/pkg.go:431 +0x2801 fp=0x6b0c98
main.loadPackage(0x369040, 0x7, 0xc2100b42c0, 0x0)
        /Users/rsc/g/go/src/cmd/go/pkg.go:709 +0x857 fp=0x6b0f80
----- stack segment boundary -----
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc2100e6c00, 0xc2100e5750, ...)
        /Users/rsc/g/go/src/cmd/go/build.go:539 +0x437 fp=0x6b14a0
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc21015b400, 0x2, ...)
        /Users/rsc/g/go/src/cmd/go/build.go:528 +0x1d2 fp=0x6b1658
main.(*builder).test(0xc2100902a0, 0xc210092000, 0x0, 0x0, 0xc21008ff60, ...)
        /Users/rsc/g/go/src/cmd/go/test.go:622 +0x1b53 fp=0x6b1f68
----- stack segment boundary -----
main.runTest(0x5a6b20, 0xc21000a020, 0x2, 0x2)
        /Users/rsc/g/go/src/cmd/go/test.go:366 +0xd09 fp=0x6a5cf0
main.main()
        /Users/rsc/g/go/src/cmd/go/main.go:161 +0x4f9 fp=0x6a5f78
runtime.main()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:183 +0x92 fp=0x6a5fa0
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1266 fp=0x6a5fa8

And here is a seg fault during oldstack:

SIGSEGV: segmentation violation
PC=0x1b2a6

runtime.oldstack()
        /Users/rsc/g/go/src/pkg/runtime/stack.c:159 +0x76
runtime.lessstack()
        /Users/rsc/g/go/src/pkg/runtime/asm_amd64.s:270 +0x22

goroutine 1 [stack unsplit]:
fmt.(*pp).printArg(0x2102e64e0, 0xe5c80, 0x2102c9220, 0x73, 0x0, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:818 +0x3d3 fp=0x221031e6f8
fmt.(*pp).doPrintf(0x2102e64e0, 0x12fb20, 0x2, 0x221031eb98, 0x1, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:1183 +0x15cb fp=0x221031eaf0
fmt.Sprintf(0x12fb20, 0x2, 0x221031eb98, 0x1, 0x1, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:234 +0x67 fp=0x221031eb40
flag.(*stringValue).String(0x2102c9210, 0x1, 0x0)
        /Users/rsc/g/go/src/pkg/flag/flag.go:180 +0xb3 fp=0x221031ebb0
flag.(*FlagSet).Var(0x2102f6000, 0x293d38, 0x2102c9210, 0x143490, 0xa, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:633 +0x40 fp=0x221031eca0
flag.(*FlagSet).StringVar(0x2102f6000, 0x2102c9210, 0x143490, 0xa, 0x12fa60, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:550 +0x91 fp=0x221031ece8
flag.(*FlagSet).String(0x2102f6000, 0x143490, 0xa, 0x12fa60, 0x0, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:563 +0x87 fp=0x221031ed38
flag.String(0x143490, 0xa, 0x12fa60, 0x0, 0x161950, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:570 +0x6b fp=0x221031ed80
testing.init()
        /Users/rsc/g/go/src/pkg/testing/testing.go:-531 +0xbb fp=0x221031edc0
strings_test.init()
        /Users/rsc/g/go/src/pkg/strings/strings_test.go:1115 +0x62 fp=0x221031ef70
main.init()
        strings/_test/_testmain.go:90 +0x3d fp=0x221031ef78
runtime.main()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:180 +0x8a fp=0x221031efa0
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1269 fp=0x221031efa8

goroutine 2 [runnable]:
runtime.MHeap_Scavenger()
        /Users/rsc/g/go/src/pkg/runtime/mheap.c:438
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1269
created by runtime.main
        /Users/rsc/g/go/src/pkg/runtime/proc.c:166

rax     0x23ccc0
rbx     0x23ccc0
rcx     0x0
rdx     0x38
rdi     0x2102c0170
rsi     0x221032cfe0
rbp     0x221032cfa0
rsp     0x7fff5fbff5b0
r8      0x2102c0120
r9      0x221032cfa0
r10     0x221032c000
r11     0x104ce8
r12     0xe5c80
r13     0x1be82baac718
r14     0x13091135f7d69200
r15     0x0
rip     0x1b2a6
rflags  0x10246
cs      0x2b
fs      0x0
gs      0x0

Fixes #5723.

R=r, dvyukov, go.peter.90, dave, iant
CC=golang-dev
https://golang.org/cl/10360048
2013-06-27 11:32:01 -04:00
Ian Lance Taylor
0627248a1f runtime: update runtime·gogo comment in asm files
R=golang-dev, minux.ma
CC=golang-dev
https://golang.org/cl/10244043
2013-06-12 15:05:10 -07:00
Russ Cox
d67e7e3acf runtime: add lr, ctxt, ret to Gobuf
Add gostartcall and gostartcallfn.
The old gogocall = gostartcall + gogo.
The old gogocallfn = gostartcallfn + gogo.

R=dvyukov, minux.ma
CC=golang-dev
https://golang.org/cl/10036044
2013-06-12 15:22:26 -04:00
Russ Cox
6120ef0799 runtime: rename _rt0_$GOARCH to _rt0_go
There's no reason to use a different name on each architecture,
and doing so makes it impossible for portable code to refer to
the original Go runtime entry point. Rename it _rt0_go everywhere.

This is a global search and replace only.

R=golang-dev, bradfitz, minux.ma
CC=golang-dev
https://golang.org/cl/10196043
2013-06-11 16:49:24 -04:00
Russ Cox
528534c1d4 runtime: fix comments (g->gobuf became g->sched long ago)
Should reduce size of CL 9868044.

R=golang-dev, ality
CC=golang-dev
https://golang.org/cl/10045043
2013-06-05 07:16:53 -04:00
Dmitriy Vyukov
f5becf4233 runtime: add stackguard0 to G
This is part of preemptive scheduler.
stackguard0 is checked in split stack checks and can be set to StackPreempt.
stackguard is not set to StackPreempt (holds the original value).

R=golang-dev, daniel.morsing, iant
CC=golang-dev
https://golang.org/cl/9875043
2013-06-03 12:28:24 +04:00
Keith Randall
ee66972dce runtime: Optimize aeshash a bit. Use a better predicted branch
for checking for page boundary.  Also avoid boundary check
when >=16 bytes are hashed.

benchmark                        old ns/op    new ns/op    delta
BenchmarkHashStringSpeed                23           22   -0.43%
BenchmarkHashBytesSpeed                 44           42   -3.61%
BenchmarkHashStringArraySpeed           71           68   -4.05%

R=iant, khr
CC=gobot, golang-dev, google
https://golang.org/cl/9123046
2013-05-15 09:40:14 -07:00
Keith Randall
b3946dc119 runtime/bytes: fast Compare for byte arrays and strings.
Uses SSE instructions to process 16 bytes at a time.

fixes #5354

R=bradfitz, google
CC=golang-dev
https://golang.org/cl/8853048
2013-05-14 16:05:51 -07:00
Keith Randall
3d5daa2319 runtime: Implement faster equals for strings and bytes.
(amd64)
benchmark           old ns/op    new ns/op    delta
BenchmarkEqual0            16            6  -63.15%
BenchmarkEqual9            22            7  -65.37%
BenchmarkEqual32           36            9  -74.91%
BenchmarkEqual4K         2187          120  -94.51%

benchmark            old MB/s     new MB/s  speedup
BenchmarkEqual9        392.22      1134.38    2.89x
BenchmarkEqual32       866.72      3457.39    3.99x
BenchmarkEqual4K      1872.73     33998.87   18.15x

(386)
benchmark           old ns/op    new ns/op    delta
BenchmarkEqual0            16            5  -63.85%
BenchmarkEqual9            22            7  -67.84%
BenchmarkEqual32           34           12  -64.94%
BenchmarkEqual4K         2196          113  -94.85%

benchmark            old MB/s     new MB/s  speedup
BenchmarkEqual9        405.81      1260.18    3.11x
BenchmarkEqual32       919.55      2631.21    2.86x
BenchmarkEqual4K      1864.85     36072.54   19.34x

Update #3751

R=bradfitz, r, khr, dave, remyoudompheng, fullung, minux.ma, ality
CC=golang-dev
https://golang.org/cl/8056043
2013-04-02 16:26:15 -07:00
Russ Cox
6a70f9d073 runtime: pass setmg function to cgo_init
This keeps the logic about how to set the thread-local variables
m and g in code compiled and linked by the gc toolchain,
an important property for upcoming cgo changes.

It's also just a nice cleanup: one less place to update when
these details change.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/7560048
2013-03-25 18:14:02 -04:00
Russ Cox
07720b67b3 build: update assembly variable names for vet
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/7834046
2013-03-22 12:57:55 -04:00
Keith Randall
db53d97ac4 runtime: Use aligned loads for AES key schedule.
R=rsc, minux.ma, khr
CC=golang-dev
https://golang.org/cl/7763050
2013-03-20 14:34:26 -07:00
Keith Randall
a5d4024139 runtime: faster & safer hash function
Uses AES hardware instructions on 386/amd64 to implement
a fast hash function.  Incorporates a random key to
thwart hash collision DOS attacks.
Depends on CL#7548043 for new assembly instructions.

Update #3885
Helps some by making hashing faster.  Go time drops from
0.65s to 0.51s.

R=rsc, r, bradfitz, remyoudompheng, khr, dsymonds, minux.ma, elias.naur
CC=golang-dev
https://golang.org/cl/7543043
2013-03-12 10:47:44 -07:00
Russ Cox
36b414f639 runtime: change amd64 startup convention
Now the default startup is that the program begins at _rt0_amd64_$GOOS,
which sets DI = argc, SI = argv and jumps to _rt0_amd64.

This makes the _rt0_amd64 entry match the expected semantics for
the standard C "main" function, which we can now provide for use when
linking against a standard C library.

R=golang-dev, devon.odell, minux.ma
CC=golang-dev
https://golang.org/cl/7525043
2013-03-06 15:03:04 -05:00
Dmitriy Vyukov
add3349867 runtime: add atomic xchg64
It will be handy for network poller.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/7429048
2013-03-05 09:46:52 +02:00
Russ Cox
f8d49b509b runtime/cgo: make symbol naming consistent
The naming in this package is a disaster.
Make it all consistent.

Remove some 'static' from functions that will
be referred to from other files soon.

This CL is purely renames using global search and replace.

Submitting separately so that real changes will not
be drowned out by these renames in future CLs.

TBR=iant
CC=golang-dev
https://golang.org/cl/7416046
2013-02-28 16:24:38 -05:00
Russ Cox
3d2dfc5a7b runtime: add cgocallback_gofunc that can call Go func value
For now, all the callbacks from C use top-level Go functions,
so they use the equivalent C function pointer, and will continue
to do so. But perhaps some day this will be useful for calling
a Go func value (at least if the type is already known).

More importantly, the Windows callback code needs to be able
to use cgocallback_gofunc to call a Go func value.
Should fix the Windows build.

R=ken2
CC=golang-dev
https://golang.org/cl/7388049
2013-02-22 16:08:56 -05:00
Russ Cox
6066fdcf38 cmd/6g, cmd/8g: switch to DX for indirect call block
runtime: add context argument to gogocall

Too many other things use AX, and at least one
(stack zeroing) cannot be moved onto a different
register. Use the less special DX instead.

Preparation for step 2 of http://golang.org/s/go11func.
Nothing interesting here, just split out so that we can
see it's correct before moving on.

R=ken2
CC=golang-dev
https://golang.org/cl/7395050
2013-02-22 10:47:54 -05:00
Russ Cox
1903ad7189 cmd/gc, reflect, runtime: switch to indirect func value representation
Step 1 of http://golang.org/s/go11func.

R=golang-dev, r, daniel.morsing, remyoudompheng
CC=golang-dev
https://golang.org/cl/7393045
2013-02-21 17:01:13 -05:00
Russ Cox
6c976393ae runtime: allow cgo callbacks on non-Go threads
Fixes #4435.

R=golang-dev, iant, alex.brainman, minux.ma, dvyukov
CC=golang-dev
https://golang.org/cl/7304104
2013-02-20 17:48:23 -05:00
Alex Brainman
7f075ece42 runtime: increase stack frame during cgo call on windows/amd64
Fixes #3945.

R=golang-dev, minux.ma
CC=golang-dev, vcc.163
https://golang.org/cl/6490056
2012-09-03 12:12:51 +10:00
Akshat Kumar
a72bebf6e1 src: Add support for 64-bit version of Plan 9
This set of changes extends the Plan 9 support
to include the AMD64 architecture and should
work on all versions of Plan 9.

R=golang-dev, rminnich, noah.evans, rsc, minux.ma, npe
CC=akskuma, golang-dev, jfflore, noah.evans
https://golang.org/cl/6479052
2012-08-31 13:21:13 -04:00
Russ Cox
d42495aa80 cmd/cc: add PREFETCH built-in (like SET, USED)
This makes it possible to inline the prefetch of upcoming
memory addresses during garbage collection, instead of
needing to flush registers, make a function call, and
reload registers.  On garbage collection-heavy workloads,
this results in a 5% speedup.

Fixes #3493.

R=dvyukov, ken, r, dave
CC=golang-dev
https://golang.org/cl/5990066
2012-05-02 16:22:56 -04:00
Russ Cox
35d260fa4c 6a, 6l: add PREFETCH instructions
R=ken2
CC=golang-dev
https://golang.org/cl/5989073
2012-04-10 10:09:09 -04:00
Dmitriy Vyukov
4667571619 runtime: add 64-bit atomics
This is factored out part of:
https://golang.org/cl/5279048/
(Parallel GC)

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5985047
2012-04-05 18:47:43 +04:00
Russ Cox
9e5db8c90a 5l, 6l, 8l: fix stack split logic for stacks near default segment size
Fixes #3310.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/5823051
2012-03-15 15:22:30 -04:00
Russ Cox
36aa7d4d14 runtime: inline calls to notok
When a very low-level system call that should never fail
does fail, we call notok, which crashes the program.
Often, we are then left with only the program counter as
information about the crash, and it is in notok.
Instead, inline calls to notok (it is just one instruction
on most systems) so that the program counter will
tell us which system call is unhappy.

R=golang-dev, gri, minux.ma, bradfitz
CC=golang-dev
https://golang.org/cl/5792048
2012-03-08 14:03:56 -05:00