1
0
mirror of https://github.com/golang/go synced 2024-10-04 22:31:22 -06:00
Commit Graph

314 Commits

Author SHA1 Message Date
Keith Randall
a5d4024139 runtime: faster & safer hash function
Uses AES hardware instructions on 386/amd64 to implement
a fast hash function.  Incorporates a random key to
thwart hash collision DOS attacks.
Depends on CL#7548043 for new assembly instructions.

Update #3885
Helps some by making hashing faster.  Go time drops from
0.65s to 0.51s.

R=rsc, r, bradfitz, remyoudompheng, khr, dsymonds, minux.ma, elias.naur
CC=golang-dev
https://golang.org/cl/7543043
2013-03-12 10:47:44 -07:00
Dmitriy Vyukov
c211884731 runtime: add network polling support into scheduler
This is a part of the bigger change that moves network poller into runtime:
https://golang.org/cl/7326051/

R=golang-dev, bradfitz, mikioh.mikioh, rsc
CC=golang-dev
https://golang.org/cl/7448048
2013-03-12 21:14:26 +04:00
Dmitriy Vyukov
6ee739d7e9 runtime: fix deadlock detector false negative
The issue was that scvg is assigned *after* the scavenger goroutine is started,
so when the scavenger calls entersyscall() the g==scvg check can fail.
Fixes #5025.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/7629045
2013-03-12 17:21:44 +04:00
Dmitriy Vyukov
433824d808 runtime: fix misaligned 64-bit atomic
Fixes #4869.
Fixes #5007.
Update #5005.

R=golang-dev, 0xe2.0x9a.0x9b, bradfitz
CC=golang-dev
https://golang.org/cl/7534044
2013-03-10 20:46:11 +04:00
Rémy Oudompheng
1b8f51c917 runtime: fix integer overflow in amd64 memmove.
Fixes #4981.

R=bradfitz, fullung, rsc, minux.ma
CC=golang-dev
https://golang.org/cl/7474047
2013-03-09 00:41:03 +01:00
Akshat Kumar
a566deace1 syscall: Plan 9: use lightweight errstr in entersyscall mode
Change 231af8ac63aa (CL 7314062) made runtime.enteryscall()
set m->mcache = nil, which means that we can no longer use
syscall.errstr in syscall.Syscall and syscall.Syscall6, since it
requires a new buffer to be allocated for holding the error string.
Instead, we use pre-allocated per-M storage to hold error strings
from syscalls made while in entersyscall mode, and call
runtime.findnull to calculate the lengths.

Fixes #4994.

R=rsc, rminnich, ality, dvyukov, rminnich, r
CC=golang-dev
https://golang.org/cl/7567043
2013-03-08 00:54:44 +01:00
Russ Cox
e0deb2ef7f undo CL 7301062 / 9742f722b558
broke arm garbage collector

traceback_arm fails with a missing pc. It needs CL 7494043.
But that only makes the build break later, this time with
"invalid freelist". Roll back until it can be fixed correctly.

««« original CL description
runtime: restrict stack root scan to locals and arguments

R=rsc
CC=golang-dev
https://golang.org/cl/7301062
»»»

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/7493044
2013-03-05 15:36:40 -05:00
Dmitriy Vyukov
add3349867 runtime: add atomic xchg64
It will be handy for network poller.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/7429048
2013-03-05 09:46:52 +02:00
Dmitriy Vyukov
d0c11d20b8 runtime: declare addtimer/deltimer in runtime.h
In preparation for integrated network poller
(https://golang.org/cl/7326051),
this is required to handle deadlines.

R=golang-dev, remyoudompheng, rsc
CC=golang-dev
https://golang.org/cl/7446047
2013-03-05 09:38:15 +02:00
Carl Shapiro
db018bf77c runtime: restrict stack root scan to locals and arguments
R=rsc
CC=golang-dev
https://golang.org/cl/7301062
2013-03-04 19:48:50 -08:00
Russ Cox
3611553c3b runtime: add atomics to fix arm
R=golang-dev, minux.ma
CC=golang-dev
https://golang.org/cl/7429046
2013-03-01 14:57:05 -05:00
Russ Cox
e6a3e22c75 runtime: start all threads with runtime.mstart
Putting the M initialization in multiple places will not scale.
Various code assumes mstart is the start already. Make it so.

R=golang-dev, devon.odell
CC=golang-dev
https://golang.org/cl/7420048
2013-03-01 11:44:43 -05:00
Russ Cox
d0d7416d3f runtime: more build fixing
Move the mstartfn into its own field.
Simpler, more likely to be correct.

R=golang-dev, devon.odell
CC=golang-dev
https://golang.org/cl/7414046
2013-03-01 09:24:17 -05:00
Russ Cox
c5f694a5c9 runtime: fix new scheduler on freebsd, windows
R=devon.odell
CC=golang-dev
https://golang.org/cl/7443046
2013-03-01 08:30:11 -05:00
Dmitriy Vyukov
779c45a507 runtime: improved scheduler
Distribute runnable queues, memory cache
and cache of dead G's per processor.
Faster non-blocking syscall enter/exit.
More conservative worker thread blocking/unblocking.

R=dave, bradfitz, remyoudompheng, rsc
CC=golang-dev
https://golang.org/cl/7314062
2013-03-01 13:49:16 +02:00
Dmitriy Vyukov
6cdfb00f4e runtime: more changes in preparation to the new scheduler
add per-P cache of dead G's
add global runnable queue (not used for now)
add list of idle P's (not used for now)

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/7397061
2013-02-27 21:17:53 +02:00
Jan Ziak
a656f82071 runtime: precise garbage collection of channels
This changeset adds a mostly-precise garbage collection of channels.
The garbage collection support code in the linker isn't recognizing
channel types yet.

Fixes issue http://stackoverflow.com/questions/14712586/memory-consumption-skyrocket

R=dvyukov, rsc, bradfitz
CC=dave, golang-dev, minux.ma, remyoudompheng
https://golang.org/cl/7307086
2013-02-25 15:58:23 -05:00
Dmitriy Vyukov
353ce60f6e runtime: implement local work queues (in preparation for new scheduler)
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/7402047
2013-02-23 08:48:02 +04:00
Russ Cox
6066fdcf38 cmd/6g, cmd/8g: switch to DX for indirect call block
runtime: add context argument to gogocall

Too many other things use AX, and at least one
(stack zeroing) cannot be moved onto a different
register. Use the less special DX instead.

Preparation for step 2 of http://golang.org/s/go11func.
Nothing interesting here, just split out so that we can
see it's correct before moving on.

R=ken2
CC=golang-dev
https://golang.org/cl/7395050
2013-02-22 10:47:54 -05:00
Russ Cox
1903ad7189 cmd/gc, reflect, runtime: switch to indirect func value representation
Step 1 of http://golang.org/s/go11func.

R=golang-dev, r, daniel.morsing, remyoudompheng
CC=golang-dev
https://golang.org/cl/7393045
2013-02-21 17:01:13 -05:00
Carl Shapiro
f466617a62 cmd/5g, cmd/5l, cmd/6l, cmd/8l, cmd/gc, cmd/ld, runtime: accurate args and locals information
Previously, the func structure contained an inaccurate value for
the args member and a 0 value for the locals member.

This change populates the func structure with args and locals
values computed by the compiler.  The number of args was
already available in the ATEXT instruction.  The number of
locals is now passed through in the new ALOCALS instruction.

This change also switches the unit of args and locals to be
bytes, just like the frame member, instead of 32-bit words.

R=golang-dev, bradfitz, cshapiro, dave, rsc
CC=golang-dev
https://golang.org/cl/7399045
2013-02-21 12:52:26 -08:00
Dmitriy Vyukov
a0955a2aa2 runtime: split minit() to mpreinit() and minit()
mpreinit() is called on the parent thread and with mcache (can allocate memory),
minit() is called on the child thread and can not allocate memory.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/7389043
2013-02-21 16:24:38 +04:00
Russ Cox
6c976393ae runtime: allow cgo callbacks on non-Go threads
Fixes #4435.

R=golang-dev, iant, alex.brainman, minux.ma, dvyukov
CC=golang-dev
https://golang.org/cl/7304104
2013-02-20 17:48:23 -05:00
Dmitriy Vyukov
e25f19a638 runtime: introduce entersyscallblock()
In preparation for the new scheduler.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/7386044
2013-02-20 20:21:45 +04:00
Carl Shapiro
7f9c02a10d runtime: add conversion specifier to printf for char values
R=r, golang-dev
CC=golang-dev
https://golang.org/cl/7327053
2013-02-19 18:05:44 -08:00
Russ Cox
c7f7bbbf03 runtime: fix build on linux
In addition to the compile failure fixed in signal*.c,
preserving the signal mask led to very strange crashes.
Testing shows that looking for SIG_IGN is all that
matters to get along with nohup, so reintroduce
sigset_zero instead of trying to preserve the signal mask.

TBR=iant
CC=golang-dev
https://golang.org/cl/7323067
2013-02-15 12:18:33 -05:00
Russ Cox
f3407f445d runtime: fix running under nohup
There are two ways nohup(1) might be implemented:
it might mask away the signal, or it might set the handler
to SIG_IGN, both of which are inherited across fork+exec.
So two fixes:

* Make sure to preserve the inherited signal mask at
minit instead of clearing it.

* If the SIGHUP handler is SIG_IGN, leave it that way.

Fixes #4491.

R=golang-dev, mikioh.mikioh, iant
CC=golang-dev
https://golang.org/cl/7308102
2013-02-15 11:18:55 -05:00
Dmitriy Vyukov
0a40cd2661 runtime/race: switch to explicit race context instead of goroutine id's
Removes limit on maximum number of goroutines ever existed.
code.google.com/p/goexecutor tests now pass successfully.
Also slightly improves performance.
Before: $ time ./flate.test -test.short
real	0m9.314s
After:  $ time ./flate.test -test.short
real	0m8.958s
Fixes #4286.
The runtime is built from llvm rev 174312.

R=rsc
CC=golang-dev
https://golang.org/cl/7218044
2013-02-06 11:40:54 +04:00
Russ Cox
b0a29f393b runtime: cgo-related fixes
* Separate internal and external LockOSThread, for cgo safety.
* Show goroutine that made faulting cgo call.
* Never start a panic due to a signal caused by a cgo call.

Fixes #3774.
Fixes #3775.
Fixes #3797.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/7228081
2013-02-01 08:34:41 -08:00
Jan Ziak
4da6b36fbf runtime: local allocation in mprof.goc
Binary data in mprof.goc may prevent the garbage collector from freeing
memory blocks. This patch replaces all calls to runtime·mallocgc() with
calls to an allocator private to mprof.goc, thus making the private
memory invisible to the garbage collector. The addrhash variable is
moved outside of the .bss section.

R=golang-dev, dvyukov, rsc, minux.ma
CC=dave, golang-dev, remyoudompheng
https://golang.org/cl/7135063
2013-01-30 09:01:31 -08:00
Akshat Kumar
c74f3c4576 runtime: add support for panic/recover in Plan 9 note handler
This change also resolves some issues with note handling: we now make
sure that there is enough room at the bottom of every goroutine to
execute the note handler, and the `exitstatus' is no longer a global
entity, which resolves some race conditions.

R=rminnich, npe, rsc, ality
CC=golang-dev
https://golang.org/cl/6569068
2013-01-30 02:53:56 -08:00
Dmitriy Vyukov
81221f512d runtime: dump the full stack of a throwing goroutine
Useful for debugging of runtime bugs.
+ Do not print "stack segment boundary" unless GOTRACEBACK>1.
+ Do not traceback system goroutines unless GOTRACEBACK>1.

R=rsc, minux.ma
CC=golang-dev
https://golang.org/cl/7098050
2013-01-29 14:57:11 +04:00
Shenghou Ma
4019d0e424 runtime: avoid defining the same variable in more than one translation unit
For gccgo runtime and Darwin where -fno-common is the default.

R=iant, dave
CC=golang-dev
https://golang.org/cl/7094061
2013-01-26 09:57:06 +08:00
Dmitriy Vyukov
f82db7d9e4 runtime: less aggressive per-thread stack segment caching
Introduce global stack segment cache and limit per-thread cache size.
This greatly reduces StackSys memory on workloads that create lots of threads.

benchmark                      old ns/op    new ns/op    delta
BenchmarkStackGrowth                 665          656   -1.35%
BenchmarkStackGrowth-2               333          328   -1.50%
BenchmarkStackGrowth-4               224          172  -23.21%
BenchmarkStackGrowth-8               124           91  -26.13%
BenchmarkStackGrowth-16               82           47  -41.94%
BenchmarkStackGrowth-32               73           40  -44.79%

BenchmarkStackGrowthDeep           97231        94391   -2.92%
BenchmarkStackGrowthDeep-2         47230        58562  +23.99%
BenchmarkStackGrowthDeep-4         24993        49356  +97.48%
BenchmarkStackGrowthDeep-8         15105        30072  +99.09%
BenchmarkStackGrowthDeep-16        10005        15623  +56.15%
BenchmarkStackGrowthDeep-32        12517        13069   +4.41%

TestStackMem#1,MB                  310          12       -96.13%
TestStackMem#2,MB                  296          14       -95.27%
TestStackMem#3,MB                  479          14       -97.08%

TestStackMem#1,sec                 3.22         2.26     -29.81%
TestStackMem#2,sec                 2.43         2.15     -11.52%
TestStackMem#3,sec                 2.50         2.38      -4.80%

R=sougou, no.smile.face, rsc
CC=golang-dev, msolomon
https://golang.org/cl/7029044
2013-01-10 09:57:06 +04:00
Ian Lance Taylor
63bee953a2 runtime: always incorporate hash seed at start of hash computation
Otherwise we can get predictable collisions.

R=golang-dev, dave, patrick, rsc
CC=golang-dev
https://golang.org/cl/7051043
2013-01-04 07:53:42 -08:00
Russ Cox
0de71619ce runtime: aggregate defer allocations
benchmark             old ns/op    new ns/op    delta
BenchmarkDefer              165          113  -31.52%
BenchmarkDefer10            155          103  -33.55%
BenchmarkDeferMany          216          158  -26.85%

benchmark            old allocs   new allocs    delta
BenchmarkDefer                1            0  -100.00%
BenchmarkDefer10              1            0  -100.00%
BenchmarkDeferMany            1            0  -100.00%

benchmark             old bytes    new bytes    delta
BenchmarkDefer               64            0  -100.00%
BenchmarkDefer10             64            0  -100.00%
BenchmarkDeferMany           64           66    3.12%

Fixes #2364.

R=ken2
CC=golang-dev
https://golang.org/cl/7001051
2012-12-22 14:54:39 -05:00
Jingcheng Zhang
70e967b7bc runtime: use "mp" and "gp" instead of "m" and "g" for local variable name to avoid confusion with the global "m" and "g".
R=golang-dev, minux.ma, rsc
CC=bradfitz, golang-dev
https://golang.org/cl/6939064
2012-12-19 00:30:29 +08:00
Jan Ziak
51b8edcb37 runtime: use reflect·call() to enter the function gc()
Garbage collection code (to be merged later) is calling functions
which have many local variables. This increases the probability that
the stack capacity won't be big enough to hold the local variables.
So, start gc() on a bigger stack to eliminate a potentially large number
of calls to runtime·morestack().

R=rsc, remyoudompheng, dsymonds, minux.ma, iant, iant
CC=golang-dev
https://golang.org/cl/6846044
2012-11-27 13:04:59 -05:00
Ian Lance Taylor
e9a3087e29 runtime, runtime/cgo: track memory allocated by non-Go code
Otherwise a poorly timed GC can collect the memory before it
is returned to the Go program.

R=golang-dev, dave, dvyukov, minux.ma
CC=golang-dev
https://golang.org/cl/6819119
2012-11-10 11:19:06 -08:00
Jan Ziak
5c1422afab runtime: move Itab to runtime.h
The 'type' field of Itab will be used by the garbage collector.

R=rsc
CC=golang-dev
https://golang.org/cl/6815059
2012-11-01 13:13:20 -04:00
Dmitriy Vyukov
320df44f04 runtime: switch to 64-bit goroutine ids
Fixes #4275.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6759053
2012-10-26 10:13:06 +04:00
Jan Ziak
4a191c2c1b runtime: store types of allocated objects
R=rsc
CC=golang-dev
https://golang.org/cl/6569057
2012-10-21 17:41:32 -04:00
Nigel Tao
90ad6a2d11 runtime: update comment for the "extern register" variables g and m.
R=rsc, minux.ma, ality
CC=dave, golang-dev
https://golang.org/cl/6620050
2012-10-19 11:02:32 +11:00
Dmitriy Vyukov
2f6cbc74f1 race: runtime changes
This is a part of a bigger change that adds data race detection feature:
https://golang.org/cl/6456044

R=rsc
CC=gobot, golang-dev
https://golang.org/cl/6535050
2012-10-07 22:05:32 +04:00
Dmitriy Vyukov
4cc7bf326a pprof: add goroutine blocking profiling
The profiler collects goroutine blocking information similar to Google Perf Tools.
You may see an example of the profile (converted to svg) attached to
http://code.google.com/p/go/issues/detail?id=3946
The public API changes are:
+pkg runtime, func BlockProfile([]BlockProfileRecord) (int, bool)
+pkg runtime, func SetBlockProfileRate(int)
+pkg runtime, method (*BlockProfileRecord) Stack() []uintptr
+pkg runtime, type BlockProfileRecord struct
+pkg runtime, type BlockProfileRecord struct, Count int64
+pkg runtime, type BlockProfileRecord struct, Cycles int64
+pkg runtime, type BlockProfileRecord struct, embedded StackRecord

R=rsc, dave, minux.ma, r
CC=gobot, golang-dev, r, remyoudompheng
https://golang.org/cl/6443115
2012-10-06 12:56:04 +04:00
Russ Cox
10ea6519e4 build: make int 64 bits on amd64
The assembly offsets were converted mechanically using
code.google.com/p/rsc/cmd/asmlint. The instruction
changes were done by hand.

Fixes #2188.

R=iant, r, bradfitz, remyoudompheng
CC=golang-dev
https://golang.org/cl/6550058
2012-09-24 20:57:01 -04:00
Jan Ziak
f8c58373e5 runtime: add types to MSpan
R=rsc
CC=golang-dev
https://golang.org/cl/6554060
2012-09-24 20:08:05 -04:00
Russ Cox
0b08c9483f runtime: prepare for 64-bit ints
This CL makes the runtime understand that the type of
the len or cap of a map, slice, or string is 'int', not 'int32',
and it is also careful to distinguish between function arguments
and results of type 'int' vs type 'int32'.

In the runtime, the new typedefs 'intgo' and 'uintgo' refer
to Go int and uint. The C types int and uint continue to be
unavailable (cause intentional compile errors).

This CL does not change the meaning of int, but it should make
the eventual change of the meaning of int on amd64 a bit
smoother.

Update #2188.

R=iant, r, dave, remyoudompheng
CC=golang-dev
https://golang.org/cl/6551067
2012-09-24 14:58:34 -04:00
Shenghou Ma
b151af1f36 runtime: fix mmap comments
We only pass lower 32 bits of file offset to asm routine.

R=r, dave, rsc
CC=golang-dev
https://golang.org/cl/6499118
2012-09-21 13:50:02 +08:00
Dmitriy Vyukov
f20fd87384 runtime: refactor goroutine blocking
The change is a preparation for the new scheduler.
It introduces runtime.park() function,
that will atomically unlock the mutex and park the goroutine.
It will allow to remove the racy readyonstop flag
that is difficult to implement w/o the global scheduler mutex.

R=rsc, remyoudompheng, dave
CC=golang-dev
https://golang.org/cl/6501077
2012-09-18 21:15:46 +04:00
Jan Ziak
54193689cc cmd/ld: fix compilation when GOARCH != GOHOSTARCH
R=rsc, dave, minux.ma
CC=golang-dev
https://golang.org/cl/6493123
2012-09-17 17:18:21 -04:00
Ivan Krasin
5287175ad9 runtime: add vdso support for linux/amd64. Fixes issue 1933.
R=iant, imkrasin, krasin, iant, minux.ma, rsc, nigeltao, r, fullung
CC=golang-dev
https://golang.org/cl/6454046
2012-08-31 18:07:04 -04:00
Shenghou Ma
0157c72d13 runtime: inline several float64 routines to speed up complex128 division
Depends on CL 6197045.

Result obtained on Core i7 620M, Darwin/amd64:
benchmark                       old ns/op    new ns/op    delta
BenchmarkComplex128DivNormal           57           28  -50.78%
BenchmarkComplex128DivNisNaN           49           15  -68.90%
BenchmarkComplex128DivDisNaN           49           15  -67.88%
BenchmarkComplex128DivNisInf           40           12  -68.50%
BenchmarkComplex128DivDisInf           33           13  -61.06%

Result obtained on Core i7 620M, Darwin/386:
benchmark                       old ns/op    new ns/op    delta
BenchmarkComplex128DivNormal           89           50  -44.05%
BenchmarkComplex128DivNisNaN          307          802  +161.24%
BenchmarkComplex128DivDisNaN          309          788  +155.02%
BenchmarkComplex128DivNisInf          278          237  -14.75%
BenchmarkComplex128DivDisInf           46           22  -52.46%

Result obtained on 700MHz OMAP4460, Linux/ARM:
benchmark                       old ns/op    new ns/op    delta
BenchmarkComplex128DivNormal         1557          465  -70.13%
BenchmarkComplex128DivNisNaN         1443          220  -84.75%
BenchmarkComplex128DivDisNaN         1481          218  -85.28%
BenchmarkComplex128DivNisInf          952          216  -77.31%
BenchmarkComplex128DivDisInf          861          231  -73.17%

The 386 version has a performance regression, but as we have
decided to use SSE2 instead of x87 FPU for 386 too (issue 3912),
I won't address this issue.

R=dsymonds, mchaten, iant, dave, mtj, rsc, r
CC=golang-dev
https://golang.org/cl/6024045
2012-08-07 23:45:50 +08:00
Dmitriy Vyukov
a54f920bfe runtime: move panic/defer/recover-related stuff to a separate file
Move panic/defer/recover-related stuff from proc.c/runtime.c to a new file panic.c.
No semantic changes.
proc.c is 1800+ LOC and is a bit difficult to work with.

R=golang-dev, dave, r
CC=golang-dev
https://golang.org/cl/6343071
2012-07-04 14:52:51 +04:00
Dmitriy Vyukov
ed516df4e4 runtime: add freemcache() function
It will be required for scheduler that maintains
GOMAXPROCS MCache's.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6350062
2012-07-01 13:10:01 +04:00
Jan Ziak
334bf95f9e runtime: update field types in preparation for GC changes
R=rsc, remyoudompheng, minux.ma, ality
CC=golang-dev
https://golang.org/cl/6242061
2012-05-30 13:07:52 -04:00
Alex Brainman
afe0e97aa6 runtime: handle windows exceptions, even in cgo programs
Fixes #3543.

R=golang-dev, kardianos, rsc
CC=golang-dev, hectorchu, vcc.163
https://golang.org/cl/6245063
2012-05-30 15:10:54 +10:00
Russ Cox
6dbaa206fb runtime: replace runtime·rnd function with ROUND macro
It's sad to introduce a new macro, but rnd shows up consistently
in profiles, and the function call overwhelms the two arithmetic
instructions it performs.

R=r
CC=golang-dev
https://golang.org/cl/6260051
2012-05-29 14:02:29 -04:00
Dmitriy Vyukov
01826280eb runtime: refactor helpgc functionality in preparation for parallel GC
Parallel GC needs to know in advance how many helper threads will be there.
Hopefully it's the last patch before I can tackle parallel sweep phase.
The benchmarks are unaffected.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6200064
2012-05-15 19:10:16 +04:00
Dmitriy Vyukov
95643647ae runtime: add parallel for algorithm
This is factored out part of:
https://golang.org/cl/5279048/
(parallel GC)

R=bsiegert, mpimenov, rsc, minux.ma, r
CC=golang-dev
https://golang.org/cl/5986054
2012-05-11 10:50:03 +04:00
Dmitriy Vyukov
a5dc7793c0 runtime: add lock-free stack
This is factored out part of the:
https://golang.org/cl/5279048/
(parallel GC)

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5993043
2012-04-12 11:49:25 +04:00
Dmitriy Vyukov
d839a809b2 runtime: make GC stats per-M
This is factored out part of:
https://golang.org/cl/5279048/
(Parallel GC)

benchmark                             old ns/op    new ns/op    delta
garbage.BenchmarkParser              3999106750   3975026500   -0.60%
garbage.BenchmarkParser-2            3720553750   3719196500   -0.04%
garbage.BenchmarkParser-4            3502857000   3474980500   -0.80%
garbage.BenchmarkParser-8            3375448000   3341310500   -1.01%
garbage.BenchmarkParserLastPause      329401000    324097000   -1.61%
garbage.BenchmarkParserLastPause-2    208953000    214222000   +2.52%
garbage.BenchmarkParserLastPause-4    110933000    111656000   +0.65%
garbage.BenchmarkParserLastPause-8     71969000     78230000   +8.70%
garbage.BenchmarkParserPause          230808842    197237400  -14.55%
garbage.BenchmarkParserPause-2        123674365    125197595   +1.23%
garbage.BenchmarkParserPause-4         80518525     85710333   +6.45%
garbage.BenchmarkParserPause-8         58310243     56940512   -2.35%
garbage.BenchmarkTree2                 31471700     31289400   -0.58%
garbage.BenchmarkTree2-2               21536800     21086300   -2.09%
garbage.BenchmarkTree2-4               11074700     10880000   -1.76%
garbage.BenchmarkTree2-8                7568600      7351400   -2.87%
garbage.BenchmarkTree2LastPause       314664000    312840000   -0.58%
garbage.BenchmarkTree2LastPause-2     215319000    210815000   -2.09%
garbage.BenchmarkTree2LastPause-4     110698000    108751000   -1.76%
garbage.BenchmarkTree2LastPause-8      75635000     73463000   -2.87%
garbage.BenchmarkTree2Pause           174280857    173147571   -0.65%
garbage.BenchmarkTree2Pause-2         131332714    129665761   -1.27%
garbage.BenchmarkTree2Pause-4          93803095     93422904   -0.41%
garbage.BenchmarkTree2Pause-8          86242333     85146761   -1.27%

R=rsc
CC=golang-dev
https://golang.org/cl/5987045
2012-04-05 20:48:28 +04:00
Dmitriy Vyukov
4667571619 runtime: add 64-bit atomics
This is factored out part of:
https://golang.org/cl/5279048/
(Parallel GC)

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5985047
2012-04-05 18:47:43 +04:00
Ian Lance Taylor
aabbcda816 runtime: remove unused runtime·signame and runtime·newError
R=golang-dev
CC=golang-dev
https://golang.org/cl/5756044
2012-03-06 09:07:00 -08:00
Russ Cox
6e2ae0a12c runtime/pprof: support OS X CPU profiling
Work around profiling kernel bug with signal masks.
Still broken on 64-bit Snow Leopard kernel,
but I think we can ignore that one and let people
upgrade to Lion.

Add new trivial tools addr2line and objdump to take
the place of the GNU tools of the same name, since
those are not installed on OS X.

Adapt pprof to invoke 'go tool addr2line' and
'go tool objdump' if the system tools do not exist.

Clean up disassembly of base register on amd64.

Fixes #2008.

R=golang-dev, bradfitz, mikioh.mikioh, r, iant
CC=golang-dev
https://golang.org/cl/5697066
2012-02-28 16:18:24 -05:00
Russ Cox
102274a30e runtime: size arena to fit in virtual address space limit
For Brad.
Now FreeBSD/386 binaries run on nearlyfreespeech.net.

Fixes #2302.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/5700060
2012-02-24 15:28:51 -05:00
Russ Cox
e4b02bfdc0 runtime: goroutine profile, stack dumps
R=golang-dev, r, r
CC=golang-dev
https://golang.org/cl/5687076
2012-02-22 21:45:01 -05:00
David Symonds
3d8ebefbbe runtime: Permit default behaviour of SIGTSTP, SIGTTIN, SIGTTOU.
Fixes #3037.

R=rsc, minux.ma, r, rsc
CC=golang-dev
https://golang.org/cl/5674072
2012-02-17 14:36:40 +11:00
Russ Cox
1707a9977f runtime: on 386, fix FP control word on all threads, not just initial thread
It is possible that Linux and Windows copy the FP control word
from the parent thread when creating a new thread.  Empirically,
Darwin does not.  Reset the FP control world in all cases.

Enable the floating-point strconv test.

Fixes #2917 (again).

R=golang-dev, r, iant
CC=golang-dev
https://golang.org/cl/5660047
2012-02-14 01:23:15 -05:00
Alex Brainman
07a2989d17 runtime, syscall, os/signal: fix windows build
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/5656048
2012-02-14 13:51:38 +11:00
Russ Cox
35586f718c os/signal: selective signal handling
Restore package os/signal, with new API:
Notify replaces Incoming, allowing clients
to ask for certain signals only.  Also, signals
go to everyone who asks, not just one client.

This could plausibly move into package os now
that there are no magic side effects as a result
of the import.

Update runtime for new API: move common Unix
signal handling code into signal_unix.c.
(It's so easy to do this now that we don't have
to edit Makefiles!)

Tested on darwin,linux 386,amd64.

Fixes #1266.

R=r, dsymonds, bradfitz, iant, borman
CC=golang-dev
https://golang.org/cl/3749041
2012-02-13 13:52:37 -05:00
Russ Cox
6a75ece01c runtime: delete Type and implementations (use reflect instead)
unsafe: delete Typeof, Reflect, Unreflect, New, NewArray

Part of issue 2955 and issue 2968.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/5650069
2012-02-12 23:26:20 -05:00
Russ Cox
5b93fc9da6 runtime, pprof: add profiling of thread creation
Same idea as heap profile: how did each thread get created?
Low memory (256 bytes per OS thread), high reward for
programs that suddenly have many threads running.

Fixes #1477.

R=golang-dev, r, dvyukov
CC=golang-dev
https://golang.org/cl/5639059
2012-02-08 10:33:54 -05:00
Damian Gryski
8e765da941 runtime: add runtime.cputicks() and seed fastrand with it
This patch adds a function to get the current cpu ticks.  This is
deemed to be 'sufficiently random' to use to seed fastrand to mitigate
the algorithmic complexity attacks on the hash table implementation.

On AMD64 we use the RDTSC instruction.  For 386, this instruction,
while valid, is not recognized by 8a so I've inserted the opcode by
hand.  For ARM, this routine is currently stubbed to return a constant
0 value.

Future work: update 8a to recognize RDTSC.

Fixes #2630.

R=rsc
CC=golang-dev
https://golang.org/cl/5606048
2012-02-02 14:09:27 -05:00
Russ Cox
408f0b1f74 gc, runtime: handle floating point map keys
Fixes #2609.

R=ken2
CC=golang-dev
https://golang.org/cl/5572069
2012-01-26 16:25:07 -05:00
Dmitriy Vyukov
1ff1405cc7 runtime: add type algorithms for zero-sized types
BenchmarkChanSem old=127ns new=78.6ns

R=golang-dev, bradfitz, sameer, rsc
CC=golang-dev
https://golang.org/cl/5558049
2012-01-20 10:32:55 +04:00
Shenghou Ma
fec7aa952f doc: update out-of-date comments about runtime/cgo
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5532100
2012-01-19 17:13:33 -05:00
Russ Cox
e83cd7f750 build: a round of fixes
TBR=r
CC=golang-dev
https://golang.org/cl/5503052
2011-12-20 17:54:40 -05:00
Russ Cox
851f30136d runtime: make more build-friendly
Collapse the arch,os-specific directories into the main directory
by renaming xxx/foo.c to foo_xxx.c, and so on.

There are no substantial edits here, except to the Makefile.
The assumption is that the Go tool will #define GOOS_darwin
and GOARCH_amd64 and will make any file named something
like signals_darwin.h available as signals_GOOS.h during the
build.  This replaces what used to be done with -I$(GOOS).

There is still work to be done to make runtime build with
standard tools, but this is a big step.  After this we will have
to write a script to generate all the generated files so they
can be checked in (instead of generated during the build).

R=r, iant, r, lucio.dere
CC=golang-dev
https://golang.org/cl/5490053
2011-12-16 15:33:58 -05:00
Russ Cox
196b663075 gc: implement == on structs and arrays
To allow these types as map keys, we must fill in
equal and hash functions in their algorithm tables.
Structs or arrays that are "just memory", like [2]int,
can and do continue to use the AMEM algorithm.
Structs or arrays that contain special values like
strings or interface values use generated functions
for both equal and hash.

The runtime helper func runtime.equal(t, x, y) bool handles
the general equality case for x == y and calls out to
the equal implementation in the algorithm table.

For short values (<= 4 struct fields or array elements),
the sequence of elementwise comparisons is inlined
instead of calling runtime.equal.

R=ken, mpimenov
CC=golang-dev
https://golang.org/cl/5451105
2011-12-12 22:22:09 -05:00
Russ Cox
b9ccd077dc runtime: prep for type-specific algorithms
Equality on structs will require arbitrary code for type equality,
so change algorithm in type data from uint8 to table pointer.
In the process, trim top-level map structure from
104/80 bytes (64-bit/32-bit) to 24/12.

Equality on structs will require being able to call code generated
by the Go compiler, and C code has no way to access Go return
values, so change the hash and equal algorithm functions to take
a pointer to a result instead of returning the result.

R=ken
CC=golang-dev
https://golang.org/cl/5453043
2011-12-05 09:40:22 -05:00
Russ Cox
3b860269ee runtime: add timer support, use for package time
This looks like it is just moving some code from
time to runtime (and translating it to C), but the
runtime can do a better job managing the goroutines,
and it needs this functionality for its own maintenance
(for example, for the garbage collector to hand back
unused memory to the OS on a time delay).
Might as well have just one copy of the timer logic,
and runtime can't depend on time, so vice versa.

It also unifies Sleep, NewTicker, and NewTimer behind
one mechanism, so that there are no claims that one
is more efficient than another.  (For example, today
people recommend using time.After instead of time.Sleep
to avoid blocking an OS thread.)

Fixes #1644.
Fixes #1731.
Fixes #2190.

R=golang-dev, r, hectorchu, iant, iant, jsing, alex.brainman, dvyukov
CC=golang-dev
https://golang.org/cl/5334051
2011-11-09 15:17:05 -05:00
Russ Cox
f437331f80 time: faster Nanoseconds call
runtime knows how to get the time of day
without allocating memory.

R=golang-dev, dsymonds, dave, hectorchu, r, cw
CC=golang-dev
https://golang.org/cl/5297078
2011-11-03 17:35:28 -04:00
Dmitriy Vyukov
ee24bfc058 runtime: unify mutex code across OSes
The change introduces 2 generic mutex implementations
(futex- and semaphore-based). Each OS chooses a suitable mutex
implementation and implements few callbacks (e.g. futex wait/wake).
The CL reduces code duplication, extends some optimizations available
only on Linux/Windows to other OSes and provides ground
for futher optimizations. Chan finalizers are finally eliminated.

(Linux/amd64, 8 HT cores)
benchmark                      old      new
BenchmarkChanContended         83.6     77.8 ns/op
BenchmarkChanContended-2       341      328 ns/op
BenchmarkChanContended-4       382      383 ns/op
BenchmarkChanContended-8       390      374 ns/op
BenchmarkChanContended-16      313      291 ns/op

(Darwin/amd64, 2 cores)
benchmark                      old      new
BenchmarkChanContended         159      172 ns/op
BenchmarkChanContended-2       6735     263 ns/op
BenchmarkChanContended-4       10384    255 ns/op
BenchmarkChanCreation          1174     407 ns/op
BenchmarkChanCreation-2        4007     254 ns/op
BenchmarkChanCreation-4        4029     246 ns/op

R=rsc, jsing, hectorchu
CC=golang-dev
https://golang.org/cl/5140043
2011-11-02 16:42:01 +03:00
Russ Cox
6808da0163 runtime: lock the main goroutine to the main OS thread during init
We only guarantee that the main goroutine runs on the
main OS thread for initialization.  Programs that wish to
preserve that property for main.main can call runtime.LockOSThread.
This is what programs used to do before we unleashed
goroutines during init, so it is both a simple fix and keeps
existing programs working.

R=iant, r, dave, dvyukov
CC=golang-dev
https://golang.org/cl/5309070
2011-10-27 18:04:12 -07:00
Dmitriy Vyukov
c14b2689f0 runtime: faster finalizers
Linux/amd64, 2 x Intel Xeon E5620, 8 HT cores, 2.40GHz
benchmark                    old ns/op    new ns/op    delta
BenchmarkFinalizer              420.00       261.00  -37.86%
BenchmarkFinalizer-2            985.00       201.00  -79.59%
BenchmarkFinalizer-4           1077.00       244.00  -77.34%
BenchmarkFinalizer-8           1155.00       180.00  -84.42%
BenchmarkFinalizer-16          1182.00       184.00  -84.43%

BenchmarkFinalizerRun          2128.00      1378.00  -35.24%
BenchmarkFinalizerRun-2        1655.00      1418.00  -14.32%
BenchmarkFinalizerRun-4        1634.00      1522.00   -6.85%
BenchmarkFinalizerRun-8        2213.00      1581.00  -28.56%
BenchmarkFinalizerRun-16       2424.00      1599.00  -34.03%

Darwin/amd64, Intel L9600, 2 cores, 2.13GHz
benchmark                    old ns/op    new ns/op    delta
BenchmarkChanCreation          1451.00       926.00  -36.18%
BenchmarkChanCreation-2        3124.00      1412.00  -54.80%
BenchmarkChanCreation-4        6121.00      2628.00  -57.07%

BenchmarkFinalizer              684.00       420.00  -38.60%
BenchmarkFinalizer-2          11195.00       398.00  -96.44%
BenchmarkFinalizer-4          15862.00       654.00  -95.88%

BenchmarkFinalizerRun          2025.00      1397.00  -31.01%
BenchmarkFinalizerRun-2        3920.00      1447.00  -63.09%
BenchmarkFinalizerRun-4        9471.00      1545.00  -83.69%

R=golang-dev, cw, rsc
CC=golang-dev
https://golang.org/cl/4963057
2011-10-06 18:42:51 +03:00
Russ Cox
d324f2143b runtime: parallelize garbage collector mark + sweep
Running test/garbage/parser.out.

On a 4-core Lenovo X201s (Linux):
31.12u 0.60s 31.74r 	 1 cpu, no atomics
32.27u 0.58s 32.86r 	 1 cpu, atomic instructions
33.04u 0.83s 27.47r 	 2 cpu

On a 16-core Xeon (Linux):
33.08u 0.65s 33.80r 	 1 cpu, no atomics
34.87u 1.12s 29.60r 	 2 cpu
36.00u 1.87s 28.43r 	 3 cpu
36.46u 2.34s 27.10r 	 4 cpu
38.28u 3.85s 26.92r 	 5 cpu
37.72u 5.25s 26.73r	 6 cpu
39.63u 7.11s 26.95r	 7 cpu
39.67u 8.10s 26.68r	 8 cpu

On a 2-core MacBook Pro Core 2 Duo 2.26 (circa 2009, MacBookPro5,5):
39.43u 1.45s 41.27r 	 1 cpu, no atomics
43.98u 2.95s 38.69r 	 2 cpu

On a 2-core Mac Mini Core 2 Duo 1.83 (circa 2008; Macmini2,1):
48.81u 2.12s 51.76r 	 1 cpu, no atomics
57.15u 4.72s 51.54r 	 2 cpu

The handoff algorithm is really only good for two cores.
Beyond that we will need to so something more sophisticated,
like have each core hand off to the next one, around a circle.
Even so, the code is a good checkpoint; for now we'll limit the
number of gc procs to at most 2.

R=dvyukov
CC=golang-dev
https://golang.org/cl/4641082
2011-09-30 09:40:01 -04:00
Hector Chu
9fd26872cb runtime: implement pprof support for windows
Credit to jp for proof of concept.

R=alex.brainman, jp, rsc, dvyukov
CC=golang-dev
https://golang.org/cl/4960057
2011-09-17 17:57:59 +10:00
Hector Chu
5c30325983 runtime: eliminate handle churn when churning channels on Windows
The Windows implementation of the net package churns through a couple of channels for every read/write operation.  This translates into a lot of time spent in the kernel creating and deleting event objects.

R=rsc, dvyukov, alex.brainman, jp
CC=golang-dev
https://golang.org/cl/4997044
2011-09-14 20:23:21 -04:00
Alex Brainman
7406379fff runtime: syscall to return both AX and DX for windows/386
Fixes #2181.

R=golang-dev, jp
CC=golang-dev
https://golang.org/cl/5000042
2011-09-14 16:19:45 +10:00
Alex Brainman
2a80882601 runtime: use cgo runtime functions to call windows syscalls
R=rsc
CC=golang-dev, jp, vcc.163
https://golang.org/cl/4926042
2011-08-27 23:17:00 +10:00
Russ Cox
33e9d24ad9 runtime: fix void warnings
Add -V flag to 6c command line to keep them fixed.

R=ken2
CC=golang-dev
https://golang.org/cl/4930046
2011-08-23 13:13:27 -04:00
Russ Cox
03e9ea5b74 runtime: simplify stack traces
Make the stack traces more readable for new
Go programmers while preserving their utility for old hands.

- Change status number [4] to string.
- Elide frames in runtime package (internal details).
- Swap file:line and arguments.
- Drop 'created by' for main goroutine.
- Show goroutines in order of allocation:
  implies main goroutine first if nothing else.

There is no option to get the extra frames back.
Uncomment 'return 1' at the bottom of symtab.c.

$ 6.out
throw: all goroutines are asleep - deadlock!

goroutine 1 [chan send]:
main.main()
       /Users/rsc/g/go/src/pkg/runtime/x.go:22 +0x8a

goroutine 2 [select (no cases)]:
main.sel()
       /Users/rsc/g/go/src/pkg/runtime/x.go:11 +0x18
created by main.main
       /Users/rsc/g/go/src/pkg/runtime/x.go:19 +0x23

goroutine 3 [chan receive]:
main.recv(0xf8400010a0, 0x0)
       /Users/rsc/g/go/src/pkg/runtime/x.go:15 +0x2e
created by main.main
       /Users/rsc/g/go/src/pkg/runtime/x.go:20 +0x50

goroutine 4 [chan receive (nil chan)]:
main.recv(0x0, 0x0)
       /Users/rsc/g/go/src/pkg/runtime/x.go:15 +0x2e
created by main.main
       /Users/rsc/g/go/src/pkg/runtime/x.go:21 +0x66
$

$ 6.out index
panic: runtime error: index out of range

goroutine 1 [running]:
main.main()
        /Users/rsc/g/go/src/pkg/runtime/x.go:25 +0xb9
$

$ 6.out nil
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x0 pc=0x22ca]

goroutine 1 [running]:
main.main()
        /Users/rsc/g/go/src/pkg/runtime/x.go:28 +0x211
$

$ 6.out panic
panic: panic

goroutine 1 [running]:
main.main()
        /Users/rsc/g/go/src/pkg/runtime/x.go:30 +0x101
$

R=golang-dev, qyzhai, n13m3y3r, r
CC=golang-dev
https://golang.org/cl/4907048
2011-08-22 23:26:39 -04:00
Alex Brainman
72e83483a7 runtime: speed up cgo calls
Allocate Defer on stack during cgo calls, as suggested
by dvyukov. Also includes some comment corrections.

benchmark                   old,ns/op   new,ns/op
BenchmarkCgoCall                  669         330
(Intel Xeon CPU 1.80GHz * 4, Linux 386)

R=dvyukov, rsc
CC=golang-dev
https://golang.org/cl/4910041
2011-08-18 12:17:09 -04:00
Russ Cox
3770b0e60c gc: implement nil chan support
The spec has defined nil chans this way for months.
I'm behind.

R=ken2
CC=golang-dev
https://golang.org/cl/4897050
2011-08-17 15:54:17 -04:00
Russ Cox
65bde087ae gc: implement nil map support
The spec has defined nil maps this way for months.
I'm behind.

R=ken2
CC=golang-dev
https://golang.org/cl/4901052
2011-08-17 14:56:27 -04:00
Dmitriy Vyukov
a2677cf363 runtime: fix GC bitmap corruption
The corruption can occur when GOMAXPROCS
is changed from >1 to 1, since GOMAXPROCS=1
does not imply there is only 1 goroutine running,
other goroutines can still be not parked after
the change.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4873050
2011-08-16 16:53:02 -04:00
Alex Brainman
9c774c3f26 runtime: correct seh installation during callbacks
Every time we enter callback from Windows, it is
possible that go exception handler is not at the top
of per-thread exception handlers chain. So it needs
to be installed again. At this moment this is done
by replacing top SEH frame with SEH frame as at time
of syscall for the time of callback. This is incorrect,
because, if exception strike, we won't be able to call
any exception handlers installed inside syscall,
because they are not in the chain. This changes
procedure to add new SEH frame on top of existing
chain instead.

I also removed m sehframe field, because I don't
think it is needed. We use single global exception
handler everywhere.

R=golang-dev, r
CC=golang-dev, hectorchu
https://golang.org/cl/4832060
2011-08-10 17:17:28 +10:00
Dmitriy Vyukov
54e9406ffb runtime: add more specialized type algorithms
The change adds specialized type algorithms
for slices and types of size 8/16/32/64/128.
It significantly accelerates chan and map operations
for most builtin types as well as user structs.

benchmark                   old,ns/op   new,ns/op
BenchmarkChanUncontended          226          94
(on Intel Xeon E5620, 2.4GHz, Linux 64 bit)

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4815087
2011-08-08 09:35:32 -04:00
Dmitriy Vyukov
d770aadee5 runtime: faster chan creation on Linux/FreeBSD/Plan9
The change removes chan finalizer (Lock destructor)
if it is not required on the platform.

benchmark                    old ns/op    new ns/op    delta
BenchmarkChanCreation          1132.00       381.00  -66.34%
BenchmarkChanCreation-2        1215.00       243.00  -80.00%
BenchmarkChanCreation-4        1084.00       186.00  -82.84%
BenchmarkChanCreation-8        1415.00       154.00  -89.12%
BenchmarkChanCreation-16       1386.00       144.00  -89.61%
(on 2 x Intel Xeon E5620, 8 HT cores, 2.4 GHz, Linux)

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4841041
2011-08-04 08:31:03 -04:00
Dmitriy Vyukov
a496c9eaa6 runtime: correct Note documentation
Reflect the fact that notesleep() can be called
by exactly one thread.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4816064
2011-08-03 15:51:55 -04:00
Dmitriy Vyukov
91f0f18100 runtime: fix data race in findfunc()
The data race can lead to reads of partially
initialized concurrently mutated symbol data.
The change also adds a simple sanity test
for Caller() and FuncForPC().

R=rsc
CC=golang-dev
https://golang.org/cl/4817058
2011-07-29 13:47:24 -04:00
Dmitriy Vyukov
4e5086b993 runtime: improve Linux mutex
The implementation is hybrid active/passive spin/blocking mutex.
The design minimizes amount of context switches and futex calls.
The idea is that all critical sections in runtime are intentially
small, so pure blocking mutex behaves badly causing
a lot of context switches, thread parking/unparking and kernel calls.
Note that some synthetic benchmarks become somewhat slower,
that's due to increased contention on other data structures,
it should not affect programs that do any real work.

On 2 x Intel E5620, 8 HT cores, 2.4GHz
benchmark                     old ns/op    new ns/op    delta
BenchmarkSelectContended         521.00       503.00   -3.45%
BenchmarkSelectContended-2       661.00       320.00  -51.59%
BenchmarkSelectContended-4      1139.00       629.00  -44.78%
BenchmarkSelectContended-8      2870.00       878.00  -69.41%
BenchmarkSelectContended-16     5276.00       818.00  -84.50%
BenchmarkChanContended           112.00       103.00   -8.04%
BenchmarkChanContended-2         631.00       174.00  -72.42%
BenchmarkChanContended-4         682.00       272.00  -60.12%
BenchmarkChanContended-8        1601.00       520.00  -67.52%
BenchmarkChanContended-16       3100.00       372.00  -88.00%
BenchmarkChanSync                253.00       239.00   -5.53%
BenchmarkChanSync-2             5030.00      4648.00   -7.59%
BenchmarkChanSync-4             4826.00      4694.00   -2.74%
BenchmarkChanSync-8             4778.00      4713.00   -1.36%
BenchmarkChanSync-16            5289.00      4710.00  -10.95%
BenchmarkChanProdCons0           273.00       254.00   -6.96%
BenchmarkChanProdCons0-2         599.00       400.00  -33.22%
BenchmarkChanProdCons0-4        1168.00       659.00  -43.58%
BenchmarkChanProdCons0-8        2831.00      1057.00  -62.66%
BenchmarkChanProdCons0-16       4197.00      1037.00  -75.29%
BenchmarkChanProdCons10          150.00       140.00   -6.67%
BenchmarkChanProdCons10-2        607.00       268.00  -55.85%
BenchmarkChanProdCons10-4       1137.00       404.00  -64.47%
BenchmarkChanProdCons10-8       2115.00       828.00  -60.85%
BenchmarkChanProdCons10-16      4283.00       855.00  -80.04%
BenchmarkChanProdCons100         117.00       110.00   -5.98%
BenchmarkChanProdCons100-2       558.00       218.00  -60.93%
BenchmarkChanProdCons100-4       722.00       287.00  -60.25%
BenchmarkChanProdCons100-8      1840.00       431.00  -76.58%
BenchmarkChanProdCons100-16     3394.00       448.00  -86.80%
BenchmarkChanProdConsWork0      2014.00      1996.00   -0.89%
BenchmarkChanProdConsWork0-2    1207.00      1127.00   -6.63%
BenchmarkChanProdConsWork0-4    1913.00       611.00  -68.06%
BenchmarkChanProdConsWork0-8    3016.00       949.00  -68.53%
BenchmarkChanProdConsWork0-16   4320.00      1154.00  -73.29%
BenchmarkChanProdConsWork10     1906.00      1897.00   -0.47%
BenchmarkChanProdConsWork10-2   1123.00      1033.00   -8.01%
BenchmarkChanProdConsWork10-4   1076.00       571.00  -46.93%
BenchmarkChanProdConsWork10-8   2748.00      1096.00  -60.12%
BenchmarkChanProdConsWork10-16  4600.00      1105.00  -75.98%
BenchmarkChanProdConsWork100    1884.00      1852.00   -1.70%
BenchmarkChanProdConsWork100-2  1235.00      1146.00   -7.21%
BenchmarkChanProdConsWork100-4  1217.00       619.00  -49.14%
BenchmarkChanProdConsWork100-8  1534.00       509.00  -66.82%
BenchmarkChanProdConsWork100-16 4126.00       918.00  -77.75%
BenchmarkSyscall                  34.40        33.30   -3.20%
BenchmarkSyscall-2               160.00       121.00  -24.38%
BenchmarkSyscall-4               131.00       136.00   +3.82%
BenchmarkSyscall-8               139.00       131.00   -5.76%
BenchmarkSyscall-16              161.00       168.00   +4.35%
BenchmarkSyscallWork             950.00       950.00   +0.00%
BenchmarkSyscallWork-2           481.00       480.00   -0.21%
BenchmarkSyscallWork-4           268.00       270.00   +0.75%
BenchmarkSyscallWork-8           156.00       169.00   +8.33%
BenchmarkSyscallWork-16          188.00       184.00   -2.13%
BenchmarkSemaSyntNonblock         36.40        35.60   -2.20%
BenchmarkSemaSyntNonblock-2       81.40        45.10  -44.59%
BenchmarkSemaSyntNonblock-4      126.00       108.00  -14.29%
BenchmarkSemaSyntNonblock-8      112.00       112.00   +0.00%
BenchmarkSemaSyntNonblock-16     110.00       112.00   +1.82%
BenchmarkSemaSyntBlock            35.30        35.30   +0.00%
BenchmarkSemaSyntBlock-2         118.00       124.00   +5.08%
BenchmarkSemaSyntBlock-4         105.00       108.00   +2.86%
BenchmarkSemaSyntBlock-8         101.00       111.00   +9.90%
BenchmarkSemaSyntBlock-16        112.00       118.00   +5.36%
BenchmarkSemaWorkNonblock        810.00       811.00   +0.12%
BenchmarkSemaWorkNonblock-2      476.00       414.00  -13.03%
BenchmarkSemaWorkNonblock-4      238.00       228.00   -4.20%
BenchmarkSemaWorkNonblock-8      140.00       126.00  -10.00%
BenchmarkSemaWorkNonblock-16     117.00       116.00   -0.85%
BenchmarkSemaWorkBlock           810.00       811.00   +0.12%
BenchmarkSemaWorkBlock-2         454.00       466.00   +2.64%
BenchmarkSemaWorkBlock-4         243.00       241.00   -0.82%
BenchmarkSemaWorkBlock-8         145.00       137.00   -5.52%
BenchmarkSemaWorkBlock-16        132.00       123.00   -6.82%
BenchmarkContendedSemaphore      123.00       102.00  -17.07%
BenchmarkContendedSemaphore-2     34.80        34.90   +0.29%
BenchmarkContendedSemaphore-4     34.70        34.80   +0.29%
BenchmarkContendedSemaphore-8     34.70        34.70   +0.00%
BenchmarkContendedSemaphore-16    34.80        34.70   -0.29%
BenchmarkMutex                    26.80        26.00   -2.99%
BenchmarkMutex-2                 108.00        45.20  -58.15%
BenchmarkMutex-4                 103.00       127.00  +23.30%
BenchmarkMutex-8                 109.00       147.00  +34.86%
BenchmarkMutex-16                102.00       152.00  +49.02%
BenchmarkMutexSlack               27.00        26.90   -0.37%
BenchmarkMutexSlack-2            149.00       165.00  +10.74%
BenchmarkMutexSlack-4            121.00       209.00  +72.73%
BenchmarkMutexSlack-8            101.00       158.00  +56.44%
BenchmarkMutexSlack-16            97.00       129.00  +32.99%
BenchmarkMutexWork               792.00       794.00   +0.25%
BenchmarkMutexWork-2             407.00       409.00   +0.49%
BenchmarkMutexWork-4             220.00       209.00   -5.00%
BenchmarkMutexWork-8             267.00       160.00  -40.07%
BenchmarkMutexWork-16            315.00       300.00   -4.76%
BenchmarkMutexWorkSlack          792.00       793.00   +0.13%
BenchmarkMutexWorkSlack-2        406.00       404.00   -0.49%
BenchmarkMutexWorkSlack-4        225.00       212.00   -5.78%
BenchmarkMutexWorkSlack-8        268.00       136.00  -49.25%
BenchmarkMutexWorkSlack-16       300.00       300.00   +0.00%
BenchmarkRWMutexWrite100          27.10        27.00   -0.37%
BenchmarkRWMutexWrite100-2        33.10        40.80  +23.26%
BenchmarkRWMutexWrite100-4       113.00        88.10  -22.04%
BenchmarkRWMutexWrite100-8       119.00        95.30  -19.92%
BenchmarkRWMutexWrite100-16      148.00       109.00  -26.35%
BenchmarkRWMutexWrite10           29.60        29.40   -0.68%
BenchmarkRWMutexWrite10-2        111.00        61.40  -44.68%
BenchmarkRWMutexWrite10-4        270.00       208.00  -22.96%
BenchmarkRWMutexWrite10-8        204.00       185.00   -9.31%
BenchmarkRWMutexWrite10-16       261.00       190.00  -27.20%
BenchmarkRWMutexWorkWrite100    1040.00      1036.00   -0.38%
BenchmarkRWMutexWorkWrite100-2   593.00       580.00   -2.19%
BenchmarkRWMutexWorkWrite100-4   470.00       365.00  -22.34%
BenchmarkRWMutexWorkWrite100-8   468.00       289.00  -38.25%
BenchmarkRWMutexWorkWrite100-16  604.00       374.00  -38.08%
BenchmarkRWMutexWorkWrite10      951.00       951.00   +0.00%
BenchmarkRWMutexWorkWrite10-2   1001.00       928.00   -7.29%
BenchmarkRWMutexWorkWrite10-4   1555.00      1006.00  -35.31%
BenchmarkRWMutexWorkWrite10-8   2085.00      1171.00  -43.84%
BenchmarkRWMutexWorkWrite10-16  2082.00      1614.00  -22.48%

R=rsc, iant, msolo, fw, iant
CC=golang-dev
https://golang.org/cl/4711045
2011-07-29 12:44:06 -04:00
Russ Cox
db9229def8 cgo: add GoBytes, fix gmp example
Fixes #1640.
Fixes #2007.

R=golang-dev, adg
CC=golang-dev
https://golang.org/cl/4815063
2011-07-28 12:39:50 -04:00
Dmitriy Vyukov
d6ed1b70ad runtime: replace centralized ncgocall counter with a distributed one
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4809042
2011-07-21 11:29:08 -04:00
Dmitriy Vyukov
86a659cad0 runtime: fix data race during Itab hash update/lookup
The data race is on newly published Itab nodes, which are
both unsafely published and unsafely acquired. It can
break on IA-32/Intel64 due to compiler optimizations
(most likely not an issue as of now) and on ARM due to
hardware memory access reorderings.

R=rsc
CC=golang-dev
https://golang.org/cl/4673055
2011-07-13 11:22:41 -07:00
Quan Yong Zhai
fe9991e8b2 runtime: replace runtime.mcpy with runtime.memmove
faster string operations, and more

tested on linux/386

runtime_test.BenchmarkSliceToString                    642          532  -17.13%
runtime_test.BenchmarkStringToSlice                    636          528  -16.98%
runtime_test.BenchmarkConcatString                    1109          897  -19.12%

R=r, iant, rsc
CC=golang-dev
https://golang.org/cl/4674042
2011-07-12 17:30:40 -07:00
Dmitriy Vyukov
c9152a8568 runtime: eliminate contention during stack allocation
Standard-sized stack frames use plain malloc/free
instead of centralized lock-protected FixAlloc.
Benchmark results on HP Z600 (2 x Xeon E5620, 8 HT cores, 2.40GHz)
are as follows:
benchmark                                        old ns/op    new ns/op    delta
BenchmarkStackGrowth                               1045.00       949.00   -9.19%
BenchmarkStackGrowth-2                             3450.00       800.00  -76.81%
BenchmarkStackGrowth-4                             5076.00       513.00  -89.89%
BenchmarkStackGrowth-8                             7805.00       471.00  -93.97%
BenchmarkStackGrowth-16                           11751.00       321.00  -97.27%

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4657091
2011-07-12 09:24:32 -07:00
Dmitriy Vyukov
013ad89c9b runtime: eliminate false sharing on runtime.goidgen
runtime.goidgen can be quite frequently modified and
shares cache line with the following variables,
it leads to false sharing.
50c6b0 b nfname
50c6b4 b nfunc
50c6b8 b nfunc$17
50c6bc b nhist$17
50c6c0 B runtime.checking
50c6c4 B runtime.gcwaiting
50c6c8 B runtime.goidgen
50c6cc B runtime.gomaxprocs
50c6d0 B runtime.panicking
50c6d4 B strconv.IntSize
50c6d8 B src/pkg/runtime/_xtest_.ss
50c6e0 B src/pkg/runtime/_xtest_.stop
50c6e8 b addrfree
50c6f0 b addrmem
50c6f8 b argv

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4673054
2011-07-12 01:25:14 -04:00
Dmitriy Vyukov
909f31872a runtime: eliminate false sharing on random number generators
Use machine-local random number generator instead of
racy global ones.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4674049
2011-07-12 01:23:58 -04:00
Wei Guangjing
f83609f642 runtime: windows/amd64 port
R=rsc, alex.brainman, hectorchu, r
CC=golang-dev
https://golang.org/cl/3759042
2011-06-29 17:37:56 +10:00
Dmitriy Vyukov
997c00f991 runtime: replace Semacquire/Semrelease implementation
1. The implementation uses distributed hash table of waitlists instead of a centralized one.
  It significantly improves scalability for uncontended semaphores.
2. The implementation provides wait-free fast-path for signalers.
3. The implementation uses less locks (1 lock/unlock instead of 5 for Semacquire).
4. runtime·ready() call is moved out of critical section.
5. Semacquire() does not call semwake().
Benchmark results on HP Z600 (2 x Xeon E5620, 8 HT cores, 2.40GHz)
are as follows:
benchmark                                        old ns/op    new ns/op    delta
runtime_test.BenchmarkSemaUncontended                58.20        36.30  -37.63%
runtime_test.BenchmarkSemaUncontended-2             199.00        18.30  -90.80%
runtime_test.BenchmarkSemaUncontended-4             327.00         9.20  -97.19%
runtime_test.BenchmarkSemaUncontended-8             491.00         5.32  -98.92%
runtime_test.BenchmarkSemaUncontended-16            946.00         4.18  -99.56%

runtime_test.BenchmarkSemaSyntNonblock               59.00        36.80  -37.63%
runtime_test.BenchmarkSemaSyntNonblock-2            167.00       138.00  -17.37%
runtime_test.BenchmarkSemaSyntNonblock-4            333.00       129.00  -61.26%
runtime_test.BenchmarkSemaSyntNonblock-8            464.00       130.00  -71.98%
runtime_test.BenchmarkSemaSyntNonblock-16          1015.00       136.00  -86.60%

runtime_test.BenchmarkSemaSyntBlock                  58.80        36.70  -37.59%
runtime_test.BenchmarkSemaSyntBlock-2               294.00       149.00  -49.32%
runtime_test.BenchmarkSemaSyntBlock-4               333.00       177.00  -46.85%
runtime_test.BenchmarkSemaSyntBlock-8               471.00       221.00  -53.08%
runtime_test.BenchmarkSemaSyntBlock-16              990.00       227.00  -77.07%

runtime_test.BenchmarkSemaWorkNonblock              829.00       832.00   +0.36%
runtime_test.BenchmarkSemaWorkNonblock-2            425.00       419.00   -1.41%
runtime_test.BenchmarkSemaWorkNonblock-4            308.00       220.00  -28.57%
runtime_test.BenchmarkSemaWorkNonblock-8            394.00       147.00  -62.69%
runtime_test.BenchmarkSemaWorkNonblock-16          1510.00       149.00  -90.13%

runtime_test.BenchmarkSemaWorkBlock                 828.00       813.00   -1.81%
runtime_test.BenchmarkSemaWorkBlock-2               428.00       436.00   +1.87%
runtime_test.BenchmarkSemaWorkBlock-4               232.00       219.00   -5.60%
runtime_test.BenchmarkSemaWorkBlock-8               392.00       251.00  -35.97%
runtime_test.BenchmarkSemaWorkBlock-16             1524.00       298.00  -80.45%

sync_test.BenchmarkMutexUncontended                  24.10        24.00   -0.41%
sync_test.BenchmarkMutexUncontended-2                12.00        12.00   +0.00%
sync_test.BenchmarkMutexUncontended-4                 6.25         6.17   -1.28%
sync_test.BenchmarkMutexUncontended-8                 3.43         3.34   -2.62%
sync_test.BenchmarkMutexUncontended-16                2.34         2.32   -0.85%

sync_test.BenchmarkMutex                             24.70        24.70   +0.00%
sync_test.BenchmarkMutex-2                          208.00        99.50  -52.16%
sync_test.BenchmarkMutex-4                         2744.00       256.00  -90.67%
sync_test.BenchmarkMutex-8                         5137.00       556.00  -89.18%
sync_test.BenchmarkMutex-16                        5368.00      1284.00  -76.08%

sync_test.BenchmarkMutexSlack                        24.70        25.00   +1.21%
sync_test.BenchmarkMutexSlack-2                    1094.00       186.00  -83.00%
sync_test.BenchmarkMutexSlack-4                    3430.00       402.00  -88.28%
sync_test.BenchmarkMutexSlack-8                    5051.00      1066.00  -78.90%
sync_test.BenchmarkMutexSlack-16                   6806.00      1363.00  -79.97%

sync_test.BenchmarkMutexWork                        793.00       792.00   -0.13%
sync_test.BenchmarkMutexWork-2                      398.00       398.00   +0.00%
sync_test.BenchmarkMutexWork-4                     1441.00       308.00  -78.63%
sync_test.BenchmarkMutexWork-8                     8532.00       847.00  -90.07%
sync_test.BenchmarkMutexWork-16                    8225.00      2760.00  -66.44%

sync_test.BenchmarkMutexWorkSlack                   793.00       793.00   +0.00%
sync_test.BenchmarkMutexWorkSlack-2                 418.00       414.00   -0.96%
sync_test.BenchmarkMutexWorkSlack-4                4481.00       480.00  -89.29%
sync_test.BenchmarkMutexWorkSlack-8                6317.00      1598.00  -74.70%
sync_test.BenchmarkMutexWorkSlack-16               9111.00      3038.00  -66.66%

R=rsc
CC=golang-dev
https://golang.org/cl/4631059
2011-06-28 15:09:53 -04:00
Jonathan Mark
ddde52ae56 runtime: SysMap uses MAP_FIXED if needed on 64-bit Linux
This change was adapted from gccgo's libgo/runtime/mem.c at
Ian Taylor's suggestion.  It fixes all.bash failing with
"address space conflict: map() =" on amd64 Linux with kernel
version 2.6.32.8-grsec-2.1.14-modsign-xeon-64.
With this change, SysMap will use MAP_FIXED to allocate its desired
address space, after first calling mincore to check that there is
nothing else mapped there.

R=iant, dave, n13m3y3r, rsc
CC=golang-dev
https://golang.org/cl/4438091
2011-06-07 21:50:10 -07:00
Robert Hencke
3fbd478a8a pkg: spelling tweaks, I-Z
also, a few miscellaneous fixes to files outside pkg

R=golang-dev, dsymonds, mikioh.mikioh, r
CC=golang-dev
https://golang.org/cl/4517116
2011-05-30 18:02:59 +10:00
Alexey Borzenkov
b701cf3332 runtime: make StackSystem part of StackGuard
Fixes #1779

R=rsc
CC=golang-dev
https://golang.org/cl/4543052
2011-05-16 16:57:49 -04:00
Russ Cox
370276a3e5 runtime: stack split + garbage collection bug
The g->sched.sp saved stack pointer and the
g->stackbase and g->stackguard stack bounds
can change even while "the world is stopped",
because a goroutine has to call functions (and
therefore might split its stack) when exiting a
system call to check whether the world is stopped
(and if so, wait until the world continues).

That means the garbage collector cannot access
those values safely (without a race) for goroutines
executing system calls.  Instead, save a consistent
triple in g->gcsp, g->gcstack, g->gcguard during
entersyscall and have the garbage collector refer
to those.

The old code was occasionally seeing (because of
the race) an sp and stk that did not correspond to
each other, so that stk - sp was not the number of
stack bytes following sp.  In that case, if sp < stk
then the call scanblock(sp, stk - sp) scanned too
many bytes (anything between the two pointers,
which pointed into different allocation blocks).
If sp > stk then stk - sp wrapped around.
On 32-bit, stk - sp is a uintptr (uint32) converted
to int64 in the call to scanblock, so a large (~4G)
but positive number.  Scanblock would try to scan
that many bytes and eventually fault accessing
unmapped memory.  On 64-bit, stk - sp is a uintptr (uint64)
promoted to int64 in the call to scanblock, so a negative
number.  Scanblock would not scan anything, possibly
causing in-use blocks to be freed.

In short, 32-bit platforms would have seen either
ineffective garbage collection or crashes during garbage
collection, while 64-bit platforms would have seen
either ineffective or incorrect garbage collection.
You can see the invalid arguments to scanblock in the
stack traces in issue 1620.

Fixes #1620.
Fixes #1746.

R=iant, r
CC=golang-dev
https://golang.org/cl/4437075
2011-04-27 23:21:12 -04:00
Russ Cox
40fccbce6b reflect: more efficient; cannot Set result of NewValue anymore
* Reduces malloc counts during gob encoder/decoder test from 6/6 to 3/5.

The current reflect uses Set to mean two subtly different things.

(1) If you have a reflect.Value v, it might just represent
itself (as in v = reflect.NewValue(42)), in which case calling
v.Set only changed v, not any other data in the program.

(2) If you have a reflect Value v derived from a pointer
or a slice (as in x := []int{42}; v = reflect.NewValue(x).Index(0)),
v represents the value held there.  Changing x[0] affects the
value returned by v.Int(), and calling v.Set affects x[0].

This was not really by design; it just happened that way.

The motivation for the new reflect implementation was
to remove mallocs.  The use case (1) has an implicit malloc
inside it.  If you can do:

       v := reflect.NewValue(0)
       v.Set(42)
       i := v.Int()  // i = 42

then that implies that v is referring to some underlying
chunk of memory in order to remember the 42; that is,
NewValue must have allocated some memory.

Almost all the time you are using reflect the goal is to
inspect or to change other data, not to manipulate data
stored solely inside a reflect.Value.

This CL removes use case (1), so that an assignable
reflect.Value must always refer to some other piece of data
in the program.  Put another way, removing this case would
make

       v := reflect.NewValue(0)
       v.Set(42)

as illegal as

       0 = 42.

It would also make this illegal:

       x := 0
       v := reflect.NewValue(x)
       v.Set(42)

for the same reason.  (Note that right now, v.Set(42) "succeeds"
but does not change the value of x.)

If you really wanted to make v refer to x, you'd start with &x
and dereference it:

       x := 0
       v := reflect.NewValue(&x).Elem()  // v = *&x
       v.Set(42)

It's pretty rare, except in tests, to want to use NewValue and then
call Set to change the Value itself instead of some other piece of
data in the program.  I haven't seen it happen once yet while
making the tree build with this change.

For the same reasons, reflect.Zero (formerly reflect.MakeZero)
would also return an unassignable, unaddressable value.
This invalidates the (awkward) idiom:

       pv := ... some Ptr Value we have ...
       v := reflect.Zero(pv.Type().Elem())
       pv.PointTo(v)

which, when the API changed, turned into:

       pv := ... some Ptr Value we have ...
       v := reflect.Zero(pv.Type().Elem())
       pv.Set(v.Addr())

In both, it is far from clear what the code is trying to do.  Now that
it is possible, this CL adds reflect.New(Type) Value that does the
obvious thing (same as Go's new), so this code would be replaced by:

       pv := ... some Ptr Value we have ...
       pv.Set(reflect.New(pv.Type().Elem()))

The changes just described can be confusing to think about,
but I believe it is because the old API was confusing - it was
conflating two different kinds of Values - and that the new API
by itself is pretty simple: you can only Set (or call Addr on)
a Value if it actually addresses some real piece of data; that is,
only if it is the result of dereferencing a Ptr or indexing a Slice.

If you really want the old behavior, you'd get it by translating:

       v := reflect.NewValue(x)

into

       v := reflect.New(reflect.Typeof(x)).Elem()
       v.Set(reflect.NewValue(x))

Gofix will not be able to help with this, because whether
and how to change the code depends on whether the original
code meant use (1) or use (2), so the developer has to read
and think about the code.

You can see the effect on packages in the tree in
https://golang.org/cl/4423043/.

R=r
CC=golang-dev
https://golang.org/cl/4435042
2011-04-18 14:35:33 -04:00
Russ Cox
c19b373c8a runtime: cpu profiling support
R=r
CC=golang-dev
https://golang.org/cl/4306043
2011-03-23 11:43:37 -04:00
Russ Cox
8bf34e3356 gc, runtime: replace closed(c) with x, ok := <-c
R=ken2, ken3
CC=golang-dev
https://golang.org/cl/4259064
2011-03-11 14:47:26 -05:00
Russ Cox
f9ca3b5d5b runtime: scheduler, cgo reorganization
* Change use of m->g0 stack (aka scheduler stack).
* Provide runtime.mcall(f) to invoke f() on m->g0 stack.
* Replace scheduler loop entry with runtime.mcall(schedule).

Runtime.mcall eliminates the need for fake scheduler states that
exist just to run a bit of code on the m->g0 stack
(Grecovery, Gstackalloc).

The elimination of the scheduler as a loop that stops and
starts using gosave and gogo fixes a bad interaction with the
way cgo uses the m->g0 stack.  Cgo runs external (gcc-compiled)
C functions on that stack, and then when calling back into Go,
it sets m->g0->sched.sp below the added call frames, so that
other uses of m->g0's stack will not interfere with those frames.
Unfortunately, gogo (longjmp) back to the scheduler loop at
this point would end up running scheduler with the lower
sp, which no longer points at a valid stack frame for
a call to scheduler.  If scheduler then wrote any function call
arguments or local variables to where it expected the stack
frame to be, it would overwrite other data on the stack.
I realized this possibility while debugging a problem with
calling complex Go code in a Go -> C -> Go cgo callback.
This wasn't the bug I was looking for, it turns out, but I believe
it is a real bug nonetheless.  Switching to runtime.mcall, which
only adds new frames to the stack and never jumps into
functions running in existing ones, fixes this bug.

* Move cgo-related code out of proc.c into cgocall.c.
* Add very large comment describing cgo call sequences.
* Simpilify, regularize cgo function implementations and names.
* Add test suite as misc/cgo/test.

Now the Go -> C path calls cgocall, which calls asmcgocall,
and the C -> Go path calls cgocallback, which calls cgocallbackg.

The shuffling, which affects mainly the callback case, moves
most of the callback implementation to cgocallback running
on the m->curg stack (not the m->g0 scheduler stack) and
only while accounted for with $GOMAXPROCS (between calls
to exitsyscall and entersyscall).

The previous callback code did not block in startcgocallback's
approximation to exitsyscall, so if, say, the garbage collector
were running, it would still barge in and start doing things
like call malloc.  Similarly endcgocallback's approximation of
entersyscall did not call matchmg to kick off new OS threads
when necessary, which caused the bug in issue 1560.

Fixes #1560.

R=iant
CC=golang-dev
https://golang.org/cl/4253054
2011-03-07 10:37:42 -05:00
Russ Cox
324cc3d040 runtime: record goroutine creation pc and display in traceback
package main

func main() {
        go func() { *(*int)(nil) = 0 }()
        select{}
}

panic: runtime error: invalid memory address or nil pointer dereference

[signal 0xb code=0x1 addr=0x0 pc=0x1c96]

runtime.panic+0xac /Users/rsc/g/go/src/pkg/runtime/proc.c:1083
        runtime.panic(0x11bf0, 0xf8400011f0)
runtime.panicstring+0xa3 /Users/rsc/g/go/src/pkg/runtime/runtime.c:116
        runtime.panicstring(0x29a57, 0x0)
runtime.sigpanic+0x144 /Users/rsc/g/go/src/pkg/runtime/darwin/thread.c:470
        runtime.sigpanic()
main._func_001+0x16 /Users/rsc/g/go/src/pkg/runtime/x.go:188
        main._func_001()
runtime.goexit /Users/rsc/g/go/src/pkg/runtime/proc.c:150
        runtime.goexit()
----- goroutine created by -----
main.main+0x3d /Users/rsc/g/go/src/pkg/runtime/x.go:4

goroutine 1 [4]:
runtime.gosched+0x77 /Users/rsc/g/go/src/pkg/runtime/proc.c:598
        runtime.gosched()
runtime.block+0x27 /Users/rsc/g/go/src/pkg/runtime/chan.c:680
        runtime.block()
main.main+0x44 /Users/rsc/g/go/src/pkg/runtime/x.go:5
        main.main()
runtime.mainstart+0xf /Users/rsc/g/go/src/pkg/runtime/amd64/asm.s:77
        runtime.mainstart()
runtime.goexit /Users/rsc/g/go/src/pkg/runtime/proc.c:150
        runtime.goexit()
----- goroutine created by -----
_rt0_amd64+0x8e /Users/rsc/g/go/src/pkg/runtime/amd64/asm.s:64

Fixes #1563.

R=r
CC=golang-dev
https://golang.org/cl/4243046
2011-03-02 13:42:02 -05:00
Russ Cox
582fd17e11 runtime: idle goroutine
This functionality might be used in environments
where programs are limited to a single thread,
to simulate a select-driven network server.  It is
not exposed via the standard runtime API.

R=r, r2
CC=golang-dev
https://golang.org/cl/4254041
2011-02-27 23:32:42 -05:00
Russ Cox
b5dfac45ba runtime: always run stackalloc on scheduler stack
Avoids deadlocks like the one below, in which a stack split happened
in order to call lock(&stacks), but then the stack unsplit cannot run
because stacks is now locked.

The only code calling stackalloc that wasn't on a scheduler
stack already was malg, which creates a new goroutine.

runtime.futex+0x23 /home/rsc/g/go/src/pkg/runtime/linux/amd64/sys.s:139
       runtime.futex()
futexsleep+0x50 /home/rsc/g/go/src/pkg/runtime/linux/thread.c:51
       futexsleep(0x5b0188, 0x300000003, 0x100020000, 0x4159e2)
futexlock+0x85 /home/rsc/g/go/src/pkg/runtime/linux/thread.c:119
       futexlock(0x5b0188, 0x5b0188)
runtime.lock+0x56 /home/rsc/g/go/src/pkg/runtime/linux/thread.c:158
       runtime.lock(0x5b0188, 0x7f0d27b4a000)
runtime.stackfree+0x4d /home/rsc/g/go/src/pkg/runtime/malloc.goc:336
       runtime.stackfree(0x7f0d27b4a000, 0x1000, 0x8, 0x7fff37e1e218)
runtime.oldstack+0xa6 /home/rsc/g/go/src/pkg/runtime/proc.c:705
       runtime.oldstack()
runtime.lessstack+0x22 /home/rsc/g/go/src/pkg/runtime/amd64/asm.s:224
       runtime.lessstack()
----- lessstack called from goroutine 2 -----
runtime.lock+0x56 /home/rsc/g/go/src/pkg/runtime/linux/thread.c:158
       runtime.lock(0x5b0188, 0x40a5e2)
runtime.stackalloc+0x55 /home/rsc/g/go/src/pkg/runtime/malloc.c:316
       runtime.stackalloc(0x1000, 0x4055b0)
runtime.malg+0x3d /home/rsc/g/go/src/pkg/runtime/proc.c:803
       runtime.malg(0x1000, 0x40add9)
runtime.newproc1+0x12b /home/rsc/g/go/src/pkg/runtime/proc.c:854
       runtime.newproc1(0xf840027440, 0x7f0d27b49230, 0x0, 0x49f238, 0x40, ...)
runtime.newproc+0x2f /home/rsc/g/go/src/pkg/runtime/proc.c:831
       runtime.newproc(0x0, 0xf840027440, 0xf800000010, 0x44b059)
...

R=r, r2
CC=golang-dev
https://golang.org/cl/4216045
2011-02-23 15:51:20 -05:00
Russ Cox
d9fd11443c ld: detect stack overflow due to NOSPLIT
Fix problems found.

On amd64, various library routines had bigger
stack frames than expected, because large function
calls had been added.

runtime.assertI2T: nosplit stack overflow
        120	assumed on entry to runtime.assertI2T
        8	after runtime.assertI2T uses 112
        0	on entry to runtime.newTypeAssertionError
        -8	on entry to runtime.morestack01

runtime.assertE2E: nosplit stack overflow
        120	assumed on entry to runtime.assertE2E
        16	after runtime.assertE2E uses 104
        8	on entry to runtime.panic
        0	on entry to runtime.morestack16
        -8	after runtime.morestack16 uses 8

runtime.assertE2T: nosplit stack overflow
        120	assumed on entry to runtime.assertE2T
        16	after runtime.assertE2T uses 104
        8	on entry to runtime.panic
        0	on entry to runtime.morestack16
        -8	after runtime.morestack16 uses 8

runtime.newselect: nosplit stack overflow
        120	assumed on entry to runtime.newselect
        56	after runtime.newselect uses 64
        48	on entry to runtime.printf
        8	after runtime.printf uses 40
        0	on entry to vprintf
        -8	on entry to runtime.morestack16

runtime.selectdefault: nosplit stack overflow
        120	assumed on entry to runtime.selectdefault
        56	after runtime.selectdefault uses 64
        48	on entry to runtime.printf
        8	after runtime.printf uses 40
        0	on entry to vprintf
        -8	on entry to runtime.morestack16

runtime.selectgo: nosplit stack overflow
        120	assumed on entry to runtime.selectgo
        0	after runtime.selectgo uses 120
        -8	on entry to runtime.gosched

On arm, 5c was tagging functions NOSPLIT that should
not have been, like the recursive function printpanics:

printpanics: nosplit stack overflow
        124	assumed on entry to printpanics
        112	after printpanics uses 12
        108	on entry to printpanics
        96	after printpanics uses 12
        92	on entry to printpanics
        80	after printpanics uses 12
        76	on entry to printpanics
        64	after printpanics uses 12
        60	on entry to printpanics
        48	after printpanics uses 12
        44	on entry to printpanics
        32	after printpanics uses 12
        28	on entry to printpanics
        16	after printpanics uses 12
        12	on entry to printpanics
        0	after printpanics uses 12
        -4	on entry to printpanics

R=r, r2
CC=golang-dev
https://golang.org/cl/4188061
2011-02-22 17:40:40 -05:00
Russ Cox
6779350349 runtime: minor cleanup
implement runtime.casp on amd64.
keep simultaneous panic messages separate.

R=r
CC=golang-dev
https://golang.org/cl/4188053
2011-02-16 13:21:13 -05:00
Hector Chu
239ef63bf2 runtime: take the callback return value from the stack
R=brainman, lxn, rsc
CC=golang-dev
https://golang.org/cl/4126056
2011-02-10 23:02:27 +11:00
Russ Cox
b287d7cbe1 runtime: more detailed panic traces, line number work
Follow morestack, so that crashes during a stack split
give complete traces.  Also mark stack segment boundaries
as an aid to debugging.

Correct various line number bugs with yet another attempt
at interpreting the pc/ln table.  This one has a chance at
being correct, because I based it on reading src/cmd/ld/lib.c
instead of on reading the documentation.

Fixes #1138.
Fixes #1430.
Fixes #1461.

throw: runtime: split stack overflow

runtime.throw+0x3e /home/rsc/g/go2/src/pkg/runtime/runtime.c:78
        runtime.throw(0x81880af, 0xf75c8b18)
runtime.newstack+0xad /home/rsc/g/go2/src/pkg/runtime/proc.c:728
        runtime.newstack()
runtime.morestack+0x4f /home/rsc/g/go2/src/pkg/runtime/386/asm.s:184
        runtime.morestack()
----- morestack called from stack: -----
runtime.new+0x1a /home/rsc/g/go2/src/pkg/runtime/malloc.c:288
        runtime.new(0x1, 0x0, 0x0)
gongo.makeBoard+0x33 /tmp/Gongo/gongo_robot_test.go:344
        gongo.makeBoard(0x809d238, 0x1, 0xf76092c8, 0x1)
----- stack segment boundary -----
gongo.checkEasyScore+0xcc /tmp/Gongo/gongo_robot_test.go:287
        gongo.checkEasyScore(0xf764b710, 0x0, 0x809d238, 0x1)
gongo.TestEasyScore+0x8c /tmp/Gongo/gongo_robot_test.go:255
        gongo.TestEasyScore(0xf764b710, 0x818a990)
testing.tRunner+0x2f /home/rsc/g/go2/src/pkg/testing/testing.go:132
        testing.tRunner(0xf764b710, 0xf763b5dc, 0x0)
runtime.goexit /home/rsc/g/go2/src/pkg/runtime/proc.c:149
        runtime.goexit()

R=ken2, r
CC=golang-dev
https://golang.org/cl/4000053
2011-02-02 16:44:20 -05:00
Hector Chu
62afa225af windows: multiple improvements and cleanups
The callback mechanism has been made more flexible.
Eliminated one round of argument copying in Syscall.
Faster Get/SetLastError implemented.
Added gettime for gc perf profiling.

R=rsc, brainman, mattn, rog
CC=golang-dev
https://golang.org/cl/4058046
2011-02-01 11:49:24 -05:00
Luuk van Dijk
7400be87d8 runtime: generate Go defs for C types.
R=rsc, mattn
CC=golang-dev
https://golang.org/cl/4047047
2011-01-31 12:27:28 +01:00
Russ Cox
4608feb18b runtime: simpler heap map, memory allocation
The old heap maps used a multilevel table, but that
was overkill: there are only 1M entries on a 32-bit
machine and we can arrange to use a dense address
range on a 64-bit machine.

The heap map is in bss.  The assumption is that if
we don't touch the pages they won't be mapped in.

Also moved some duplicated memory allocation
code out of the OS-specific files.

R=r
CC=golang-dev
https://golang.org/cl/4118042
2011-01-28 15:03:26 -05:00
Russ Cox
afc6928ad9 runtime: prefer fixed stack allocator over general memory allocator
* move stack constants from proc.c to runtime.h
  * make memclr take uintptr length

R=r
CC=golang-dev
https://golang.org/cl/3985046
2011-01-25 16:35:36 -05:00
Russ Cox
b0543ddd8a gc, runtime: make range on channel safe for multiple goroutines
Fixes #397.

R=ken2
CC=golang-dev
https://golang.org/cl/3994043
2011-01-18 15:59:19 -05:00
Russ Cox
12307008e9 runtime: print signal information during panic
$ 6.out
panic: runtime error: invalid memory address or nil pointer dereference

[signal 11 code=0x1 addr=0x0 pc=0x1c16]

runtime.panic+0xa7 /Users/rsc/g/go/src/pkg/runtime/proc.c:1089
	runtime.panic(0xf6c8, 0x25c010)
runtime.panicstring+0x69 /Users/rsc/g/go/src/pkg/runtime/runtime.c:88
	runtime.panicstring(0x24814, 0x0)
runtime.sigpanic+0x144 /Users/rsc/g/go/src/pkg/runtime/darwin/thread.c:465
	runtime.sigpanic()
main.f+0x16 /Users/rsc/x.go:5
	main.f()
main.main+0x1c /Users/rsc/x.go:9
	main.main()
runtime.mainstart+0xf /Users/rsc/g/go/src/pkg/runtime/amd64/asm.s:77
	runtime.mainstart()
runtime.goexit /Users/rsc/g/go/src/pkg/runtime/proc.c:149
	runtime.goexit()

R=r
CC=golang-dev
https://golang.org/cl/4036042
2011-01-18 14:15:11 -05:00
Russ Cox
141a4a1759 runtime: fix arm reflect.call boundary case
The fault was lucky: when it wasn't faulting it was silently
copying a word from some other block and later putting
that same word back.  If some other goroutine had changed
that word of memory in the interim, too bad.

The ARM code was inconsistent about whether the
"argument frame" included the saved LR.  Including it made
some things more regular but mostly just caused confusion
in the places where the regularity broke.  Now the rule
reflects reality: argp is always a pointer to arguments,
never a saved link register.

Renamed struct fields to make meaning clearer.

Running ARM in QEMU, package time's gotest:
  * before: 27/58 failed
  * after: 0/50

R=r, r2
CC=golang-dev
https://golang.org/cl/3993041
2011-01-14 14:05:20 -05:00
Alex Brainman
a41d85498e runtime: revert 6974:1f3c3696babb
I missed that environment is used during runtime setup,
well before go init() functions run. Implemented os-dependent
runtime.goenvs functions to allow for different unix, plan9 and
windows versions of environment discovery.

R=rsc, paulzhol
CC=golang-dev
https://golang.org/cl/3787046
2011-01-12 11:48:15 +11:00
Alex Brainman
c83451971e runtime: move windows goargs implementation from runtime and into os package
R=rsc
CC=golang-dev
https://golang.org/cl/3702041
2010-12-16 12:18:18 +11:00
Russ Cox
d110ae8dd0 runtime: write only to standard error
Will mail a warning to golang-nuts once this is submitted.

R=r, niemeyer
CC=golang-dev
https://golang.org/cl/3573043
2010-12-14 11:52:42 -05:00
Russ Cox
dc9a3b2791 gc: align structs according to max alignment of fields
cc: same
runtime: test cc alignment (required moving #define of offsetof to runtime.h)
fix bug260

Fixes #482.
Fixes #609.

R=ken2, r
CC=golang-dev
https://golang.org/cl/3563042
2010-12-13 16:22:19 -05:00
Ken Thompson
ae60526848 arm floating point simulation
R=rsc
CC=golang-dev
https://golang.org/cl/3565041
2010-12-09 14:45:27 -08:00
Russ Cox
9042c2ce68 runtime/cgo: runtime changes for new cgo
Formerly known as libcgo.
Almost no code here is changing; the diffs
are shown relative to the originals in libcgo.

R=r
CC=golang-dev
https://golang.org/cl/3420043
2010-12-08 13:53:30 -05:00
Russ Cox
68b4255a96 runtime: ,s/[a-zA-Z0-9_]+/runtime·&/g, almost
Prefix all external symbols in runtime by runtime·,
to avoid conflicts with possible symbols of the same
name in linked-in C libraries.  The obvious conflicts
are printf, malloc, and free, but hide everything to
avoid future pain.

The symbols left alone are:

	** known to cgo **
	_cgo_free
	_cgo_malloc
	libcgo_thread_start
	initcgo
	ncgocall

	** known to linker **
	_rt0_$GOARCH
	_rt0_$GOARCH_$GOOS
	text
	etext
	data
	end
	pclntab
	epclntab
	symtab
	esymtab

	** known to C compiler **
	_divv
	_modv
	_div64by32
	etc (arch specific)

Tested on darwin/386, darwin/amd64, linux/386, linux/amd64.

Built (but not tested) for freebsd/386, freebsd/amd64, linux/arm, windows/386.

R=r, PeterGo
CC=golang-dev
https://golang.org/cl/2899041
2010-11-04 14:00:19 -04:00
Russ Cox
7c2b1597c6 arm: precise float64 software floating point
Adds softfloat64 to generic runtime
(will be discarded by linker when unused)
and adds test for it.  I used the test to check
the software code against amd64 hardware
and then check the software code against
the arm and its simulation of hardware.
The latter should have been a no-op (testing
against itself) but turned up a bug in 5c causing
the vlrt.c routines to miscompile.

These changes make the cmath, math,
and strconv tests pass without any special
accommodations for arm.

R=ken2
CC=golang-dev
https://golang.org/cl/2713042
2010-10-25 17:55:50 -07:00
Yuval Pavel Zholkover
99a10eff16 8l, runtime: initial support for Plan 9
No multiple processes/locks, managed to compile
and run a hello.go (with print not fmt).  Also test/sieve.go
seems to run until 439 and stops with a
'throw: all goroutines are asleep - deadlock!'
- just like runtime/tiny.

based on Russ's suggestions at:
http://groups.google.com/group/comp.os.plan9/browse_thread/thread/cfda8b82535d2d68/243777a597ec1612

Build instructions:
cd src/pkg/runtime
make clean && GOOS=plan9 make install
this will build and install the runtime.

When linking with 8l, you should pass -s to suppress symbol
generation in the a.out, otherwise the generated executable will not run.

This is runtime only, the porting of the toolchain has already
been done: http://code.google.com/p/go-plan9/source/browse
in the plan9-quanstro branch.

R=rsc
CC=golang-dev
https://golang.org/cl/2273041
2010-10-18 12:32:55 -04:00
Ken Thompson
b33f5d537f fix arm bug in reflect.call
R=rsc
CC=golang-dev
https://golang.org/cl/2475042
2010-10-13 13:24:14 -07:00
Ken Thompson
ed575dc2b9 bug in stack size in arm.
stack is off by one if calling
through reflect.Call

R=rsc
CC=golang-dev
https://golang.org/cl/2400041
2010-10-08 16:46:05 -07:00
Russ Cox
f47d403cb4 gc: make string x + y + z + ... + w efficient
1 malloc per concatenation.

R=ken2
CC=golang-dev
https://golang.org/cl/2124045
2010-09-12 00:53:04 -04:00
Alex Brainman
f95a2f2b97 runtime(windows): make sure scheduler runs on os stack and new stdcall implementation
R=rsc
CC=golang-dev
https://golang.org/cl/2009045
2010-09-12 11:45:16 +10:00
Russ Cox
d4cc557b0d runtime: use manual stack for garbage collection
Old code was using recursion to traverse object graph.
New code uses an explicit stack, cutting the per-pointer
footprint to two words during the recursion and avoiding
the standard allocator and stack splitting code.

in test/garbage:

Reduces parser runtime by 2-3%
Reduces Peano runtime by 40%
Increases tree runtime by 4-5%

R=r
CC=golang-dev
https://golang.org/cl/2150042
2010-09-07 09:57:22 -04:00
Kyle Consalus
4d903504b3 runtime: special case copy, equal for one-word interface values
Based on the observation that a great number of the types that
are copied or compared in interfaces, maps, and channels are
word-sized, this uses specialized copy and equality functions
for them that use a word instead of 4 or 8 bytes. Seems to yield
0-6% improvements in performance in the benchmarks I've run.
For example, with the regexp benchmarks:

Before:
regexp.BenchmarkLiteral   500000       3.26 µs/op
regexp.BenchmarkNotLiteral    100000      13.67 µs/op
regexp.BenchmarkMatchClass    100000      18.72 µs/op
regexp.BenchmarkMatchClass_InRange    100000      20.04 µs/op
regexp.BenchmarkReplaceAll    100000      27.85 µs/op

After:
regexp.BenchmarkLiteral   500000       3.11 µs/op
regexp.BenchmarkNotLiteral    200000      13.29 µs/op
regexp.BenchmarkMatchClass    100000      17.65 µs/op
regexp.BenchmarkMatchClass_InRange    100000      18.49 µs/op
regexp.BenchmarkReplaceAll    100000      26.34 µs/op

R=rsc
CC=golang-dev
https://golang.org/cl/1967047
2010-08-26 13:32:40 -04:00
Eric Clark
95fa16a892 cgo: add C.GoStringN
Function to create a GoString with a known length so it can contain NUL
bytes anywhere in the string. Some C libraries have strings like this.

R=rsc
CC=golang-dev
https://golang.org/cl/2007042
2010-08-18 22:29:05 -04:00