1
0
mirror of https://github.com/golang/go synced 2024-11-20 00:04:43 -07:00
Commit Graph

28 Commits

Author SHA1 Message Date
Dmitriy Vyukov
4e5086b993 runtime: improve Linux mutex
The implementation is hybrid active/passive spin/blocking mutex.
The design minimizes amount of context switches and futex calls.
The idea is that all critical sections in runtime are intentially
small, so pure blocking mutex behaves badly causing
a lot of context switches, thread parking/unparking and kernel calls.
Note that some synthetic benchmarks become somewhat slower,
that's due to increased contention on other data structures,
it should not affect programs that do any real work.

On 2 x Intel E5620, 8 HT cores, 2.4GHz
benchmark                     old ns/op    new ns/op    delta
BenchmarkSelectContended         521.00       503.00   -3.45%
BenchmarkSelectContended-2       661.00       320.00  -51.59%
BenchmarkSelectContended-4      1139.00       629.00  -44.78%
BenchmarkSelectContended-8      2870.00       878.00  -69.41%
BenchmarkSelectContended-16     5276.00       818.00  -84.50%
BenchmarkChanContended           112.00       103.00   -8.04%
BenchmarkChanContended-2         631.00       174.00  -72.42%
BenchmarkChanContended-4         682.00       272.00  -60.12%
BenchmarkChanContended-8        1601.00       520.00  -67.52%
BenchmarkChanContended-16       3100.00       372.00  -88.00%
BenchmarkChanSync                253.00       239.00   -5.53%
BenchmarkChanSync-2             5030.00      4648.00   -7.59%
BenchmarkChanSync-4             4826.00      4694.00   -2.74%
BenchmarkChanSync-8             4778.00      4713.00   -1.36%
BenchmarkChanSync-16            5289.00      4710.00  -10.95%
BenchmarkChanProdCons0           273.00       254.00   -6.96%
BenchmarkChanProdCons0-2         599.00       400.00  -33.22%
BenchmarkChanProdCons0-4        1168.00       659.00  -43.58%
BenchmarkChanProdCons0-8        2831.00      1057.00  -62.66%
BenchmarkChanProdCons0-16       4197.00      1037.00  -75.29%
BenchmarkChanProdCons10          150.00       140.00   -6.67%
BenchmarkChanProdCons10-2        607.00       268.00  -55.85%
BenchmarkChanProdCons10-4       1137.00       404.00  -64.47%
BenchmarkChanProdCons10-8       2115.00       828.00  -60.85%
BenchmarkChanProdCons10-16      4283.00       855.00  -80.04%
BenchmarkChanProdCons100         117.00       110.00   -5.98%
BenchmarkChanProdCons100-2       558.00       218.00  -60.93%
BenchmarkChanProdCons100-4       722.00       287.00  -60.25%
BenchmarkChanProdCons100-8      1840.00       431.00  -76.58%
BenchmarkChanProdCons100-16     3394.00       448.00  -86.80%
BenchmarkChanProdConsWork0      2014.00      1996.00   -0.89%
BenchmarkChanProdConsWork0-2    1207.00      1127.00   -6.63%
BenchmarkChanProdConsWork0-4    1913.00       611.00  -68.06%
BenchmarkChanProdConsWork0-8    3016.00       949.00  -68.53%
BenchmarkChanProdConsWork0-16   4320.00      1154.00  -73.29%
BenchmarkChanProdConsWork10     1906.00      1897.00   -0.47%
BenchmarkChanProdConsWork10-2   1123.00      1033.00   -8.01%
BenchmarkChanProdConsWork10-4   1076.00       571.00  -46.93%
BenchmarkChanProdConsWork10-8   2748.00      1096.00  -60.12%
BenchmarkChanProdConsWork10-16  4600.00      1105.00  -75.98%
BenchmarkChanProdConsWork100    1884.00      1852.00   -1.70%
BenchmarkChanProdConsWork100-2  1235.00      1146.00   -7.21%
BenchmarkChanProdConsWork100-4  1217.00       619.00  -49.14%
BenchmarkChanProdConsWork100-8  1534.00       509.00  -66.82%
BenchmarkChanProdConsWork100-16 4126.00       918.00  -77.75%
BenchmarkSyscall                  34.40        33.30   -3.20%
BenchmarkSyscall-2               160.00       121.00  -24.38%
BenchmarkSyscall-4               131.00       136.00   +3.82%
BenchmarkSyscall-8               139.00       131.00   -5.76%
BenchmarkSyscall-16              161.00       168.00   +4.35%
BenchmarkSyscallWork             950.00       950.00   +0.00%
BenchmarkSyscallWork-2           481.00       480.00   -0.21%
BenchmarkSyscallWork-4           268.00       270.00   +0.75%
BenchmarkSyscallWork-8           156.00       169.00   +8.33%
BenchmarkSyscallWork-16          188.00       184.00   -2.13%
BenchmarkSemaSyntNonblock         36.40        35.60   -2.20%
BenchmarkSemaSyntNonblock-2       81.40        45.10  -44.59%
BenchmarkSemaSyntNonblock-4      126.00       108.00  -14.29%
BenchmarkSemaSyntNonblock-8      112.00       112.00   +0.00%
BenchmarkSemaSyntNonblock-16     110.00       112.00   +1.82%
BenchmarkSemaSyntBlock            35.30        35.30   +0.00%
BenchmarkSemaSyntBlock-2         118.00       124.00   +5.08%
BenchmarkSemaSyntBlock-4         105.00       108.00   +2.86%
BenchmarkSemaSyntBlock-8         101.00       111.00   +9.90%
BenchmarkSemaSyntBlock-16        112.00       118.00   +5.36%
BenchmarkSemaWorkNonblock        810.00       811.00   +0.12%
BenchmarkSemaWorkNonblock-2      476.00       414.00  -13.03%
BenchmarkSemaWorkNonblock-4      238.00       228.00   -4.20%
BenchmarkSemaWorkNonblock-8      140.00       126.00  -10.00%
BenchmarkSemaWorkNonblock-16     117.00       116.00   -0.85%
BenchmarkSemaWorkBlock           810.00       811.00   +0.12%
BenchmarkSemaWorkBlock-2         454.00       466.00   +2.64%
BenchmarkSemaWorkBlock-4         243.00       241.00   -0.82%
BenchmarkSemaWorkBlock-8         145.00       137.00   -5.52%
BenchmarkSemaWorkBlock-16        132.00       123.00   -6.82%
BenchmarkContendedSemaphore      123.00       102.00  -17.07%
BenchmarkContendedSemaphore-2     34.80        34.90   +0.29%
BenchmarkContendedSemaphore-4     34.70        34.80   +0.29%
BenchmarkContendedSemaphore-8     34.70        34.70   +0.00%
BenchmarkContendedSemaphore-16    34.80        34.70   -0.29%
BenchmarkMutex                    26.80        26.00   -2.99%
BenchmarkMutex-2                 108.00        45.20  -58.15%
BenchmarkMutex-4                 103.00       127.00  +23.30%
BenchmarkMutex-8                 109.00       147.00  +34.86%
BenchmarkMutex-16                102.00       152.00  +49.02%
BenchmarkMutexSlack               27.00        26.90   -0.37%
BenchmarkMutexSlack-2            149.00       165.00  +10.74%
BenchmarkMutexSlack-4            121.00       209.00  +72.73%
BenchmarkMutexSlack-8            101.00       158.00  +56.44%
BenchmarkMutexSlack-16            97.00       129.00  +32.99%
BenchmarkMutexWork               792.00       794.00   +0.25%
BenchmarkMutexWork-2             407.00       409.00   +0.49%
BenchmarkMutexWork-4             220.00       209.00   -5.00%
BenchmarkMutexWork-8             267.00       160.00  -40.07%
BenchmarkMutexWork-16            315.00       300.00   -4.76%
BenchmarkMutexWorkSlack          792.00       793.00   +0.13%
BenchmarkMutexWorkSlack-2        406.00       404.00   -0.49%
BenchmarkMutexWorkSlack-4        225.00       212.00   -5.78%
BenchmarkMutexWorkSlack-8        268.00       136.00  -49.25%
BenchmarkMutexWorkSlack-16       300.00       300.00   +0.00%
BenchmarkRWMutexWrite100          27.10        27.00   -0.37%
BenchmarkRWMutexWrite100-2        33.10        40.80  +23.26%
BenchmarkRWMutexWrite100-4       113.00        88.10  -22.04%
BenchmarkRWMutexWrite100-8       119.00        95.30  -19.92%
BenchmarkRWMutexWrite100-16      148.00       109.00  -26.35%
BenchmarkRWMutexWrite10           29.60        29.40   -0.68%
BenchmarkRWMutexWrite10-2        111.00        61.40  -44.68%
BenchmarkRWMutexWrite10-4        270.00       208.00  -22.96%
BenchmarkRWMutexWrite10-8        204.00       185.00   -9.31%
BenchmarkRWMutexWrite10-16       261.00       190.00  -27.20%
BenchmarkRWMutexWorkWrite100    1040.00      1036.00   -0.38%
BenchmarkRWMutexWorkWrite100-2   593.00       580.00   -2.19%
BenchmarkRWMutexWorkWrite100-4   470.00       365.00  -22.34%
BenchmarkRWMutexWorkWrite100-8   468.00       289.00  -38.25%
BenchmarkRWMutexWorkWrite100-16  604.00       374.00  -38.08%
BenchmarkRWMutexWorkWrite10      951.00       951.00   +0.00%
BenchmarkRWMutexWorkWrite10-2   1001.00       928.00   -7.29%
BenchmarkRWMutexWorkWrite10-4   1555.00      1006.00  -35.31%
BenchmarkRWMutexWorkWrite10-8   2085.00      1171.00  -43.84%
BenchmarkRWMutexWorkWrite10-16  2082.00      1614.00  -22.48%

R=rsc, iant, msolo, fw, iant
CC=golang-dev
https://golang.org/cl/4711045
2011-07-29 12:44:06 -04:00
Quan Yong Zhai
47410a2490 runtime: replace byte-at-a-time zeroing loop with memclr
R=golang-dev, r, r, dsymonds, rsc
CC=golang-dev
https://golang.org/cl/4813043
2011-07-23 15:46:58 -04:00
Wei Guangjing
9f636598ba cgo: windows amd64 port
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4725041
2011-07-19 10:47:33 -04:00
Dmitriy Vyukov
491aa1579d runtime: native xadd for 386/amd64
benchmark                          old ns/op    new ns/op    delta
BenchmarkSemaUncontended               37.40        34.10   -8.82%
BenchmarkSemaUncontended-2             18.90        17.70   -6.35%
BenchmarkSemaUncontended-4             11.90        10.90   -8.40%
BenchmarkSemaUncontended-8              6.26         5.19  -17.09%
BenchmarkSemaUncontended-16             4.39         3.91  -10.93%
BenchmarkSemaSyntNonblock              38.00        35.30   -7.11%
BenchmarkSemaSyntNonblock-2            83.00        46.70  -43.73%
BenchmarkSemaSyntNonblock-4           124.00       101.00  -18.55%
BenchmarkSemaSyntNonblock-8           124.00       116.00   -6.45%
BenchmarkSemaSyntNonblock-16          148.00       114.00  -22.97%

(on HP Z600 2 x Xeon E5620, 8 HT cores, 2.40GHz)

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4755041
2011-07-15 11:27:16 -04:00
Dmitriy Vyukov
86a659cad0 runtime: fix data race during Itab hash update/lookup
The data race is on newly published Itab nodes, which are
both unsafely published and unsafely acquired. It can
break on IA-32/Intel64 due to compiler optimizations
(most likely not an issue as of now) and on ARM due to
hardware memory access reorderings.

R=rsc
CC=golang-dev
https://golang.org/cl/4673055
2011-07-13 11:22:41 -07:00
Russ Cox
370276a3e5 runtime: stack split + garbage collection bug
The g->sched.sp saved stack pointer and the
g->stackbase and g->stackguard stack bounds
can change even while "the world is stopped",
because a goroutine has to call functions (and
therefore might split its stack) when exiting a
system call to check whether the world is stopped
(and if so, wait until the world continues).

That means the garbage collector cannot access
those values safely (without a race) for goroutines
executing system calls.  Instead, save a consistent
triple in g->gcsp, g->gcstack, g->gcguard during
entersyscall and have the garbage collector refer
to those.

The old code was occasionally seeing (because of
the race) an sp and stk that did not correspond to
each other, so that stk - sp was not the number of
stack bytes following sp.  In that case, if sp < stk
then the call scanblock(sp, stk - sp) scanned too
many bytes (anything between the two pointers,
which pointed into different allocation blocks).
If sp > stk then stk - sp wrapped around.
On 32-bit, stk - sp is a uintptr (uint32) converted
to int64 in the call to scanblock, so a large (~4G)
but positive number.  Scanblock would try to scan
that many bytes and eventually fault accessing
unmapped memory.  On 64-bit, stk - sp is a uintptr (uint64)
promoted to int64 in the call to scanblock, so a negative
number.  Scanblock would not scan anything, possibly
causing in-use blocks to be freed.

In short, 32-bit platforms would have seen either
ineffective garbage collection or crashes during garbage
collection, while 64-bit platforms would have seen
either ineffective or incorrect garbage collection.
You can see the invalid arguments to scanblock in the
stack traces in issue 1620.

Fixes #1620.
Fixes #1746.

R=iant, r
CC=golang-dev
https://golang.org/cl/4437075
2011-04-27 23:21:12 -04:00
Russ Cox
f9ca3b5d5b runtime: scheduler, cgo reorganization
* Change use of m->g0 stack (aka scheduler stack).
* Provide runtime.mcall(f) to invoke f() on m->g0 stack.
* Replace scheduler loop entry with runtime.mcall(schedule).

Runtime.mcall eliminates the need for fake scheduler states that
exist just to run a bit of code on the m->g0 stack
(Grecovery, Gstackalloc).

The elimination of the scheduler as a loop that stops and
starts using gosave and gogo fixes a bad interaction with the
way cgo uses the m->g0 stack.  Cgo runs external (gcc-compiled)
C functions on that stack, and then when calling back into Go,
it sets m->g0->sched.sp below the added call frames, so that
other uses of m->g0's stack will not interfere with those frames.
Unfortunately, gogo (longjmp) back to the scheduler loop at
this point would end up running scheduler with the lower
sp, which no longer points at a valid stack frame for
a call to scheduler.  If scheduler then wrote any function call
arguments or local variables to where it expected the stack
frame to be, it would overwrite other data on the stack.
I realized this possibility while debugging a problem with
calling complex Go code in a Go -> C -> Go cgo callback.
This wasn't the bug I was looking for, it turns out, but I believe
it is a real bug nonetheless.  Switching to runtime.mcall, which
only adds new frames to the stack and never jumps into
functions running in existing ones, fixes this bug.

* Move cgo-related code out of proc.c into cgocall.c.
* Add very large comment describing cgo call sequences.
* Simpilify, regularize cgo function implementations and names.
* Add test suite as misc/cgo/test.

Now the Go -> C path calls cgocall, which calls asmcgocall,
and the C -> Go path calls cgocallback, which calls cgocallbackg.

The shuffling, which affects mainly the callback case, moves
most of the callback implementation to cgocallback running
on the m->curg stack (not the m->g0 scheduler stack) and
only while accounted for with $GOMAXPROCS (between calls
to exitsyscall and entersyscall).

The previous callback code did not block in startcgocallback's
approximation to exitsyscall, so if, say, the garbage collector
were running, it would still barge in and start doing things
like call malloc.  Similarly endcgocallback's approximation of
entersyscall did not call matchmg to kick off new OS threads
when necessary, which caused the bug in issue 1560.

Fixes #1560.

R=iant
CC=golang-dev
https://golang.org/cl/4253054
2011-03-07 10:37:42 -05:00
Russ Cox
6779350349 runtime: minor cleanup
implement runtime.casp on amd64.
keep simultaneous panic messages separate.

R=r
CC=golang-dev
https://golang.org/cl/4188053
2011-02-16 13:21:13 -05:00
Russ Cox
afc6928ad9 runtime: prefer fixed stack allocator over general memory allocator
* move stack constants from proc.c to runtime.h
  * make memclr take uintptr length

R=r
CC=golang-dev
https://golang.org/cl/3985046
2011-01-25 16:35:36 -05:00
Russ Cox
141a4a1759 runtime: fix arm reflect.call boundary case
The fault was lucky: when it wasn't faulting it was silently
copying a word from some other block and later putting
that same word back.  If some other goroutine had changed
that word of memory in the interim, too bad.

The ARM code was inconsistent about whether the
"argument frame" included the saved LR.  Including it made
some things more regular but mostly just caused confusion
in the places where the regularity broke.  Now the rule
reflects reality: argp is always a pointer to arguments,
never a saved link register.

Renamed struct fields to make meaning clearer.

Running ARM in QEMU, package time's gotest:
  * before: 27/58 failed
  * after: 0/50

R=r, r2
CC=golang-dev
https://golang.org/cl/3993041
2011-01-14 14:05:20 -05:00
Russ Cox
68b4255a96 runtime: ,s/[a-zA-Z0-9_]+/runtime·&/g, almost
Prefix all external symbols in runtime by runtime·,
to avoid conflicts with possible symbols of the same
name in linked-in C libraries.  The obvious conflicts
are printf, malloc, and free, but hide everything to
avoid future pain.

The symbols left alone are:

	** known to cgo **
	_cgo_free
	_cgo_malloc
	libcgo_thread_start
	initcgo
	ncgocall

	** known to linker **
	_rt0_$GOARCH
	_rt0_$GOARCH_$GOOS
	text
	etext
	data
	end
	pclntab
	epclntab
	symtab
	esymtab

	** known to C compiler **
	_divv
	_modv
	_div64by32
	etc (arch specific)

Tested on darwin/386, darwin/amd64, linux/386, linux/amd64.

Built (but not tested) for freebsd/386, freebsd/amd64, linux/arm, windows/386.

R=r, PeterGo
CC=golang-dev
https://golang.org/cl/2899041
2010-11-04 14:00:19 -04:00
Russ Cox
e473f42b2d amd64: use segment memory for thread-local storage
Returns R14 and R15 to the available register pool.
Plays more nicely with ELF ABI C code.
In particular, our signal handlers will no longer crash
when a signal arrives during execution of a cgo C call.

Fixes #720.

R=ken2, r
CC=golang-dev
https://golang.org/cl/1847051
2010-08-04 17:50:22 -07:00
Ian Lance Taylor
a4f8d36ba5 Run initcgo for all amd64 targets, not just GNU/Linux.
This is required to make cgo export work on Darwin.  Note that
this corrects the stack alignment when calling initcgo to that
required by gcc on amd64.

R=rsc
CC=golang-dev
https://golang.org/cl/907041
2010-04-09 14:15:15 -07:00
Ian Lance Taylor
2d0ff3f1a6 Support cgo export on amd64.
R=rsc
CC=golang-dev
https://golang.org/cl/857045
2010-04-09 13:30:11 -07:00
Russ Cox
6c196015e0 runtime: various arm fixes
* correct symbol table size
  * do not reorder functions in output
  * traceback
  * signal handling
  * use same code for go + defer
  * handle leaf functions in symbol table

R=kaib, dpx
CC=golang-dev
https://golang.org/cl/884041
2010-04-05 12:51:09 -07:00
Russ Cox
63e878a750 runtime: make type assertion a runtime.Error, the first of many
R=r
CC=golang-dev
https://golang.org/cl/805043
2010-03-31 15:55:10 -07:00
Russ Cox
01eaf780a8 gc: add panic and recover (still unimplemented in runtime)
main semantic change is to enforce single argument to panic.

runtime: change to 1-argument panic.
use String method on argument if it has one.

R=ken2, r
CC=golang-dev
https://golang.org/cl/812043
2010-03-30 10:53:16 -07:00
Russ Cox
83727ccf7c runtime: run deferred calls at Goexit
baby step toward panic+recover.

Fixes #349.

R=r
CC=golang-dev
https://golang.org/cl/825043
2010-03-29 21:48:22 -07:00
Russ Cox
718be3215f in C and asm, replace pkg·name with ·name
(eliminate assumption of package global name space,
make code easier to move between packages).

R=r
CC=golang-dev
https://golang.org/cl/194072
2010-01-25 18:52:55 -08:00
Devon H. O'Dell
5a4a08fab8 Fix stack on FreeBSD / add stack check across the board
FreeBSD was passing stk as the new thread's stack base, while
stk is the top of the stack in go. The added check should cause
a trap if this ever comes up in any new ports, or regresses
in current ones.

R=rsc
CC=golang-dev
https://golang.org/cl/167055
2009-12-08 18:19:30 -08:00
Devon H. O'Dell
0489a260da FreeBSD-specific porting work.
cgo/libmach remain unimplemented. However, compilers, runtime,
and packages are 100%. I still need to go through and implement
missing syscalls (at least make sure they're all listed), but
for all shipped functionality, this is done. Ship! ;)

R=rsc, VenkateshSrinivas
https://golang.org/cl/152142
2009-11-17 08:20:58 -08:00
Russ Cox
22a5c78f44 rename sys functions to runtime,
because they are in package runtime.

another step to enforcing package boundaries.

R=r
DELTA=732  (114 added, 93 deleted, 525 changed)
OCL=35811
CL=35824
2009-10-15 23:10:49 -07:00
Russ Cox
add89dd1ba stack overflow debugging and fix.
* in 6l, -K already meant check for stack underflow.
    add -KK to mean double-check stack overflows
    even in nosplit functions.

  * comment out print locks; they deadlock too easily
     but are still useful to put back for special occasions.

  * let runcgo assembly switch to scheduler stack
    without involving scheduler directly.  because runcgo
    gets called from matchmg, it is too hard to keep it
    from being called on other stacks.

R=r
DELTA=94  (65 added, 18 deleted, 11 changed)
OCL=35591
CL=35604
2009-10-12 10:26:38 -07:00
Russ Cox
133a158bd8 8c, 8l dynamic loading support.
better mach binaries.
cgo working on darwin+linux amd64+386.
eliminated context switches - pi is 30x faster.
add libcgo to build.

on snow leopard:
  - non-cgo binaries work; all tests pass.
  - cgo binaries work on amd64 but not 386.

R=r
DELTA=2031  (1316 added, 626 deleted, 89 changed)
OCL=35264
CL=35304
2009-10-03 10:37:12 -07:00
Russ Cox
bba278a43b reflection for functions
add channel send type check (thanks austin).
fix type mismatch message.

R=r
DELTA=241  (225 added, 5 deleted, 11 changed)
OCL=31370
CL=31375
2009-07-08 18:16:09 -07:00
Russ Cox
380200953a Forgot to check in 386/asm.h.
Rather than do that, fix build by
generating asm.h automatically.

R=r
DELTA=97  (48 added, 36 deleted, 13 changed)
OCL=30449
CL=30452
2009-06-17 16:31:02 -07:00
Russ Cox
7343e03c43 runtime: stack growth adjustments, cleanup
* keep coherent SP/PC in gobuf
	  (i.e., SP that would be in use at that PC)
	* gogocall replaces setspgoto,
	  should work better in presence of link registers
	* delete unused system calls

only amd64; 386 is now broken

R=r
DELTA=548  (183 added, 183 deleted, 182 changed)
OCL=30381
CL=30442
2009-06-17 15:12:16 -07:00
Rob Pike
d90e7cbac6 mv src/lib to src/pkg
tests: all.bash passes, gobuild still works, godoc still works.

R=rsc
OCL=30096
CL=30102
2009-06-09 09:53:44 -07:00