mirror of https://github.com/golang/go synced 2024-11-20 05:44:44 -07:00
26 Commits

Author SHA1 Message Date
Russ Cox
f437331f80 time: faster Nanoseconds call
runtime knows how to get the time of day
without allocating memory.

R=golang-dev, dsymonds, dave, hectorchu, r, cw
CC=golang-dev
https://golang.org/cl/5297078
2011-11-03 17:35:28 -04:00
Hector Chu
85916146ea runtime: fix usleep on linux/386 and re-enable parallel gc
R=golang-dev, jsing, alex.brainman, cw, rsc
CC=golang-dev
https://golang.org/cl/5166047
2011-10-03 19:08:59 +01:00
Russ Cox
d324f2143b runtime: parallelize garbage collector mark + sweep
Running test/garbage/parser.out.

On a 4-core Lenovo X201s (Linux):
31.12u 0.60s 31.74r 	 1 cpu, no atomics
32.27u 0.58s 32.86r 	 1 cpu, atomic instructions
33.04u 0.83s 27.47r 	 2 cpu

On a 16-core Xeon (Linux):
33.08u 0.65s 33.80r 	 1 cpu, no atomics
34.87u 1.12s 29.60r 	 2 cpu
36.00u 1.87s 28.43r 	 3 cpu
36.46u 2.34s 27.10r 	 4 cpu
38.28u 3.85s 26.92r 	 5 cpu
37.72u 5.25s 26.73r	 6 cpu
39.63u 7.11s 26.95r	 7 cpu
39.67u 8.10s 26.68r	 8 cpu

On a 2-core MacBook Pro Core 2 Duo 2.26 (circa 2009, MacBookPro5,5):
39.43u 1.45s 41.27r 	 1 cpu, no atomics
43.98u 2.95s 38.69r 	 2 cpu

On a 2-core Mac Mini Core 2 Duo 1.83 (circa 2008; Macmini2,1):
48.81u 2.12s 51.76r 	 1 cpu, no atomics
57.15u 4.72s 51.54r 	 2 cpu

The handoff algorithm is really only good for two cores.
Beyond that we will need to do something more sophisticated,
like have each core hand off to the next one, around a circle.
Even so, the code is a good checkpoint; for now we'll limit the
number of gc procs to at most 2.
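
To make the "only good for two cores" remark concrete, the wall-clock speedup can be read off the timings above by dividing the 1-cpu real time by the multi-cpu real time. The sketch below does only that arithmetic; the helper name speedup and the choice of rows are illustrative assumptions, and the inputs are the "r" (real) columns quoted in this message.

```go
package main

import "fmt"

// speedup returns how many times faster the parallel run finished,
// measured in wall-clock (real) seconds. Hypothetical helper.
func speedup(baseReal, parallelReal float64) float64 {
	return baseReal / parallelReal
}

func main() {
	// Real times copied from the tables in the commit message.
	rows := []struct {
		name           string
		base, parallel float64
	}{
		{"Lenovo X201s, 2 cpu", 31.74, 27.47},
		{"Xeon, 2 cpu", 33.80, 29.60},
		{"Xeon, 8 cpu", 33.80, 26.68},
		{"MacBook Pro, 2 cpu", 41.27, 38.69},
		{"Mac Mini, 2 cpu", 51.76, 51.54},
	}
	for _, r := range rows {
		fmt.Printf("%-22s %.2fx\n", r.name, speedup(r.base, r.parallel))
	}
}
```

On the Xeon, most of the wall-clock gain is already there at 2 cpu, while user and system time keep rising as cores are added.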

R=dvyukov
CC=golang-dev
https://golang.org/cl/4641082
2011-09-30 09:40:01 -04:00
Yuval Pavel Zholkover
c20a338c2f runtime, syscall: use the vdso page on linux x86 for faster syscalls instead of int $0x80.
8l: fix handling CALL $(constant) code generated by 8a.
8a,8l: add indirect call instruction: CALL *data(SB).

R=rsc, iant
CC=golang-dev
https://golang.org/cl/4817054
2011-08-29 10:36:06 -04:00
Dmitriy Vyukov
4e5086b993 runtime: improve Linux mutex
The implementation is a hybrid active/passive spin/blocking mutex.
The design minimizes the number of context switches and futex calls.
The idea is that all critical sections in the runtime are intentionally
small, so a pure blocking mutex behaves badly, causing
a lot of context switches, thread parking/unparking and kernel calls.
Note that some synthetic benchmarks become somewhat slower;
that's due to increased contention on other data structures, and
it should not affect programs that do any real work.

On 2 x Intel E5620, 8 HT cores, 2.4GHz
benchmark                     old ns/op    new ns/op    delta
BenchmarkSelectContended         521.00       503.00   -3.45%
BenchmarkSelectContended-2       661.00       320.00  -51.59%
BenchmarkSelectContended-4      1139.00       629.00  -44.78%
BenchmarkSelectContended-8      2870.00       878.00  -69.41%
BenchmarkSelectContended-16     5276.00       818.00  -84.50%
BenchmarkChanContended           112.00       103.00   -8.04%
BenchmarkChanContended-2         631.00       174.00  -72.42%
BenchmarkChanContended-4         682.00       272.00  -60.12%
BenchmarkChanContended-8        1601.00       520.00  -67.52%
BenchmarkChanContended-16       3100.00       372.00  -88.00%
BenchmarkChanSync                253.00       239.00   -5.53%
BenchmarkChanSync-2             5030.00      4648.00   -7.59%
BenchmarkChanSync-4             4826.00      4694.00   -2.74%
BenchmarkChanSync-8             4778.00      4713.00   -1.36%
BenchmarkChanSync-16            5289.00      4710.00  -10.95%
BenchmarkChanProdCons0           273.00       254.00   -6.96%
BenchmarkChanProdCons0-2         599.00       400.00  -33.22%
BenchmarkChanProdCons0-4        1168.00       659.00  -43.58%
BenchmarkChanProdCons0-8        2831.00      1057.00  -62.66%
BenchmarkChanProdCons0-16       4197.00      1037.00  -75.29%
BenchmarkChanProdCons10          150.00       140.00   -6.67%
BenchmarkChanProdCons10-2        607.00       268.00  -55.85%
BenchmarkChanProdCons10-4       1137.00       404.00  -64.47%
BenchmarkChanProdCons10-8       2115.00       828.00  -60.85%
BenchmarkChanProdCons10-16      4283.00       855.00  -80.04%
BenchmarkChanProdCons100         117.00       110.00   -5.98%
BenchmarkChanProdCons100-2       558.00       218.00  -60.93%
BenchmarkChanProdCons100-4       722.00       287.00  -60.25%
BenchmarkChanProdCons100-8      1840.00       431.00  -76.58%
BenchmarkChanProdCons100-16     3394.00       448.00  -86.80%
BenchmarkChanProdConsWork0      2014.00      1996.00   -0.89%
BenchmarkChanProdConsWork0-2    1207.00      1127.00   -6.63%
BenchmarkChanProdConsWork0-4    1913.00       611.00  -68.06%
BenchmarkChanProdConsWork0-8    3016.00       949.00  -68.53%
BenchmarkChanProdConsWork0-16   4320.00      1154.00  -73.29%
BenchmarkChanProdConsWork10     1906.00      1897.00   -0.47%
BenchmarkChanProdConsWork10-2   1123.00      1033.00   -8.01%
BenchmarkChanProdConsWork10-4   1076.00       571.00  -46.93%
BenchmarkChanProdConsWork10-8   2748.00      1096.00  -60.12%
BenchmarkChanProdConsWork10-16  4600.00      1105.00  -75.98%
BenchmarkChanProdConsWork100    1884.00      1852.00   -1.70%
BenchmarkChanProdConsWork100-2  1235.00      1146.00   -7.21%
BenchmarkChanProdConsWork100-4  1217.00       619.00  -49.14%
BenchmarkChanProdConsWork100-8  1534.00       509.00  -66.82%
BenchmarkChanProdConsWork100-16 4126.00       918.00  -77.75%
BenchmarkSyscall                  34.40        33.30   -3.20%
BenchmarkSyscall-2               160.00       121.00  -24.38%
BenchmarkSyscall-4               131.00       136.00   +3.82%
BenchmarkSyscall-8               139.00       131.00   -5.76%
BenchmarkSyscall-16              161.00       168.00   +4.35%
BenchmarkSyscallWork             950.00       950.00   +0.00%
BenchmarkSyscallWork-2           481.00       480.00   -0.21%
BenchmarkSyscallWork-4           268.00       270.00   +0.75%
BenchmarkSyscallWork-8           156.00       169.00   +8.33%
BenchmarkSyscallWork-16          188.00       184.00   -2.13%
BenchmarkSemaSyntNonblock         36.40        35.60   -2.20%
BenchmarkSemaSyntNonblock-2       81.40        45.10  -44.59%
BenchmarkSemaSyntNonblock-4      126.00       108.00  -14.29%
BenchmarkSemaSyntNonblock-8      112.00       112.00   +0.00%
BenchmarkSemaSyntNonblock-16     110.00       112.00   +1.82%
BenchmarkSemaSyntBlock            35.30        35.30   +0.00%
BenchmarkSemaSyntBlock-2         118.00       124.00   +5.08%
BenchmarkSemaSyntBlock-4         105.00       108.00   +2.86%
BenchmarkSemaSyntBlock-8         101.00       111.00   +9.90%
BenchmarkSemaSyntBlock-16        112.00       118.00   +5.36%
BenchmarkSemaWorkNonblock        810.00       811.00   +0.12%
BenchmarkSemaWorkNonblock-2      476.00       414.00  -13.03%
BenchmarkSemaWorkNonblock-4      238.00       228.00   -4.20%
BenchmarkSemaWorkNonblock-8      140.00       126.00  -10.00%
BenchmarkSemaWorkNonblock-16     117.00       116.00   -0.85%
BenchmarkSemaWorkBlock           810.00       811.00   +0.12%
BenchmarkSemaWorkBlock-2         454.00       466.00   +2.64%
BenchmarkSemaWorkBlock-4         243.00       241.00   -0.82%
BenchmarkSemaWorkBlock-8         145.00       137.00   -5.52%
BenchmarkSemaWorkBlock-16        132.00       123.00   -6.82%
BenchmarkContendedSemaphore      123.00       102.00  -17.07%
BenchmarkContendedSemaphore-2     34.80        34.90   +0.29%
BenchmarkContendedSemaphore-4     34.70        34.80   +0.29%
BenchmarkContendedSemaphore-8     34.70        34.70   +0.00%
BenchmarkContendedSemaphore-16    34.80        34.70   -0.29%
BenchmarkMutex                    26.80        26.00   -2.99%
BenchmarkMutex-2                 108.00        45.20  -58.15%
BenchmarkMutex-4                 103.00       127.00  +23.30%
BenchmarkMutex-8                 109.00       147.00  +34.86%
BenchmarkMutex-16                102.00       152.00  +49.02%
BenchmarkMutexSlack               27.00        26.90   -0.37%
BenchmarkMutexSlack-2            149.00       165.00  +10.74%
BenchmarkMutexSlack-4            121.00       209.00  +72.73%
BenchmarkMutexSlack-8            101.00       158.00  +56.44%
BenchmarkMutexSlack-16            97.00       129.00  +32.99%
BenchmarkMutexWork               792.00       794.00   +0.25%
BenchmarkMutexWork-2             407.00       409.00   +0.49%
BenchmarkMutexWork-4             220.00       209.00   -5.00%
BenchmarkMutexWork-8             267.00       160.00  -40.07%
BenchmarkMutexWork-16            315.00       300.00   -4.76%
BenchmarkMutexWorkSlack          792.00       793.00   +0.13%
BenchmarkMutexWorkSlack-2        406.00       404.00   -0.49%
BenchmarkMutexWorkSlack-4        225.00       212.00   -5.78%
BenchmarkMutexWorkSlack-8        268.00       136.00  -49.25%
BenchmarkMutexWorkSlack-16       300.00       300.00   +0.00%
BenchmarkRWMutexWrite100          27.10        27.00   -0.37%
BenchmarkRWMutexWrite100-2        33.10        40.80  +23.26%
BenchmarkRWMutexWrite100-4       113.00        88.10  -22.04%
BenchmarkRWMutexWrite100-8       119.00        95.30  -19.92%
BenchmarkRWMutexWrite100-16      148.00       109.00  -26.35%
BenchmarkRWMutexWrite10           29.60        29.40   -0.68%
BenchmarkRWMutexWrite10-2        111.00        61.40  -44.68%
BenchmarkRWMutexWrite10-4        270.00       208.00  -22.96%
BenchmarkRWMutexWrite10-8        204.00       185.00   -9.31%
BenchmarkRWMutexWrite10-16       261.00       190.00  -27.20%
BenchmarkRWMutexWorkWrite100    1040.00      1036.00   -0.38%
BenchmarkRWMutexWorkWrite100-2   593.00       580.00   -2.19%
BenchmarkRWMutexWorkWrite100-4   470.00       365.00  -22.34%
BenchmarkRWMutexWorkWrite100-8   468.00       289.00  -38.25%
BenchmarkRWMutexWorkWrite100-16  604.00       374.00  -38.08%
BenchmarkRWMutexWorkWrite10      951.00       951.00   +0.00%
BenchmarkRWMutexWorkWrite10-2   1001.00       928.00   -7.29%
BenchmarkRWMutexWorkWrite10-4   1555.00      1006.00  -35.31%
BenchmarkRWMutexWorkWrite10-8   2085.00      1171.00  -43.84%
BenchmarkRWMutexWorkWrite10-16  2082.00      1614.00  -22.48%

R=rsc, iant, msolo, fw, iant
CC=golang-dev
https://golang.org/cl/4711045
2011-07-29 12:44:06 -04:00
Jonathan Mark
ddde52ae56 runtime: SysMap uses MAP_FIXED if needed on 64-bit Linux
This change was adapted from gccgo's libgo/runtime/mem.c at
Ian Taylor's suggestion.  It fixes all.bash failing with
"address space conflict: map() =" on amd64 Linux with kernel
version 2.6.32.8-grsec-2.1.14-modsign-xeon-64.
With this change, SysMap will use MAP_FIXED to allocate its desired
address space, after first calling mincore to check that there is
nothing else mapped there.

R=iant, dave, n13m3y3r, rsc
CC=golang-dev
https://golang.org/cl/4438091
2011-06-07 21:50:10 -07:00
Russ Cox
8698bb6c8c runtime: turn "too many EPIPE" into real SIGPIPE
Tested on Linux and OS X, amd64 and 386.

R=r, iant
CC=golang-dev
https://golang.org/cl/4452046
2011-04-25 16:58:00 -04:00
Russ Cox
8dee872963 runtime: os-specific types and code for setitimer
R=r
CC=golang-dev
https://golang.org/cl/4273097
2011-03-23 11:31:42 -04:00
Russ Cox
690291a2c0 runtime: pass to signal handler value of g at time of signal
The existing code assumed that signals only arrived
while executing on the goroutine stack (g == m->curg),
not while executing on the scheduler stack (g == m->g0).

Most of the signal handling trampolines correctly saved
and restored g already, but the sighandler C code did not
have access to it.

Some rewriting of assembly to make the various
implementations as similar as possible.

Will need to change Windows too but I don't
understand how sigtramp gets called there.

R=r
CC=golang-dev
https://golang.org/cl/4203042
2011-02-23 14:47:42 -05:00
Russ Cox
68b4255a96 runtime: ,s/[a-zA-Z0-9_]+/runtime·&/g, almost
Prefix all external symbols in runtime by runtime·,
to avoid conflicts with possible symbols of the same
name in linked-in C libraries.  The obvious conflicts
are printf, malloc, and free, but hide everything to
avoid future pain.

The symbols left alone are:

	** known to cgo **
	_cgo_free
	_cgo_malloc
	libcgo_thread_start
	initcgo
	ncgocall

	** known to linker **
	_rt0_$GOARCH
	_rt0_$GOARCH_$GOOS
	text
	etext
	data
	end
	pclntab
	epclntab
	symtab
	esymtab

	** known to C compiler **
	_divv
	_modv
	_div64by32
	etc (arch specific)
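
The rename in the subject line can be sketched in Go as a regular-expression rewrite with an exception list. This only illustrates the idea on a list of symbol names, not the tooling actually used for the change; prefixSymbols and the abbreviated keep set are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// ident matches the same character class as the substitution in the subject.
var ident = regexp.MustCompile(`[a-zA-Z0-9_]+`)

// keep holds a few of the symbols listed above as left alone (abbreviated).
var keep = map[string]bool{
	"_cgo_free":   true,
	"_cgo_malloc": true,
	"initcgo":     true,
	"etext":       true,
	"pclntab":     true,
}

// prefixSymbols adds the runtime· prefix to every identifier in a
// whitespace-separated symbol list, except those in keep.
func prefixSymbols(symbols string) string {
	return ident.ReplaceAllStringFunc(symbols, func(s string) string {
		if keep[s] {
			return s
		}
		return "runtime·" + s
	})
}

func main() {
	fmt.Println(prefixSymbols("printf malloc free _cgo_malloc"))
}
```

Applied blindly to source text, the same pattern would also rewrite keywords and numeric literals, which is one reason the subject says "almost".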

Tested on darwin/386, darwin/amd64, linux/386, linux/amd64.

Built (but not tested) for freebsd/386, freebsd/amd64, linux/arm, windows/386.

R=r, PeterGo
CC=golang-dev
https://golang.org/cl/2899041
2010-11-04 14:00:19 -04:00
Russ Cox
d4cc557b0d runtime: use manual stack for garbage collection
Old code was using recursion to traverse object graph.
New code uses an explicit stack, cutting the per-pointer
footprint to two words during the recursion and avoiding
the standard allocator and stack splitting code.

in test/garbage:

Reduces parser runtime by 2-3%
Reduces Peano runtime by 40%
Increases tree runtime by 4-5%

R=r
CC=golang-dev
https://golang.org/cl/2150042
2010-09-07 09:57:22 -04:00
Russ Cox
2d6ae385e1 linux/386: use Xen-friendly ELF TLS instruction sequence
Fixes #465.

R=iant
CC=golang-dev
https://golang.org/cl/1665051
2010-07-17 16:54:03 -07:00
Russ Cox
53a529ab2b runtime: fix 386 signal handler bug
Cannot assume that g == m->curg at time of signal.
Must save actual g and restore.

Fixes flaky crashes with messages like

throw: malloc mlookup
throw: malloc/free - deadlock
throw: unwindstack on self
throw: free mlookup

(and probably others) when running cgo.

R=iant
CC=golang-dev
https://golang.org/cl/1648043
2010-06-12 10:48:04 -07:00
Russ Cox
e4f06812c5 runtime: instrument malloc + garbage collector.
add simple garbage collection benchmark.

R=iant
CC=golang-dev
https://golang.org/cl/204053
2010-02-08 14:32:22 -08:00
Russ Cox
718be3215f in C and asm, replace pkg·name with ·name
(eliminate assumption of package global name space,
make code easier to move between packages).

R=r
CC=golang-dev
https://golang.org/cl/194072
2010-01-25 18:52:55 -08:00
Devon H. O'Dell
1564b984a5 runtime: GS already set up by setldt in Linux/386; remove duplicate
R=rsc
CC=golang-dev
https://golang.org/cl/186146
2010-01-13 17:50:12 -08:00
Hector Chu
6bfe5f55f4 Ported runtime to Windows.
R=rsc
CC=golang-dev
https://golang.org/cl/176066
2010-01-06 17:58:55 -08:00
Devon H. O'Dell
5a4a08fab8 Fix stack on FreeBSD / add stack check across the board
FreeBSD was passing stk as the new thread's stack base, while
stk is the top of the stack in go. The added check should cause
a trap if this ever comes up in any new ports, or regresses
in current ones.

R=rsc
CC=golang-dev
https://golang.org/cl/167055
2009-12-08 18:19:30 -08:00
William Josephson
4c0f262a2d Remove unnecessary execute bits.
R=rsc
https://golang.org/cl/156077
2009-11-18 09:19:29 -08:00
Adam Langley
3f7a32405d runtime: warn about SELinux based mmap failures on Linux.
SELinux will cause mmap to fail when we request w+x memory unless the
user has configured their policies. We have a warning in make.bash,
but it's quite likely that the policy will be reset at some point and
then all their binaries start failing.

This patch prints a warning on Linux when mmap fails with EACCES.

R=rsc
CC=golang-dev
https://golang.org/cl/152086
2009-11-13 10:08:51 -08:00
Russ Cox
22a5c78f44 rename sys functions to runtime,
because they are in package runtime.

another step to enforcing package boundaries.

R=r
DELTA=732  (114 added, 93 deleted, 525 changed)
OCL=35811
CL=35824
2009-10-15 23:10:49 -07:00
Russ Cox
133a158bd8 8c, 8l dynamic loading support.
better mach binaries.
cgo working on darwin+linux amd64+386.
eliminated context switches - pi is 30x faster.
add libcgo to build.

on snow leopard:
  - non-cgo binaries work; all tests pass.
  - cgo binaries work on amd64 but not 386.

R=r
DELTA=2031  (1316 added, 626 deleted, 89 changed)
OCL=35264
CL=35304
2009-10-03 10:37:12 -07:00
Russ Cox
1b14bdbf1c changes to accommodate nacl:
  * change ldt0setup to set GS itself; nacl won't let us do it.
  * change breakpoint to INT $3 so 8l can translate to HLT for nacl.
  * panic if closure is needed on nacl.
  * do not try to access symbol table on nacl.
  * mmap in 64kB chunks.

nacl support:
  * system calls, threading, locks.

R=r
DELTA=365  (357 added, 5 deleted, 3 changed)
OCL=34880
CL=34906
2009-09-22 16:28:32 -07:00
Russ Cox
bbcb91a3a7 convert 386 to use %gs instead of %fs for extern register.
required for nacl and may be nicer for ffi,
because %gs is the standard register for thread-local storage.

R=ken
OCL=34861
CL=34866
2009-09-21 15:46:50 -07:00
Russ Cox
8522a478bb update 386 to new runtime (CL 30381)
R=r
DELTA=298  (119 added, 81 deleted, 98 changed)
OCL=30427
CL=30443
2009-06-17 15:15:55 -07:00
Rob Pike
d90e7cbac6 mv src/lib to src/pkg
tests: all.bash passes, gobuild still works, godoc still works.

R=rsc
OCL=30096
CL=30102
2009-06-09 09:53:44 -07:00