1
0
mirror of https://github.com/golang/go synced 2024-10-03 09:21:21 -06:00
Commit Graph

21 Commits

Author SHA1 Message Date
Russ Cox
25f6b02ab0 cmd/cc, runtime: convert C compilers to use Go calling convention
To date, the C compilers and Go compilers differed only in how
values were returned from functions. This made it difficult to call
Go from C or C from Go if return values were involved. It also made
assembly called from Go and assembly called from C different.

This CL changes the C compiler to use the Go conventions, passing
results on the stack, after the arguments.
[Exception: this does not apply to C ... functions, because you can't
know where on the stack the arguments end.]

By doing this, the CL makes it possible to rewrite C functions into Go
one at a time, without worrying about which languages call that
function or which languages it calls.

This CL also updates all the assembly files in package runtime to use
the new conventions. Argument references of the form 40(SP) have
been rewritten to the form name+10(FP) instead, and there are now
Go func prototypes for every assembly function called from C or Go.
This means that 'go vet runtime' checks effectively every assembly
function, and go vet's output was used to automate the bulk of the
conversion.

Some functions, like seek and nsec on Plan 9, needed to be rewritten.

Many assembly routines called from C were reading arguments
incorrectly, using MOVL instead of MOVQ or vice versa, especially on
the less used systems like openbsd.
These were found by go vet and have been corrected too.
If we're lucky, this may reduce flakiness on those systems.

Tested on:
        darwin/386
        darwin/amd64
        linux/arm
        linux/386
        linux/amd64
If this breaks another system, the bug is almost certainly in the
sys_$GOOS_$GOARCH.s file, since the rest of the CL is tested
by the combination of the above systems.

LGTM=dvyukov, iant
R=golang-codereviews, 0intro, dave, alex.brainman, dvyukov, iant
CC=golang-codereviews, josharian, r
https://golang.org/cl/135830043
2014-08-27 11:32:17 -04:00
Russ Cox
89f185fe8a all: remove 'extern register M *m' from runtime
The runtime has historically held two dedicated values g (current goroutine)
and m (current thread) in 'extern register' slots (TLS on x86, real registers
backed by TLS on ARM).

This CL removes the extern register m; code now uses g->m.

On ARM, this frees up the register that formerly held m (R9).
This is important for NaCl, because NaCl ARM code cannot use R9 at all.

The Go 1 macrobenchmarks (those with per-op times >= 10 µs) are unaffected:

BenchmarkBinaryTree17              5491374955     5471024381     -0.37%
BenchmarkFannkuch11                4357101311     4275174828     -1.88%
BenchmarkGobDecode                 11029957       11364184       +3.03%
BenchmarkGobEncode                 6852205        6784822        -0.98%
BenchmarkGzip                      650795967      650152275      -0.10%
BenchmarkGunzip                    140962363      141041670      +0.06%
BenchmarkHTTPClientServer          71581          73081          +2.10%
BenchmarkJSONEncode                31928079       31913356       -0.05%
BenchmarkJSONDecode                117470065      113689916      -3.22%
BenchmarkMandelbrot200             6008923        5998712        -0.17%
BenchmarkGoParse                   6310917        6327487        +0.26%
BenchmarkRegexpMatchMedium_1K      114568         114763         +0.17%
BenchmarkRegexpMatchHard_1K        168977         169244         +0.16%
BenchmarkRevcomp                   935294971      914060918      -2.27%
BenchmarkTemplate                  145917123      148186096      +1.55%

Minux previous reported larger variations, but these were caused by
run-to-run noise, not repeatable slowdowns.

Actual code changes by Minux.
I only did the docs and the benchmarking.

LGTM=dvyukov, iant, minux
R=minux, josharian, iant, dave, bradfitz, dvyukov
CC=golang-codereviews
https://golang.org/cl/109050043
2014-06-26 11:54:39 -04:00
Russ Cox
90093f0634 liblink: introduce TLS register on 386 and amd64
When I did the original 386 ports on Linux and OS X, I chose to
define GS-relative expressions like 4(GS) as relative to the actual
thread-local storage base, which was usually GS but might not be
(it might be FS, or it might be a different constant offset from GS or FS).

The original scope was limited but since then the rewrites have
gotten out of control. Sometimes GS is rewritten, sometimes FS.
Some ports do other rewrites to enable shared libraries and
other linking. At no point in the code is it clear whether you are
looking at the real GS/FS or some synthesized thing that will be
rewritten. The code manipulating all these is duplicated in many
places.

The first step to fixing issue 7719 is to make the code intelligible
again.

This CL adds an explicit TLS pseudo-register to the 386 and amd64.
As a register, TLS refers to the thread-local storage base, and it
can only be loaded into another register:

        MOVQ TLS, AX

An offset from the thread-local storage base is written off(reg)(TLS*1).
Semantically it is off(reg), but the (TLS*1) annotation marks this as
indexing from the loaded TLS base. This emits a relocation so that
if the linker needs to adjust the offset, it can. For example:

        MOVQ TLS, AX
        MOVQ 8(AX)(TLS*1), CX // load m into CX

On systems that support direct access to the TLS memory, this
pair of instructions can be reduced to a direct TLS memory reference:

        MOVQ 8(TLS), CX // load m into CX

The 2-instruction and 1-instruction forms correspond roughly to
ELF TLS initial exec mode and ELF TLS local exec mode, respectively.

Liblink applies this rewrite on systems that support the 1-instruction form.
The decision is made using only the operating system (and probably
the -shared flag, eventually), not the link mode. If some link modes
on a particular operating system require the 2-instruction form,
then all builds for that operating system will use the 2-instruction
form, so that the link mode decision can be delayed to link time.

Obviously it is late to be making changes like this, but I despair
of correcting issue 7719 and issue 7164 without it. To make sure
I am not changing existing behavior, I built a "hello world" program
for every GOOS/GOARCH combination we have and then worked
to make sure that the rewrite generates exactly the same binaries,
byte for byte. There are a handful of TODOs in the code marking
kludges to get the byte-for-byte property, but at least now I can
explain exactly how each binary is handled.

The targets I tested this way are:

        darwin-386
        darwin-amd64
        dragonfly-386
        dragonfly-amd64
        freebsd-386
        freebsd-amd64
        freebsd-arm
        linux-386
        linux-amd64
        linux-arm
        nacl-386
        nacl-amd64p32
        netbsd-386
        netbsd-amd64
        openbsd-386
        openbsd-amd64
        plan9-386
        plan9-amd64
        solaris-amd64
        windows-386
        windows-amd64

There were four exceptions to the byte-for-byte goal:

windows-386 and windows-amd64 have a time stamp
at bytes 137 and 138 of the header.

darwin-386 and plan9-386 have five or six modified
bytes in the middle of the Go symbol table, caused by
editing comments in runtime/sys_{darwin,plan9}_386.s.

Fixes #7164.

LGTM=iant
R=iant, aram, minux.ma, dave
CC=golang-codereviews
https://golang.org/cl/87920043
2014-04-15 13:45:39 -04:00
Jay Weisskopf
86c976ffd0 runtime: use monotonic clock for timers (linux/386, linux/amd64)
This lays the groundwork for making Go robust when the system's
calendar time jumps around. All input values to the runtimeTimer
struct now use the runtime clock as a common reference point.
This affects net.Conn.Set[Read|Write]Deadline(), time.Sleep(),
time.Timer, etc. Under normal conditions, behavior is unchanged.

Each platform and architecture's implementation of runtime·nanotime()
should be modified to use a monotonic system clock when possible.

Platforms/architectures modified and tested with monotonic clock:
  linux/x86     - clock_gettime(CLOCK_MONOTONIC)

Update #6007

LGTM=dvyukov, rsc
R=golang-codereviews, dvyukov, alex.brainman, stephen.gutekanst, dave, rsc, mikioh.mikioh
CC=golang-codereviews
https://golang.org/cl/53010043
2014-02-24 10:57:46 -05:00
Keith Randall
0273dc131e runtime: convert .s textflags from numbers to symbolic constants.
Remove NOPROF/DUPOK from everything.

Edits done with a script, except pclinetest.asm which depended
on the DUPOK flag on main().

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/12613044
2013-08-07 12:20:05 -07:00
Shenghou Ma
2f1ead7095 runtime: correctly handle signals received on foreign threads
Fixes #3250.

R=rsc
CC=golang-dev
https://golang.org/cl/10757044
2013-07-12 04:39:39 +08:00
Russ Cox
5146a93e72 runtime: accept GOTRACEBACK=crash to mean 'crash after panic'
This provides a way to generate core dumps when people need them.
The settings are:

        GOTRACEBACK=0  no traceback on panic, just exit
        GOTRACEBACK=1  default - traceback on panic, then exit
        GOTRACEBACK=2  traceback including runtime frames on panic, then exit
        GOTRACEBACK=crash traceback including runtime frames on panic, then crash

Fixes #3257.

R=golang-dev, devon.odell, r, daniel.morsing, ality
CC=golang-dev
https://golang.org/cl/7666044
2013-03-15 01:11:03 -04:00
Dmitriy Vyukov
49e0300854 runtime: integrated network poller for linux
vs tip:
BenchmarkTCP4OneShot                    172994        40485  -76.60%
BenchmarkTCP4OneShot-2                   96581        30028  -68.91%
BenchmarkTCP4OneShot-4                   52615        18454  -64.93%
BenchmarkTCP4OneShot-8                   26351        12289  -53.36%
BenchmarkTCP4OneShot-16                  12258        16093  +31.29%
BenchmarkTCP4OneShot-32                  13200        17045  +29.13%

BenchmarkTCP4OneShotTimeout             124814        42932  -65.60%
BenchmarkTCP4OneShotTimeout-2            99090        29040  -70.69%
BenchmarkTCP4OneShotTimeout-4            51860        18455  -64.41%
BenchmarkTCP4OneShotTimeout-8            26100        12073  -53.74%
BenchmarkTCP4OneShotTimeout-16           12198        16654  +36.53%
BenchmarkTCP4OneShotTimeout-32           13438        17143  +27.57%

BenchmarkTCP4Persistent                 115647         7782  -93.27%
BenchmarkTCP4Persistent-2                58024         4808  -91.71%
BenchmarkTCP4Persistent-4                24715         3674  -85.13%
BenchmarkTCP4Persistent-8                16431         2407  -85.35%
BenchmarkTCP4Persistent-16                2336         1875  -19.73%
BenchmarkTCP4Persistent-32                1689         1637   -3.08%

BenchmarkTCP4PersistentTimeout           79754         7859  -90.15%
BenchmarkTCP4PersistentTimeout-2         57708         5952  -89.69%
BenchmarkTCP4PersistentTimeout-4         26907         3823  -85.79%
BenchmarkTCP4PersistentTimeout-8         15036         2567  -82.93%
BenchmarkTCP4PersistentTimeout-16         2507         1903  -24.09%
BenchmarkTCP4PersistentTimeout-32         1717         1627   -5.24%

vs old scheduler:
benchmark                           old ns/op    new ns/op    delta
BenchmarkTCPOneShot                    192244        40485  -78.94%
BenchmarkTCPOneShot-2                   63835        30028  -52.96%
BenchmarkTCPOneShot-4                   35443        18454  -47.93%
BenchmarkTCPOneShot-8                   22140        12289  -44.49%
BenchmarkTCPOneShot-16                  16930        16093   -4.94%
BenchmarkTCPOneShot-32                  16719        17045   +1.95%

BenchmarkTCPOneShotTimeout             190495        42932  -77.46%
BenchmarkTCPOneShotTimeout-2            64828        29040  -55.20%
BenchmarkTCPOneShotTimeout-4            34591        18455  -46.65%
BenchmarkTCPOneShotTimeout-8            21989        12073  -45.10%
BenchmarkTCPOneShotTimeout-16           16848        16654   -1.15%
BenchmarkTCPOneShotTimeout-32           16796        17143   +2.07%

BenchmarkTCPPersistent                  81670         7782  -90.47%
BenchmarkTCPPersistent-2                26598         4808  -81.92%
BenchmarkTCPPersistent-4                15633         3674  -76.50%
BenchmarkTCPPersistent-8                18093         2407  -86.70%
BenchmarkTCPPersistent-16               17472         1875  -89.27%
BenchmarkTCPPersistent-32                7679         1637  -78.68%

BenchmarkTCPPersistentTimeout           83186         7859  -90.55%
BenchmarkTCPPersistentTimeout-2         26883         5952  -77.86%
BenchmarkTCPPersistentTimeout-4         15776         3823  -75.77%
BenchmarkTCPPersistentTimeout-8         18180         2567  -85.88%
BenchmarkTCPPersistentTimeout-16        17454         1903  -89.10%
BenchmarkTCPPersistentTimeout-32         7798         1627  -79.14%

R=golang-dev, iant, bradfitz, dave, rsc
CC=golang-dev
https://golang.org/cl/7579044
2013-03-14 19:06:35 +04:00
Russ Cox
295a4d8e64 runtime: ignore failure from madvise
When we release memory to the OS, if the OS doesn't want us
to release it (for example, because the program executed
mlockall(MCL_FUTURE)), madvise will fail. Ignore the failure
instead of crashing.

Fixes #3435.

R=ken2
CC=golang-dev
https://golang.org/cl/6998052
2012-12-22 15:06:28 -05:00
Jingcheng Zhang
70e967b7bc runtime: use "mp" and "gp" instead of "m" and "g" for local variable name to avoid confusion with the global "m" and "g".
R=golang-dev, minux.ma, rsc
CC=bradfitz, golang-dev
https://golang.org/cl/6939064
2012-12-19 00:30:29 +08:00
Shenghou Ma
7777bac6e4 runtime: use clock_gettime to get ns resolution for time.now & runtime.nanotime
For Linux/{386,arm}, FreeBSD/{386,amd64,arm}, NetBSD/{386,amd64}, OpenBSD/{386,amd64}.
Note: our Darwin implementation already has ns resolution.

Linux/386 (Core i7-2600 @ 3.40GHz, kernel 3.5.2-gentoo)
benchmark       old ns/op    new ns/op    delta
BenchmarkNow          110          118   +7.27%

Linux/ARM (ARM Cortex-A8 @ 800MHz, kernel 2.6.32.28 android)
benchmark       old ns/op    new ns/op    delta
BenchmarkNow          625          542  -13.28%

Linux/ARM (ARM Cortex-A9 @ 1GHz, Pandaboard)
benchmark       old ns/op    new ns/op    delta
BenchmarkNow          992          909   -8.37%

FreeBSD 9-REL-p1/amd64 (Dell R610 Server with Xeon X5650 @ 2.67GHz)
benchmark       old ns/op    new ns/op    delta
BenchmarkNow          699          695   -0.57%

FreeBSD 9-REL-p1/amd64 (Atom D525 @ 1.80GHz)
benchmark       old ns/op    new ns/op    delta
BenchmarkNow         1553         1658   +6.76%

OpenBSD/amd64 (Dell E6410 with i5 CPU M 540 @ 2.53GHz)
benchmark       old ns/op    new ns/op    delta
BenchmarkNow         1262         1236   -2.06%

OpenBSD/i386 (Asus eeePC 701 with Intel Celeron M 900MHz - locked to 631MHz)
benchmark       old ns/op    new ns/op    delta
BenchmarkNow         5089         5043   -0.90%

NetBSD/i386 (VMware VM with Core i5 CPU @ 2.7GHz)
benchmark       old ns/op    new ns/op    delta
BenchmarkNow          277          278   +0.36%

NetBSD/amd64 (VMware VM with Core i5 CPU @ 2.7Ghz)
benchmark       old ns/op    new ns/op    delta
BenchmarkNow          103          105   +1.94%

Thanks Maxim Khitrov, Joel Sing, and Dave Cheney for providing benchmark data.

R=jsing, dave, rsc
CC=golang-dev
https://golang.org/cl/6820120
2012-12-18 22:57:25 +08:00
Alan Donovan
532dee3842 runtime: discard SIGPROF delivered to non-Go threads.
Signal handlers are global resources but many language
environments (Go, C++ at Google, etc) assume they have sole
ownership of a particular handler.  Signal handlers in
mixed-language applications must therefore be robust against
unexpected delivery of certain signals, such as SIGPROF.

The default Go signal handler runtime·sigtramp assumes that it
will never be called on a non-Go thread, but this assumption
is violated by when linking in C++ code that spawns threads.
Specifically, the handler asserts the thread has an associated
"m" (Go scheduler).

This CL is a very simple workaround: discard SIGPROF delivered to non-Go threads.  runtime.badsignal(int32) now receives the signal number; if it returns without panicking (e.g. sig==SIGPROF) the signal is discarded.

I don't think there is any really satisfactory solution to the
problem of signal-based profiling in a mixed-language
application.  It's not only the issue of handler clobbering,
but also that a C++ SIGPROF handler called in a Go thread
can't unwind the Go stack (and vice versa).  The best we can
hope for is not crashing.

Note:
- I've ported this to all POSIX platforms, except ARM-linux which already ignores unexpected signals on m-less threads.
- I've avoided tail-calling runtime.badsignal because AFAICT the 6a/6l don't support it.
- I've avoided hoisting 'push sig' (common to both function calls) because it makes the code harder to read.
- Fixed an (apparently incorrect?) docstring.

R=iant, rsc, minux.ma
CC=golang-dev
https://golang.org/cl/6498057
2012-09-04 14:40:49 -04:00
Shenghou Ma
4f308edc86 runtime: use sched_getaffinity for runtime.NumCPU() on Linux
Fixes #3921.

R=iant
CC=golang-dev
https://golang.org/cl/6448132
2012-08-10 10:05:26 +08:00
Russ Cox
b23691148f runtime: print error on receipt of signal on non-Go thread
It's the best we can do before Go 1.

For issue 3250; not a fix but at least less mysterious.

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/5797068
2012-03-12 15:55:18 -04:00
Russ Cox
102274a30e runtime: size arena to fit in virtual address space limit
For Brad.
Now FreeBSD/386 binaries run on nearlyfreespeech.net.

Fixes #2302.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/5700060
2012-02-24 15:28:51 -05:00
Russ Cox
240b1d5b44 runtime: linux signal masking
Fixes #3101 (Linux).

R=golang-dev, bradfitz, minux.ma
CC=golang-dev
https://golang.org/cl/5696043
2012-02-23 14:43:58 -05:00
Russ Cox
35586f718c os/signal: selective signal handling
Restore package os/signal, with new API:
Notify replaces Incoming, allowing clients
to ask for certain signals only.  Also, signals
go to everyone who asks, not just one client.

This could plausibly move into package os now
that there are no magic side effects as a result
of the import.

Update runtime for new API: move common Unix
signal handling code into signal_unix.c.
(It's so easy to do this now that we don't have
to edit Makefiles!)

Tested on darwin,linux 386,amd64.

Fixes #1266.

R=r, dsymonds, bradfitz, iant, borman
CC=golang-dev
https://golang.org/cl/3749041
2012-02-13 13:52:37 -05:00
Russ Cox
55889409f8 runtime: separate out auto-generated files, take 2
This is like the ill-fated CL 5493063 except that
I have written a shell script (autogen.sh) instead of
thinking I could possibly write a correct Makefile.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/5496075
2011-12-19 15:51:13 -05:00
Russ Cox
86dcc431e9 runtime: hg revert -r 6ec0a5c12d75
That was the last build that was close to working.
I will try that change again next week.
Make is being very subtle today.

At the reverted-to CL, the ARM traceback appears
to be broken.  I'll look into that next week too.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/5492063
2011-12-16 18:50:40 -05:00
Russ Cox
bd9243da22 runtime: separate out auto-generated files
R=golang-dev, r, r
CC=golang-dev
https://golang.org/cl/5493063
2011-12-16 17:04:32 -05:00
Russ Cox
851f30136d runtime: make more build-friendly
Collapse the arch,os-specific directories into the main directory
by renaming xxx/foo.c to foo_xxx.c, and so on.

There are no substantial edits here, except to the Makefile.
The assumption is that the Go tool will #define GOOS_darwin
and GOARCH_amd64 and will make any file named something
like signals_darwin.h available as signals_GOOS.h during the
build.  This replaces what used to be done with -I$(GOOS).

There is still work to be done to make runtime build with
standard tools, but this is a big step.  After this we will have
to write a script to generate all the generated files so they
can be checked in (instead of generated during the build).

R=r, iant, r, lucio.dere
CC=golang-dev
https://golang.org/cl/5490053
2011-12-16 15:33:58 -05:00