1
0
mirror of https://github.com/golang/go synced 2024-10-04 16:21:22 -06:00
Commit Graph

285 Commits

Author SHA1 Message Date
Shenghou Ma
d69d0fe92b liblink, cmd/dist, cmd/5l: introduce %^ and move C_* constants.
The helps certain diagnostics and also removed duplicated enums as a side effect.

LGTM=dave, rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/115060044
2014-08-06 00:31:22 -04:00
Shenghou Ma
45bd2f8c45 cmd/5l, cmd/6l, cmd/8l, cmd/ld: remove unused code, consolidate enums
LGTM=rsc
R=rsc, iant
CC=golang-codereviews
https://golang.org/cl/120220043
2014-08-06 00:25:52 -04:00
Shenghou Ma
61864c09fc cmd/5l: remove unused noop.c
LGTM=dave
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/116330043
2014-07-26 17:44:47 -04:00
Shenghou Ma
595dcef80a cmd/5l, cmd/6l, cmd/8l: remove mkenam.
Unused. cmd/dist will generate enams as liblink/anames[568].c.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/119940043
2014-07-26 17:43:10 -04:00
Shenghou Ma
783bcba84d cmd/5c, cmd/5g, cmd/5l, liblink: nacl/arm support
LGTM=dave, rsc
R=rsc, iant, dave
CC=golang-codereviews
https://golang.org/cl/108360043
2014-07-10 15:14:37 -04:00
Russ Cox
ebce79446d build: annotations and modifications for c2go
The main changes fall into a few patterns:

1. Replace #define with enum.

2. Add /*c2go */ comment giving effect of #define.
This is necessary for function-like #defines and
non-enum-able #defined constants.
(Not all compilers handle negative or large enums.)

3. Add extra braces in struct initializer.
(c2go does not implement the full rules.)

This is enough to let c2go typecheck the source tree.
There may be more changes once it is doing
other semantic analyses.

LGTM=minux, iant
R=minux, dave, iant
CC=golang-codereviews
https://golang.org/cl/106860045
2014-07-02 15:41:29 -04:00
Keith Randall
51b72d94de runtime: use duff zero and copy to initialize memory
benchmark                 old ns/op     new ns/op     delta
BenchmarkCopyFat512       1307          329           -74.83%
BenchmarkCopyFat256       666           169           -74.62%
BenchmarkCopyFat1024      2617          671           -74.36%
BenchmarkCopyFat128       343           89.0          -74.05%
BenchmarkCopyFat64        182           48.9          -73.13%
BenchmarkCopyFat32        103           28.8          -72.04%
BenchmarkClearFat128      102           46.6          -54.31%
BenchmarkClearFat512      344           167           -51.45%
BenchmarkClearFat64       50.5          26.5          -47.52%
BenchmarkClearFat256      147           87.2          -40.68%
BenchmarkClearFat32       22.7          16.4          -27.75%
BenchmarkClearFat1024     511           662           +29.55%

Fixes #7624

LGTM=rsc
R=golang-codereviews, khr, bradfitz, josharian, dave, rsc
CC=golang-codereviews
https://golang.org/cl/92760044
2014-05-07 13:17:10 -07:00
Shenghou Ma
d31d19765b runtime, cmd/ld, cmd/5l, run.bash: enable external linking on FreeBSD/ARM.
Update #7331

LGTM=dave, iant
R=golang-codereviews, dave, gobot, iant
CC=golang-codereviews
https://golang.org/cl/89520043
2014-04-21 00:08:59 -04:00
Russ Cox
5e8c922625 liblink, cmd/ld: reenable nosplit checking and test
The new code is adapted from the Go 1.2 nosplit code,
but it does not have the bug reported in issue 7623:

g% go run nosplit.go
g% go1.2 run nosplit.go
BUG
rejected incorrectly:
        main 0 call f; f 120

        linker output:
        # _/tmp/go-test-nosplit021064539
        main.main: nosplit stack overflow
                120	guaranteed after split check in main.main
                112	on entry to main.f
                -8	after main.f uses 120

g%

Fixes #6931.
Fixes #7623.

LGTM=iant
R=golang-codereviews, iant, ality
CC=golang-codereviews, r
https://golang.org/cl/88190043
2014-04-16 22:08:00 -04:00
Russ Cox
8d39e55c65 liblink: remove arch-specific constants from file format
The relocation and automatic variable types were using
arch-specific numbers. Introduce portable enumerations
instead.

To the best of my knowledge, these are the only arch-specific
bits left in the new object file format.

Remove now, before Go 1.3, because file formats are forever.

LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/87670044
2014-04-14 15:54:20 -04:00
Russ Cox
b700cb4974 cmd/gc: shorten temporary lifetimes when possible
The new channel and map runtime routines take pointers
to values, typically temporaries. Without help, the compiler
cannot tell when those temporaries stop being needed,
because it isn't sure what happened to the pointer.
Arrange to insert explicit VARKILL instructions for these
temporaries so that the liveness analysis can avoid seeing
them as "ambiguously live".

The change is made in order.c, which was already in charge of
introducing temporaries to preserve the order-of-evaluation
guarantees. Now its job has expanded to include introducing
temporaries as needed by runtime routines, and then also
inserting the VARKILL annotations for all these temporaries,
so that their lifetimes can be shortened.

In order to do its job for the map runtime routines, order.c arranges
that all map lookups or map assignments have the form:

        x = m[k]
        x, y = m[k]
        m[k] = x

where x, y, and k are simple variables (often temporaries).
Likewise, receiving from a channel is now always:

        x = <-c

In order to provide the map guarantee, order.c is responsible for
rewriting x op= y into x = x op y, so that m[k] += z becomes

        t = m[k]
        t2 = t + z
        m[k] = t2

While here, fix a few bugs in order.c's traversal: it was failing to
walk into select and switch case bodies, so order of evaluation
guarantees were not preserved in those situations.
Added tests to test/reorder2.go.

Fixes #7671.

In gc/popt's temporary-merging optimization, allow merging
of temporaries with their address taken as long as the liveness
ranges do not intersect. (There is a good chance of that now
that we have VARKILL annotations to limit the liveness range.)

Explicitly killing temporaries cuts the number of ambiguously
live temporaries that must be zeroed in the godoc binary from
860 to 711, or -17%. There is more work to be done, but this
is a good checkpoint.

Update #7345

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/81940043
2014-04-01 13:31:38 -04:00
Russ Cox
d9c6ae6ae8 all: final merge of NaCl tree
This CL replays the following one CL from the rsc-go13nacl repo.
This is the last replay CL: after this CL the main repo will have
everything the rsc-go13nacl repo did. Changes made to the main
repo after the rsc-go13nacl repo branched off probably mean that
NaCl doesn't actually work after this CL, but all the code is now moved
over and just needs to be redebugged.

---
cmd/6l, cmd/8l, cmd/ld: support for Native Client

See golang.org/s/go13nacl for design overview.

This CL is publicly visible but not CC'ed to golang-dev,
to avoid distracting from the preparation of the Go 1.2
release.

This CL and the others will be checked into my rsc-go13nacl
clone repo for now, and I will send CLs against the main
repo early in the Go 1.3 development.

R≡khr
https://golang.org/cl/15750044
---

LGTM=bradfitz, dave, iant
R=dave, bradfitz, iant
CC=golang-codereviews
https://golang.org/cl/69040044
2014-02-27 20:37:00 -05:00
Russ Cox
801e40a0a4 cmd/gc: rename AFATVARDEF to AVARDEF
The "fat" referred to being used for multiword values only.
We're going to use it for non-fat values sometimes too.

No change other than the renaming.

TBR=iant
CC=golang-codereviews
https://golang.org/cl/63650043
2014-02-13 22:17:22 -05:00
Anthony Martin
2cae0591cd cmd/cc, cmd/gc, cmd/ld: consolidate print format routines
We now use the %A, %D, %P, and %R routines from liblink
across the board.

Fixes #7178.
Fixes #7055.

LGTM=iant
R=golang-codereviews, gobot, rsc, dave, iant, remyoudompheng
CC=golang-codereviews
https://golang.org/cl/49170043
2014-02-12 14:29:11 -05:00
Shenghou Ma
6ebf59b953 include, linlink, cmd/6l, cmd/ld: part 1 of solaris/amd64 linker changes.
rsc suggested that we split the whole linker changes into three parts.
This is the first one, mostly dealing with adding Hsolaris.

LGTM=iant
R=golang-codereviews, iant, dave
CC=golang-codereviews
https://golang.org/cl/54210050
2014-02-09 16:45:38 -05:00
Elias Naur
9ed5995cfe liblink, cmd/5l: restore flag_shared
CL 56120043 fixed and cleaned up TLS on ARM after introducing liblink, but
left flag_shared broken. This CL restores the (unsupported) flag_shared
behaviour by simply rewriting access to $runtime.tlsgm(SB) with
runtime.tlsgm(SB), to compensate for the extra indirection when going from
the R_ARM_TLS_LE32 relocation to the R_ARM_TLS_IE32 relocation.

Also, remove unnecessary symbol lookup left after 56120043.

LGTM=iant
R=iant, rsc
CC=golang-codereviews
https://golang.org/cl/57000043
2014-02-03 14:49:57 -08:00
Elias Naur
1dd4da1459 liblink, cmd/5a, cmd/5l: restore cgo on older ARM processors
CL 56120043 fixed TLS handling on ARM after the introduction of
liblink but left older ARM processors broken.

Before liblink, the MRC instruction was replaced with a fallback
on older ARMs. CL 56120043 removed that, because the rewrite matched
bit patterns on the AWORD pseudo-instruction and could therefore change
unrelated AWORDs that happened to match.

This CL adds an AMRC instruction to encode both MRC and MCR previously
encoded as AWORDs. Then, in liblink, the AMRC instructions are either
rewritten to AWORD, or, on goarm < 7, replaced with a branch to the
fallback.

./all.bash completes successfully on an ARMv7 with either GOARM=7 or
GOARM=5. I have verified that the fallback is indeed present in both
runtime.save_gm and runtime.load_gm when GOARM=5 but not when GOARM=7.

If all goes well, this should fix the armv5 builders.

LGTM=iant
R=iant, rsc
CC=golang-codereviews
https://golang.org/cl/55540044
2014-02-03 14:07:54 -08:00
Russ Cox
a9f6db58ce cmd/ld: move instruction selection + layout into compilers, assemblers
- new object file reader/writer (liblink/objfile.c)
- remove old object file writing routines
- add pcdata iterator
- remove all trace of "line number stack" and "path fragments" from
  object files, linker (!!!)
- dwarf now writes a single "compilation unit" instead of one per package

This CL disables the check for chains of no-split functions that
could overflow the stack red zone. A future CL will attack the problem
of reenabling that check (issue 6931).

This CL is just the liblink and cmd/ld changes.
There are minor associated adjustments in CL 37030045.
Each depends on the other.

R=golang-dev, dave, iant
CC=golang-dev
https://golang.org/cl/39680043
2013-12-16 12:51:58 -05:00
Russ Cox
4230044bb8 runtime: remove non-extern decls of runtime.goarm
The linker is in charge of providing the one true declaration.

R=golang-dev, dave, r
CC=golang-dev
https://golang.org/cl/39560043
2013-12-09 19:35:07 -05:00
Russ Cox
7d507dc6e6 liblink: create new library based on linker code
There is an enormous amount of code moving around in this CL,
but the code is the same, and it is invoked in the same ways.
This CL is preparation for the new linker structure, not the new
structure itself.

The new library's definition is in include/link.h.

The main change is the use of a Link structure to hold all the
linker-relevant state, replacing the smattering of global variables.
The Link structure should both make it clearer which state must
be carried around and make it possible to parallelize more easily
later.

The main body of the linker has moved into the architecture-independent
cmd/ld directory. That includes the list of known header types, so the
distinction between Hplan9x32 and Hplan9x64 is removed (no other
header type distinguished 32- and 64-bit formats), and code for unused
formats such as ipaq kernels has been deleted.

The code being deleted from 5l, 6l, and 8l reappears in liblink or in ld.
Because multiple files are being merged in the liblink directory,
it is not possible to show the diffs nicely in hg.

The Prog and Addr structures have been unified into an
architecture-independent form and moved to link.h, where they will
be shared by all tools: the assemblers, the compilers, and the linkers.
The unification makes it possible to write architecture-independent
traversal of Prog lists, among other benefits.

The Sym structures cannot be unified: they are too fundamentally
different between the linker and the compilers. Instead, liblink defines
an LSym - a linker Sym - to be used in the Prog and Addr structures,
and the linker now refers exclusively to LSyms. The compilers will
keep using their own syms but will fill out the corresponding LSyms in
the Prog and Addr structures.

Although code from 5l, 6l, and 8l is now in a single library, the
code has been arranged so that only one architecture needs to
be linked into a particular program: 5l will not contain the code
needed for x86 instruction layout, for example.

The object file writing code in liblink/obj.c is from cmd/gc/obj.c.

Preparation for golang.org/s/go13linker work.

This CL does not build by itself. It depends on 35740044
and will be submitted at the same time.

R=iant
CC=golang-dev
https://golang.org/cl/35790044
2013-12-08 22:49:37 -05:00
Carl Shapiro
f056daf075 cmd/5g, cmd/5l, cmd/6g, cmd/6l, cmd/8g, cmd/8l, cmd/gc, runtime: generate pointer maps by liveness analysis
This change allows the garbage collector to examine stack
slots that are determined as live and containing a pointer
value by the garbage collector.  This results in a mean
reduction of 65% in the number of stack slots scanned during
an invocation of "GOGC=1 all.bash".

Unfortunately, this does not yet allow garbage collection to
be precise for the stack slots computed as live.  Pointers
confound the determination of what definitions reach a given
instruction.  In general, this problem is not solvable without
runtime cost but some advanced cooperation from the compiler
might mitigate common cases.

R=golang-dev, rsc, cshapiro
CC=golang-dev
https://golang.org/cl/14430048
2013-12-05 17:35:22 -08:00
Russ Cox
2c98a3bc2e cmd/5l, runtime: fix divide for profiling tracebacks on ARM
Two bugs:
1. The first iteration of the traceback always uses LR when provided,
which it is (only) during a profiling signal, but in fact LR is correct
only if the stack frame has not been allocated yet. Otherwise an
intervening call may have changed LR, and the saved copy in the stack
frame should be used. Fix in traceback_arm.c.

2. The division runtime call adds 8 bytes to the stack. In order to
keep the traceback routines happy, it must copy the saved LR into
the new 0(SP). Change

        SUB $8, SP

into

        MOVW    0(SP), R11 // r11 is temporary, for use by linker
        MOVW.W  R11, -8(SP)

to update SP and 0(SP) atomically, so that the traceback always
sees a saved LR at 0(SP).

Fixes #6681.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/19910044
2013-10-31 18:15:55 +00:00
Russ Cox
b88148b9a0 undo CL 19810043 / 352f3b7c9664
The CL causes misc/cgo/test to fail randomly.
I suspect that the problem is the use of a division instruction
in usleep, which can be called while trying to acquire an m
and therefore cannot store the denominator in m.
The solution to that would be to rewrite the code to use a
magic multiply instead of a divide, but now we're getting
pretty far off the original code.

Go back to the original in preparation for a different,
less efficient but simpler fix.

««« original CL description
cmd/5l, runtime: make ARM integer division profiler-friendly

The implementation of division constructed non-standard
stack frames that could not be handled by the traceback
routines.

CL 13239052 left the frames non-standard but fixed them
for the specific case of a divide-by-zero panic.
A profiling signal can arrive at any time, so that fix
is not sufficient.

Change the division to store the extra argument in the M struct
instead of in a new stack slot. That keeps the frames bog standard
at all times.

Also fix a related bug in the traceback code: when starting
a traceback, the LR register should be ignored if the current
function has already allocated its stack frame and saved the
original LR on the stack. The stack copy should be used, as the
LR register may have been modified.

Combined, these make the torture test from issue 6681 pass.

Fixes #6681.

R=golang-dev, r, josharian
CC=golang-dev
https://golang.org/cl/19810043
»»»

TBR=r
CC=golang-dev
https://golang.org/cl/20350043
2013-10-31 17:18:57 +00:00
Russ Cox
b0db472ea2 cmd/5l, runtime: make ARM integer division profiler-friendly
The implementation of division constructed non-standard
stack frames that could not be handled by the traceback
routines.

CL 13239052 left the frames non-standard but fixed them
for the specific case of a divide-by-zero panic.
A profiling signal can arrive at any time, so that fix
is not sufficient.

Change the division to store the extra argument in the M struct
instead of in a new stack slot. That keeps the frames bog standard
at all times.

Also fix a related bug in the traceback code: when starting
a traceback, the LR register should be ignored if the current
function has already allocated its stack frame and saved the
original LR on the stack. The stack copy should be used, as the
LR register may have been modified.

Combined, these make the torture test from issue 6681 pass.

Fixes #6681.

R=golang-dev, r, josharian
CC=golang-dev
https://golang.org/cl/19810043
2013-10-30 18:50:34 +00:00
Dave Day
0a033a18ad cmd/gc: support -installsuffix in the compiler and builder
Add the -installsuffix flag to gc and {5,6,8}l, which overrides -race
for the suffix if both are supplied.
Pass this flag from the go tool for build and install.

R=rsc
CC=golang-dev
https://golang.org/cl/14246044
2013-10-03 13:48:47 +10:00
Russ Cox
3acddba2ec cmd/5l: fix handling of RET.EQ in wrapper function
Keith is too clever for me.

R=ken2
CC=golang-dev, khr
https://golang.org/cl/13272050
2013-09-13 03:50:50 +00:00
Russ Cox
7276c02b41 runtime, cmd/gc, cmd/ld: ignore method wrappers in recover
Bug #1:

Issue 5406 identified an interesting case:
        defer iface.M()
may end up calling a wrapper that copies an indirect receiver
from the iface value and then calls the real M method. That's
two calls down, not just one, and so recover() == nil always
in the real M method, even during a panic.

[For the purposes of this entire discussion, a wrapper's
implementation is a function containing an ordinary call, not
the optimized tail call form that is somtimes possible. The
tail call does not create a second frame, so it is already
handled correctly.]

Fix this bug by introducing g->panicwrap, which counts the
number of bytes on current stack segment that are due to
wrapper calls that should not count against the recover
check. All wrapper functions must now adjust g->panicwrap up
on entry and back down on exit. This adds slightly to their
expense; on the x86 it is a single instruction at entry and
exit; on the ARM it is three. However, the alternative is to
make a call to recover depend on being able to walk the stack,
which I very much want to avoid. We have enough problems
walking the stack for garbage collection and profiling.
Also, if performance is critical in a specific case, it is already
faster to use a pointer receiver and avoid this kind of wrapper
entirely.

Bug #2:

The old code, which did not consider the possibility of two
calls, already contained a check to see if the call had split
its stack and so the panic-created segment was one behind the
current segment. In the wrapper case, both of the two calls
might split their stacks, so the panic-created segment can be
two behind the current segment.

Fix this by propagating the Stktop.panic flag forward during
stack splits instead of looking backward during recover.

Fixes #5406.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/13367052
2013-09-12 14:00:16 -04:00
Russ Cox
1a6576db34 cmd/5l, cmd/6l, cmd/8l: refactor stack split code
Pull the stack split generation into its own function.
This will make an upcoming change to fix recover
easier to digest.

R=ken2
CC=golang-dev
https://golang.org/cl/13611044
2013-09-11 20:29:45 -04:00
Joel Sing
d0206101c8 cmd/5l,cmd/6l,cmd/8l: fix dragonflydynld path
R=golang-dev, bradfitz, dave
CC=golang-dev
https://golang.org/cl/13225043
2013-08-31 22:02:21 +10:00
Dmitriy Vyukov
79dca0327e libbio, all cmd: consistently use BGETC/BPUTC instead of Bgetc/Bputc
Also introduce BGET2/4, BPUT2/4 as they are widely used.
Slightly improve BGETC/BPUTC implementation.
This gives ~5% CPU time improvement on go install -a -p1 std.
Before:
real		user		sys
0m23.561s	0m16.625s	0m5.848s
0m23.766s	0m16.624s	0m5.846s
0m23.742s	0m16.621s	0m5.868s
after:
0m22.999s	0m15.841s	0m5.889s
0m22.845s	0m15.808s	0m5.850s
0m22.889s	0m15.832s	0m5.848s

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/12745047
2013-08-30 15:46:12 +04:00
Joel Sing
e18ed3c111 cmd/5l,cmd/8l: unbreak arm and 386 linkers
Add dragonflydynld to 5l and 8l so that they compile again.

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/12739048
2013-08-24 01:32:01 +10:00
Russ Cox
999a36f9af cmd/gc: &x panics if x does
See golang.org/s/go12nil.

This CL is about getting all the right checks inserted.
A followup CL will add an optimization pass to
remove redundant checks.

R=ken2
CC=golang-dev
https://golang.org/cl/12970043
2013-08-15 14:38:32 -04:00
Elias Naur
45233734e2 runtime.cmd/ld: Add ARM external linking and implement -shared in terms of external linking
This CL is an aggregate of 10271047, 10499043, 9733044. Descriptions of each follow:

10499043
runtime,cmd/ld: Merge TLS symbols and teach 5l about ARM TLS

This CL prepares for external linking support to ARM.

The pseudo-symbols runtime.g and runtime.m are merged into a single
runtime.tlsgm symbol. When external linking, the offset of a thread local
variable is stored at a memory location instead of being embedded into a offset
of a ldr instruction. With a single runtime.tlsgm symbol for both g and m, only
one such offset is needed.

The larger part of this CL moves TLS code from gcc compiled to internally
compiled. The TLS code now uses the modern MRC instruction, and 5l is taught
about TLS fallbacks in case the instruction is not available or appropriate.

10271047
This CL adds support for -linkmode external to 5l.

For 5l itself, use addrel to allow for D_CALL relocations to be handled by the
host linker. Of the cases listed in rsc's comment in issue 4069, only case 5 and
63 needed an update. One of the TODO: addrel cases was since replaced, and the
rest of the cases are either covered by indirection through addpool (cases with
LTO or LFROM flags) or stubs (case 74). The addpool cases are covered because
addpool emits AWORD instructions, which in turn are handled by case 11.

In the runtime, change the argv argument in the rt0* functions slightly to be a
pointer to the argv list, instead of relying on a particular location of argv.

9733044
The -shared flag to 6l outputs a shared library, implemented in Go
and callable from non-Go programs such as C.

The main part of this CL change the thread local storage model.
Go uses the fastest and least general mode, local exec. TLS data in shared
libraries normally requires at least the local dynamic mode, however, this CL
instead opts for using the initial exec mode. Initial exec mode is faster than
local dynamic mode and can be used in linux since the linker has reserved a
limited amount of TLS space for performance sensitive TLS code.

Initial exec mode requires an extra load from the GOT table to determine the
TLS offset. This penalty will not be paid if ld is not in -shared mode, since
TLS accesses will be reduced to local exec.

The elf sections .init_array and .rela.init_array are added to register the Go
runtime entry with cgo at library load time.

The "hidden" attribute is added to Cgo functions called from Go, since Go
does not generate call through the GOT table, and adding non-GOT relocations for
a global function is not supported by gcc. Cgo symbols don't need to be global
and avoiding the GOT table is also faster.

The changes to 8l are only removes code relevant to the old -shared mode where
internal linking was used.

This CL only address the low level linker work. It can be submitted by itself,
but to be useful, the runtime changes in CL 9738047 is also needed.

Design discussion at
https://groups.google.com/forum/?fromgroups#!topic/golang-nuts/zmjXkGrEx6Q

Fixes #5590.

R=rsc
CC=golang-dev
https://golang.org/cl/12871044
2013-08-14 15:38:54 +00:00
Russ Cox
5636b60b70 cmd/5l: fix encoding of new MOVB, MOVH instructions
They are just like MOVW and should be setting only
two register fields, not three.

R=ken2
CC=golang-dev, remyoudompheng
https://golang.org/cl/12781043
2013-08-12 13:42:04 -04:00
Rémy Oudompheng
357f733695 cmd/5c, cmd/5g, cmd/5l: turn MOVB, MOVH into plain moves, optimize short arithmetic.
Pseudo-instructions MOVBS and MOVHS are used to clarify
the semantics of short integers vs. registers:
 * 8-bit and 16-bit values in registers are assumed to always
   be zero-extended or sign-extended depending on their type.
 * MOVB is truncation or move of an already extended value
   between registers.
 * MOVBU enforces zero-extension at the destination (register).
 * MOVBS enforces sign-extension at the destination (register).
And similarly for MOVH/MOVS/MOVHU.

The linker is adapted to assemble MOVB and MOVH to an ordinary
mov. Also a peephole pass in 5g that aims at eliminating
redundant zero/sign extensions is improved.

encoding/binary:
benchmark                              old ns/op    new ns/op    delta
BenchmarkReadSlice1000Int32s              220387       217185   -1.45%
BenchmarkReadStruct                        12839        12910   +0.55%
BenchmarkReadInts                           5692         5534   -2.78%
BenchmarkWriteInts                          6137         6016   -1.97%
BenchmarkPutUvarint32                        257          241   -6.23%
BenchmarkPutUvarint64                        812          754   -7.14%
benchmark                               old MB/s     new MB/s  speedup
BenchmarkReadSlice1000Int32s               18.15        18.42    1.01x
BenchmarkReadStruct                         5.45         5.42    0.99x
BenchmarkReadInts                           5.27         5.42    1.03x
BenchmarkWriteInts                          4.89         4.99    1.02x
BenchmarkPutUvarint32                      15.56        16.57    1.06x
BenchmarkPutUvarint64                       9.85        10.60    1.08x

crypto/des:
benchmark                              old ns/op    new ns/op    delta
BenchmarkEncrypt                            7002         5169  -26.18%
BenchmarkDecrypt                            7015         5195  -25.94%
benchmark                               old MB/s     new MB/s  speedup
BenchmarkEncrypt                            1.14         1.55    1.36x
BenchmarkDecrypt                            1.14         1.54    1.35x

strconv:
benchmark                              old ns/op    new ns/op    delta
BenchmarkAtof64Decimal                       457          385  -15.75%
BenchmarkAtof64Float                         574          479  -16.55%
BenchmarkAtof64FloatExp                     1035          906  -12.46%
BenchmarkAtof64Big                          1793         1457  -18.74%
BenchmarkAtof64RandomBits                   2267         2066   -8.87%
BenchmarkAtof64RandomFloats                 1416         1194  -15.68%
BenchmarkAtof32Decimal                       451          379  -15.96%
BenchmarkAtof32Float                         547          435  -20.48%
BenchmarkAtof32FloatExp                     1095          986   -9.95%
BenchmarkAtof32Random                       1154         1006  -12.82%
BenchmarkAtoi                               1415         1380   -2.47%
BenchmarkAtoiNeg                            1414         1401   -0.92%
BenchmarkAtoi64                             1744         1671   -4.19%
BenchmarkAtoi64Neg                          1737         1662   -4.32%

Fixes #1837.

R=rsc, dave, bradfitz
CC=golang-dev
https://golang.org/cl/12424043
2013-08-09 06:43:17 +02:00
Rémy Oudompheng
3670337097 cmd/5c, cmd/5g, cmd/5l: introduce MOVBS and MOVHS instructions.
MOVBS and MOVHS are defined as duplicates of MOVB and MOVH,
and perform sign-extension moving.
No change is made to code generation.

Update #1837

R=rsc, bradfitz
CC=golang-dev
https://golang.org/cl/12682043
2013-08-08 23:28:53 +02:00
Keith Randall
5a54696d78 cmd/ld: Put the textflag constants in a separate file.
We can then include this file in assembly to replace
cryptic constants like "7" with meaningful constants
like "(NOPROF|DUPOK|NOSPLIT)".

Converting just pkg/runtime/asm*.s for now.  Dropping NOPROF
and DUPOK from lots of places where they aren't needed.
More .s files to come in a subsequent changelist.

A nonzero number in the textflag field now means
"has not been converted yet".

R=golang-dev, daniel.morsing, rsc, khr
CC=golang-dev
https://golang.org/cl/12568043
2013-08-07 10:23:24 -07:00
Russ Cox
48769bf546 runtime: use funcdata to supply garbage collection information
This CL introduces a FUNCDATA number for runtime-specific
garbage collection metadata, changes the C and Go compilers
to emit that metadata, and changes the runtime to expect it.

The old pseudo-instructions that carried this information
are gone, as is the linker code to process them.

R=golang-dev, dvyukov, cshapiro
CC=golang-dev
https://golang.org/cl/11406044
2013-07-19 16:04:09 -04:00
Russ Cox
c3de91bb15 cmd/ld, runtime: use new contiguous pcln table
R=golang-dev, r, dave
CC=golang-dev
https://golang.org/cl/11494043
2013-07-18 10:43:22 -04:00
Russ Cox
567818224e cmd/5l, cmd/6l, cmd/8l: accept PCDATA instruction in input
The portable code in cmd/ld already knows how to process it,
we just have to ignore it during code generation.

R=ken2
CC=golang-dev
https://golang.org/cl/11363043
2013-07-16 16:23:11 -04:00
Russ Cox
5d363c6357 cmd/ld, runtime: new in-memory symbol table format
Design at http://golang.org/s/go12symtab.

This enables some cleanup of the garbage collector metadata
that will be done in future CLs.

This CL does not move the old symtab and pclntab back into
an unmapped section of the file. That's a bit tricky and will be
done separately.

Fixes #4020.

R=golang-dev, dave, cshapiro, iant, r
CC=golang-dev, nigeltao
https://golang.org/cl/11085043
2013-07-16 09:41:38 -04:00
Russ Cox
031c107cad cmd/ld: fix large stack split for preempt check
If the stack frame size is larger than the known-unmapped region at the
bottom of the address space, then the stack split prologue cannot use the usual
condition:

        SP - size >= stackguard

because SP - size may wrap around to a very large number.
Instead, if the stack frame is large, the prologue tests:

        SP - stackguard >= size

(This ends up being a few instructions more expensive, so we don't do it always.)

Preemption requests register by setting stackguard to a very large value, so
that the first test (SP - size >= stackguard) cannot possibly succeed.
Unfortunately, that same very large value causes a wraparound in the
second test (SP - stackguard >= size), making it succeed incorrectly.

To avoid *that* wraparound, we have to amend the test:

        stackguard != StackPreempt && SP - stackguard >= size

This test is only used for functions with large frames, which essentially
always split the stack, so the cost of the few instructions is noise.

This CL and CL 11085043 together fix the known issues with preemption,
at the beginning of a function, so we will be able to try turning it on again.

R=ken2
CC=golang-dev
https://golang.org/cl/11205043
2013-07-12 12:12:56 -04:00
Russ Cox
d6d83c918c cmd/ld: place read-only data in non-executable segment
R=golang-dev, dave, r
CC=golang-dev, nigeltao
https://golang.org/cl/10713043
2013-07-11 22:52:48 -04:00
Russ Cox
6c99b5c0d3 cmd/5l, cmd/6l, cmd/8l: increase error buffer size
STRINGSZ (200) is fine for lines generated by things like
instruction dumps, but an error containing a couple file
names can easily exceed that, especially on Macs with
the ridiculous default $TMPDIR.

R=ken2
CC=golang-dev
https://golang.org/cl/11199043
2013-07-11 22:49:15 -04:00
Russ Cox
6fa3c89b77 runtime: record proper goroutine state during stack split
Until now, the goroutine state has been scattered during the
execution of newstack and oldstack. It's all there, and those routines
know how to get back to a working goroutine, but other pieces of
the system, like stack traces, do not. If something does interrupt
the newstack or oldstack execution, the rest of the system can't
understand the goroutine. For example, if newstack decides there
is an overflow and calls throw, the stack tracer wouldn't dump the
goroutine correctly.

For newstack to save a useful state snapshot, it needs to be able
to rewind the PC in the function that triggered the split back to
the beginning of the function. (The PC is a few instructions in, just
after the call to morestack.) To make that possible, we change the
prologues to insert a jmp back to the beginning of the function
after the call to morestack. That is, the prologue used to be roughly:

        TEXT myfunc
                check for split
                jmpcond nosplit
                call morestack
        nosplit:
                sub $xxx, sp

Now an extra instruction is inserted after the call:

        TEXT myfunc
        start:
                check for split
                jmpcond nosplit
                call morestack
                jmp start
        nosplit:
                sub $xxx, sp

The jmp is not executed directly. It is decoded and simulated by
runtime.rewindmorestack to discover the beginning of the function,
and then the call to morestack returns directly to the start label
instead of to the jump instruction. So logically the jmp is still
executed, just not by the cpu.

The prologue thus repeats in the case of a function that needs a
stack split, but against the cost of the split itself, the extra few
instructions are noise. The repeated prologue has the nice effect of
making a stack split double-check that the new stack is big enough:
if morestack happens to return on a too-small stack, we'll now notice
before corruption happens.

The ability for newstack to rewind to the beginning of the function
should help preemption too. If newstack decides that it was called
for preemption instead of a stack split, it now has the goroutine state
correctly paused if rescheduling is needed, and when the goroutine
can run again, it can return to the start label on its original stack
and re-execute the split check.

Here is an example of a split stack overflow showing the full
trace, without any special cases in the stack printer.
(This one was triggered by making the split check incorrect.)

runtime: newstack framesize=0x0 argsize=0x18 sp=0x6aebd0 stack=[0x6b0000, 0x6b0fa0]
        morebuf={pc:0x69f5b sp:0x6aebd8 lr:0x0}
        sched={pc:0x68880 sp:0x6aebd0 lr:0x0 ctxt:0x34e700}
runtime: split stack overflow: 0x6aebd0 < 0x6b0000
fatal error: runtime: split stack overflow

goroutine 1 [stack split]:
runtime.mallocgc(0x290, 0x100000000, 0x1)
        /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:21 fp=0x6aebd8
runtime.new()
        /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:682 +0x5b fp=0x6aec08
go/build.(*Context).Import(0x5ae340, 0xc210030c71, 0xa, 0xc2100b4380, 0x1b, ...)
        /Users/rsc/g/go/src/pkg/go/build/build.go:424 +0x3a fp=0x6b00a0
main.loadImport(0xc210030c71, 0xa, 0xc2100b4380, 0x1b, 0xc2100b42c0, ...)
        /Users/rsc/g/go/src/cmd/go/pkg.go:249 +0x371 fp=0x6b01a8
main.(*Package).load(0xc21017c800, 0xc2100b42c0, 0xc2101828c0, 0x0, 0x0, ...)
        /Users/rsc/g/go/src/cmd/go/pkg.go:431 +0x2801 fp=0x6b0c98
main.loadPackage(0x369040, 0x7, 0xc2100b42c0, 0x0)
        /Users/rsc/g/go/src/cmd/go/pkg.go:709 +0x857 fp=0x6b0f80
----- stack segment boundary -----
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc2100e6c00, 0xc2100e5750, ...)
        /Users/rsc/g/go/src/cmd/go/build.go:539 +0x437 fp=0x6b14a0
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc21015b400, 0x2, ...)
        /Users/rsc/g/go/src/cmd/go/build.go:528 +0x1d2 fp=0x6b1658
main.(*builder).test(0xc2100902a0, 0xc210092000, 0x0, 0x0, 0xc21008ff60, ...)
        /Users/rsc/g/go/src/cmd/go/test.go:622 +0x1b53 fp=0x6b1f68
----- stack segment boundary -----
main.runTest(0x5a6b20, 0xc21000a020, 0x2, 0x2)
        /Users/rsc/g/go/src/cmd/go/test.go:366 +0xd09 fp=0x6a5cf0
main.main()
        /Users/rsc/g/go/src/cmd/go/main.go:161 +0x4f9 fp=0x6a5f78
runtime.main()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:183 +0x92 fp=0x6a5fa0
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1266 fp=0x6a5fa8

And here is a seg fault during oldstack:

SIGSEGV: segmentation violation
PC=0x1b2a6

runtime.oldstack()
        /Users/rsc/g/go/src/pkg/runtime/stack.c:159 +0x76
runtime.lessstack()
        /Users/rsc/g/go/src/pkg/runtime/asm_amd64.s:270 +0x22

goroutine 1 [stack unsplit]:
fmt.(*pp).printArg(0x2102e64e0, 0xe5c80, 0x2102c9220, 0x73, 0x0, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:818 +0x3d3 fp=0x221031e6f8
fmt.(*pp).doPrintf(0x2102e64e0, 0x12fb20, 0x2, 0x221031eb98, 0x1, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:1183 +0x15cb fp=0x221031eaf0
fmt.Sprintf(0x12fb20, 0x2, 0x221031eb98, 0x1, 0x1, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:234 +0x67 fp=0x221031eb40
flag.(*stringValue).String(0x2102c9210, 0x1, 0x0)
        /Users/rsc/g/go/src/pkg/flag/flag.go:180 +0xb3 fp=0x221031ebb0
flag.(*FlagSet).Var(0x2102f6000, 0x293d38, 0x2102c9210, 0x143490, 0xa, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:633 +0x40 fp=0x221031eca0
flag.(*FlagSet).StringVar(0x2102f6000, 0x2102c9210, 0x143490, 0xa, 0x12fa60, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:550 +0x91 fp=0x221031ece8
flag.(*FlagSet).String(0x2102f6000, 0x143490, 0xa, 0x12fa60, 0x0, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:563 +0x87 fp=0x221031ed38
flag.String(0x143490, 0xa, 0x12fa60, 0x0, 0x161950, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:570 +0x6b fp=0x221031ed80
testing.init()
        /Users/rsc/g/go/src/pkg/testing/testing.go:-531 +0xbb fp=0x221031edc0
strings_test.init()
        /Users/rsc/g/go/src/pkg/strings/strings_test.go:1115 +0x62 fp=0x221031ef70
main.init()
        strings/_test/_testmain.go:90 +0x3d fp=0x221031ef78
runtime.main()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:180 +0x8a fp=0x221031efa0
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1269 fp=0x221031efa8

goroutine 2 [runnable]:
runtime.MHeap_Scavenger()
        /Users/rsc/g/go/src/pkg/runtime/mheap.c:438
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1269
created by runtime.main
        /Users/rsc/g/go/src/pkg/runtime/proc.c:166

rax     0x23ccc0
rbx     0x23ccc0
rcx     0x0
rdx     0x38
rdi     0x2102c0170
rsi     0x221032cfe0
rbp     0x221032cfa0
rsp     0x7fff5fbff5b0
r8      0x2102c0120
r9      0x221032cfa0
r10     0x221032c000
r11     0x104ce8
r12     0xe5c80
r13     0x1be82baac718
r14     0x13091135f7d69200
r15     0x0
rip     0x1b2a6
rflags  0x10246
cs      0x2b
fs      0x0
gs      0x0

Fixes #5723.

R=r, dvyukov, go.peter.90, dave, iant
CC=golang-dev
https://golang.org/cl/10360048
2013-06-27 11:32:01 -04:00
Russ Cox
1f51d27922 cmd/gc: move genembedtramp into portable code
Requires adding new linker instruction
        RET	f(SB)
meaning return but then immediately call f.
This is what you'd use to implement a tail call after
fiddling with the arguments, but the compiler only
uses it in genwrapper.

This CL eliminates the copy-and-paste genembedtramp
functions from 5g/8g/6g and makes the code run on ARM
for the first time. It removes a small special case for function
generation, which should help Carl a bit, but at the same time
it does not bother to implement general tail call optimization,
which we do not want anyway.

Fixes #5627.

R=ken2
CC=golang-dev
https://golang.org/cl/10057044
2013-06-11 09:41:49 -04:00
Shenghou Ma
35e1deaebf cmd/5l: use BLX for BL (Rx).
Fixes #5111.
Update #4718
This CL makes BL (Rx) to use BLX Rx instead of:
MOV LR, PC
MOV PC, Rx

R=cshapiro, rsc
CC=dave, gobot, golang-dev
https://golang.org/cl/9669045
2013-06-11 03:04:24 +08:00
Shenghou Ma
ed6dce6f9d cmd/5l: use guaranteed undefined instruction for UNDEF to match [68]l.
R=golang-dev, dave, rsc
CC=golang-dev
https://golang.org/cl/10085050
2013-06-11 02:02:42 +08:00
Shenghou Ma
faef52c214 all: fix typos
R=golang-dev, bradfitz, khr, r
CC=golang-dev
https://golang.org/cl/7461046
2013-06-09 21:50:24 +08:00
Lucio De Re
0b88587d22 cmd/[568]l/obj.c: NULL is not recognised in Plan 9 build, use nil instead.
Fixes #5591.

R=golang-dev, dave, minux.ma, cshapiro
CC=carl shapiro <cshapiro, golang-dev
https://golang.org/cl/9839046
2013-05-30 15:02:10 +10:00