1
0
mirror of https://github.com/golang/go synced 2024-10-04 10:21:21 -06:00
Commit Graph

496 Commits

Author SHA1 Message Date
Russ Cox
9a0a59f171 cmd/6g, cmd/8g: proginfo carry fixes
Bugs pointed out by cshapiro in CL 12637051.

R=cshapiro
CC=golang-dev
https://golang.org/cl/12815043
2013-08-12 21:02:55 -04:00
Russ Cox
b3b87143f2 cmd/gc: support for "portable" optimization logic
Code in gc/popt.c is compiled as part of 5g, 6g, and 8g,
meaning it can use arch-specific headers but there's
just one copy of the code.

This is the same arrangement we use for the portable
code generation logic in gc/pgen.c.

Move fixjmp and noreturn there to get the ball rolling.

R=ken2
CC=golang-dev
https://golang.org/cl/12789043
2013-08-12 19:14:02 -04:00
Russ Cox
ac0df6ce89 cmd/8g: factor out prog information
Like CL 12637051, but for 8g instead of 6g.
Fix a few minor 6g errors too.

R=ken2
CC=golang-dev
https://golang.org/cl/12778043
2013-08-12 13:05:40 -04:00
Russ Cox
24c8035fbe cmd/6g: move opt instruction decode into common function
Add new proginfo function that returns information about a
Prog*. The information includes various instruction
description bits as well as a list of required registers set
and used and indexing registers used.

Convert the large instruction switches to use proginfo.

This information was formerly duplicated in multiple
optimization passes, inconsistently. For example, the
information about which registers an instruction requires
appeared three times for most instructions.

Most of the switches were incomplete or incorrect in some way.
For example, the switch in copyu did not list cases for INCB,
JPS, MOVAPD, MOVBWSX, MOVBWZX, PCDATA, POPQ, PUSHQ, STD,
TESTB, TESTQ, and XCHGL. Those were all falling into the
"unknown instruction" default case and stopping the rewrite,
perhaps unnecessarily. Similarly, the switch in needc only
listed a handful of the instructions that use or set the carry bit.

We still need to decide whether to use proginfo to generalize
a few of the remaining smaller switches in peep.c.

If this goes well, we'll make similar changes in 8g and 5g.

R=ken2
CC=golang-dev
https://golang.org/cl/12637051
2013-08-11 21:46:38 -04:00
Russ Cox
7910cd68b5 cmd/gc: zero pointers on entry to function
On entry to a function, zero the results and zero the pointer
section of the local variables.

This is an intermediate step on the way to precise collection
of Go frames.

This can incur a significant (up to 30%) slowdown, but it also ensures
that the garbage collector never looks at a word in a Go frame
and sees a stale pointer value that could cause a space leak.
(C frames and assembly frames are still possibly problematic.)

This CL is required to start making collection of interface values
as precise as collection of pointer values are today.
Since we have to dereference the interface type to understand
whether the value is a pointer, it is critical that the type field be
initialized.

A future CL by Carl will make the garbage collection pointer
bitmaps context-sensitive. At that point it will be possible to
remove most of the zeroing. The only values that will still need
zeroing are values whose addresses escape the block scoping
of the function but do not escape to the heap.

benchmark                         old ns/op    new ns/op    delta
BenchmarkBinaryTree17            4420289180   4331060459   -2.02%
BenchmarkFannkuch11              3442469663   3277706251   -4.79%
BenchmarkFmtFprintfEmpty                100          142  +42.00%
BenchmarkFmtFprintfString               262          310  +18.32%
BenchmarkFmtFprintfInt                  213          281  +31.92%
BenchmarkFmtFprintfIntInt               355          431  +21.41%
BenchmarkFmtFprintfPrefixedInt          321          383  +19.31%
BenchmarkFmtFprintfFloat                444          533  +20.05%
BenchmarkFmtManyArgs                   1380         1559  +12.97%
BenchmarkGobDecode                 10240054     11794915  +15.18%
BenchmarkGobEncode                 17350274     19970478  +15.10%
BenchmarkGzip                     455179460    460699139   +1.21%
BenchmarkGunzip                   114271814    119291574   +4.39%
BenchmarkHTTPClientServer             89051        89894   +0.95%
BenchmarkJSONEncode                40486799     52691558  +30.15%
BenchmarkJSONDecode                94193361    112428781  +19.36%
BenchmarkMandelbrot200              4747060      4748043   +0.02%
BenchmarkGoParse                    6363798      6675098   +4.89%
BenchmarkRegexpMatchEasy0_32            129          171  +32.56%
BenchmarkRegexpMatchEasy0_1K            365          395   +8.22%
BenchmarkRegexpMatchEasy1_32            106          152  +43.40%
BenchmarkRegexpMatchEasy1_1K            952         1245  +30.78%
BenchmarkRegexpMatchMedium_32           198          283  +42.93%
BenchmarkRegexpMatchMedium_1K         79006       101097  +27.96%
BenchmarkRegexpMatchHard_32            3478         5115  +47.07%
BenchmarkRegexpMatchHard_1K          110245       163582  +48.38%
BenchmarkRevcomp                  777384355    793270857   +2.04%
BenchmarkTemplate                 136713089    157093609  +14.91%
BenchmarkTimeParse                     1511         1761  +16.55%
BenchmarkTimeFormat                     535          850  +58.88%

benchmark                          old MB/s     new MB/s  speedup
BenchmarkGobDecode                    74.95        65.07    0.87x
BenchmarkGobEncode                    44.24        38.43    0.87x
BenchmarkGzip                         42.63        42.12    0.99x
BenchmarkGunzip                      169.81       162.67    0.96x
BenchmarkJSONEncode                   47.93        36.83    0.77x
BenchmarkJSONDecode                   20.60        17.26    0.84x
BenchmarkGoParse                       9.10         8.68    0.95x
BenchmarkRegexpMatchEasy0_32         247.24       186.31    0.75x
BenchmarkRegexpMatchEasy0_1K        2799.20      2591.93    0.93x
BenchmarkRegexpMatchEasy1_32         299.31       210.44    0.70x
BenchmarkRegexpMatchEasy1_1K        1074.71       822.45    0.77x
BenchmarkRegexpMatchMedium_32          5.04         3.53    0.70x
BenchmarkRegexpMatchMedium_1K         12.96        10.13    0.78x
BenchmarkRegexpMatchHard_32            9.20         6.26    0.68x
BenchmarkRegexpMatchHard_1K            9.29         6.26    0.67x
BenchmarkRevcomp                     326.95       320.40    0.98x
BenchmarkTemplate                     14.19        12.35    0.87x

R=cshapiro
CC=golang-dev
https://golang.org/cl/12616045
2013-08-09 23:10:58 -04:00
Dmitriy Vyukov
8679d5f2b5 cmd/gc: record argument size for all indirect function calls
This is required to properly unwind reflect.methodValueCall/makeFuncStub.
Fixes #5954.
Stats for 'go install std':
61849 total INSTCALL
24655 currently have ArgSize metadata
27278 have ArgSize metadata with this change
godoc size before: 11351888, after: 11364288

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12163043
2013-07-31 20:00:33 +04:00
Russ Cox
48769bf546 runtime: use funcdata to supply garbage collection information
This CL introduces a FUNCDATA number for runtime-specific
garbage collection metadata, changes the C and Go compilers
to emit that metadata, and changes the runtime to expect it.

The old pseudo-instructions that carried this information
are gone, as is the linker code to process them.

R=golang-dev, dvyukov, cshapiro
CC=golang-dev
https://golang.org/cl/11406044
2013-07-19 16:04:09 -04:00
Russ Cox
7b3c8b7ac8 cmd/5g, cmd/6g, cmd/8g: insert arg size annotations on runtime calls
If calling a function in package runtime, emit argument size
information around the call in case the call is to a variadic C function.

R=ken2
CC=golang-dev
https://golang.org/cl/11371043
2013-07-16 16:25:10 -04:00
Russ Cox
7e97d39879 cmd/5g, cmd/6g, cmd/8g: fix line number of caller of deferred func
Deferred functions are not run by a call instruction. They are run by
the runtime editing registers to make the call start with a caller PC
returning to a
        CALL deferreturn
instruction.

That instruction has always had the line number of the function's
closing brace, but that instruction's line number is irrelevant.
Stack traces show the line number of the instruction before the
return PC, because normally that's what started the call. Not so here.
The instruction before the CALL deferreturn could be almost anywhere
in the function; it's unrelated and its line number is incorrect to show.

Fix the line number by inserting a true hardware no-op with the right
line number before the returned-to CALL instruction. That is, the deferred
calls now appear to start with a caller PC returning to the second instruction
in this sequence:
        NOP
        CALL deferreturn

The traceback will show the line number of the NOP, which we've set
to be the line number of the function's closing brace.

The NOP here is not the usual pseudo-instruction, which would be
elided by the linker. Instead it is the real hardware instruction:
XCHG AX, AX on 386 and amd64, and AND.EQ R0, R0, R0 on ARM.

Fixes #5856.

R=ken2, ken
CC=golang-dev
https://golang.org/cl/11223043
2013-07-12 13:47:55 -04:00
Daniel Morsing
3c3ce8e7fb cmd/6g, cmd/8g: prevent constant propagation of non-constant LEA.
Fixes #5809.

R=golang-dev, dave, rsc, nigeltao
CC=golang-dev
https://golang.org/cl/10785043
2013-07-05 16:11:22 +02:00
Russ Cox
b4e92cee97 cmd/gc: support x[i:j:k]
Design doc at golang.org/s/go12slice.
This is an experimental feature and may not be included in the release.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/10743046
2013-07-01 20:32:36 -04:00
Russ Cox
0713293374 cmd/5g, cmd/6g, cmd/8g: fix comment
Keeping the string "compactframe" because that's what
I always search for to find this code. But point to the real place too.

TBR=iant
CC=golang-dev
https://golang.org/cl/10676047
2013-06-28 12:06:25 -07:00
Russ Cox
6fa3c89b77 runtime: record proper goroutine state during stack split
Until now, the goroutine state has been scattered during the
execution of newstack and oldstack. It's all there, and those routines
know how to get back to a working goroutine, but other pieces of
the system, like stack traces, do not. If something does interrupt
the newstack or oldstack execution, the rest of the system can't
understand the goroutine. For example, if newstack decides there
is an overflow and calls throw, the stack tracer wouldn't dump the
goroutine correctly.

For newstack to save a useful state snapshot, it needs to be able
to rewind the PC in the function that triggered the split back to
the beginning of the function. (The PC is a few instructions in, just
after the call to morestack.) To make that possible, we change the
prologues to insert a jmp back to the beginning of the function
after the call to morestack. That is, the prologue used to be roughly:

        TEXT myfunc
                check for split
                jmpcond nosplit
                call morestack
        nosplit:
                sub $xxx, sp

Now an extra instruction is inserted after the call:

        TEXT myfunc
        start:
                check for split
                jmpcond nosplit
                call morestack
                jmp start
        nosplit:
                sub $xxx, sp

The jmp is not executed directly. It is decoded and simulated by
runtime.rewindmorestack to discover the beginning of the function,
and then the call to morestack returns directly to the start label
instead of to the jump instruction. So logically the jmp is still
executed, just not by the cpu.

The prologue thus repeats in the case of a function that needs a
stack split, but against the cost of the split itself, the extra few
instructions are noise. The repeated prologue has the nice effect of
making a stack split double-check that the new stack is big enough:
if morestack happens to return on a too-small stack, we'll now notice
before corruption happens.

The ability for newstack to rewind to the beginning of the function
should help preemption too. If newstack decides that it was called
for preemption instead of a stack split, it now has the goroutine state
correctly paused if rescheduling is needed, and when the goroutine
can run again, it can return to the start label on its original stack
and re-execute the split check.

Here is an example of a split stack overflow showing the full
trace, without any special cases in the stack printer.
(This one was triggered by making the split check incorrect.)

runtime: newstack framesize=0x0 argsize=0x18 sp=0x6aebd0 stack=[0x6b0000, 0x6b0fa0]
        morebuf={pc:0x69f5b sp:0x6aebd8 lr:0x0}
        sched={pc:0x68880 sp:0x6aebd0 lr:0x0 ctxt:0x34e700}
runtime: split stack overflow: 0x6aebd0 < 0x6b0000
fatal error: runtime: split stack overflow

goroutine 1 [stack split]:
runtime.mallocgc(0x290, 0x100000000, 0x1)
        /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:21 fp=0x6aebd8
runtime.new()
        /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:682 +0x5b fp=0x6aec08
go/build.(*Context).Import(0x5ae340, 0xc210030c71, 0xa, 0xc2100b4380, 0x1b, ...)
        /Users/rsc/g/go/src/pkg/go/build/build.go:424 +0x3a fp=0x6b00a0
main.loadImport(0xc210030c71, 0xa, 0xc2100b4380, 0x1b, 0xc2100b42c0, ...)
        /Users/rsc/g/go/src/cmd/go/pkg.go:249 +0x371 fp=0x6b01a8
main.(*Package).load(0xc21017c800, 0xc2100b42c0, 0xc2101828c0, 0x0, 0x0, ...)
        /Users/rsc/g/go/src/cmd/go/pkg.go:431 +0x2801 fp=0x6b0c98
main.loadPackage(0x369040, 0x7, 0xc2100b42c0, 0x0)
        /Users/rsc/g/go/src/cmd/go/pkg.go:709 +0x857 fp=0x6b0f80
----- stack segment boundary -----
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc2100e6c00, 0xc2100e5750, ...)
        /Users/rsc/g/go/src/cmd/go/build.go:539 +0x437 fp=0x6b14a0
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc21015b400, 0x2, ...)
        /Users/rsc/g/go/src/cmd/go/build.go:528 +0x1d2 fp=0x6b1658
main.(*builder).test(0xc2100902a0, 0xc210092000, 0x0, 0x0, 0xc21008ff60, ...)
        /Users/rsc/g/go/src/cmd/go/test.go:622 +0x1b53 fp=0x6b1f68
----- stack segment boundary -----
main.runTest(0x5a6b20, 0xc21000a020, 0x2, 0x2)
        /Users/rsc/g/go/src/cmd/go/test.go:366 +0xd09 fp=0x6a5cf0
main.main()
        /Users/rsc/g/go/src/cmd/go/main.go:161 +0x4f9 fp=0x6a5f78
runtime.main()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:183 +0x92 fp=0x6a5fa0
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1266 fp=0x6a5fa8

And here is a seg fault during oldstack:

SIGSEGV: segmentation violation
PC=0x1b2a6

runtime.oldstack()
        /Users/rsc/g/go/src/pkg/runtime/stack.c:159 +0x76
runtime.lessstack()
        /Users/rsc/g/go/src/pkg/runtime/asm_amd64.s:270 +0x22

goroutine 1 [stack unsplit]:
fmt.(*pp).printArg(0x2102e64e0, 0xe5c80, 0x2102c9220, 0x73, 0x0, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:818 +0x3d3 fp=0x221031e6f8
fmt.(*pp).doPrintf(0x2102e64e0, 0x12fb20, 0x2, 0x221031eb98, 0x1, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:1183 +0x15cb fp=0x221031eaf0
fmt.Sprintf(0x12fb20, 0x2, 0x221031eb98, 0x1, 0x1, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:234 +0x67 fp=0x221031eb40
flag.(*stringValue).String(0x2102c9210, 0x1, 0x0)
        /Users/rsc/g/go/src/pkg/flag/flag.go:180 +0xb3 fp=0x221031ebb0
flag.(*FlagSet).Var(0x2102f6000, 0x293d38, 0x2102c9210, 0x143490, 0xa, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:633 +0x40 fp=0x221031eca0
flag.(*FlagSet).StringVar(0x2102f6000, 0x2102c9210, 0x143490, 0xa, 0x12fa60, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:550 +0x91 fp=0x221031ece8
flag.(*FlagSet).String(0x2102f6000, 0x143490, 0xa, 0x12fa60, 0x0, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:563 +0x87 fp=0x221031ed38
flag.String(0x143490, 0xa, 0x12fa60, 0x0, 0x161950, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:570 +0x6b fp=0x221031ed80
testing.init()
        /Users/rsc/g/go/src/pkg/testing/testing.go:-531 +0xbb fp=0x221031edc0
strings_test.init()
        /Users/rsc/g/go/src/pkg/strings/strings_test.go:1115 +0x62 fp=0x221031ef70
main.init()
        strings/_test/_testmain.go:90 +0x3d fp=0x221031ef78
runtime.main()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:180 +0x8a fp=0x221031efa0
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1269 fp=0x221031efa8

goroutine 2 [runnable]:
runtime.MHeap_Scavenger()
        /Users/rsc/g/go/src/pkg/runtime/mheap.c:438
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1269
created by runtime.main
        /Users/rsc/g/go/src/pkg/runtime/proc.c:166

rax     0x23ccc0
rbx     0x23ccc0
rcx     0x0
rdx     0x38
rdi     0x2102c0170
rsi     0x221032cfe0
rbp     0x221032cfa0
rsp     0x7fff5fbff5b0
r8      0x2102c0120
r9      0x221032cfa0
r10     0x221032c000
r11     0x104ce8
r12     0xe5c80
r13     0x1be82baac718
r14     0x13091135f7d69200
r15     0x0
rip     0x1b2a6
rflags  0x10246
cs      0x2b
fs      0x0
gs      0x0

Fixes #5723.

R=r, dvyukov, go.peter.90, dave, iant
CC=golang-dev
https://golang.org/cl/10360048
2013-06-27 11:32:01 -04:00
Russ Cox
1f51d27922 cmd/gc: move genembedtramp into portable code
Requires adding new linker instruction
        RET	f(SB)
meaning return but then immediately call f.
This is what you'd use to implement a tail call after
fiddling with the arguments, but the compiler only
uses it in genwrapper.

This CL eliminates the copy-and-paste genembedtramp
functions from 5g/8g/6g and makes the code run on ARM
for the first time. It removes a small special case for function
generation, which should help Carl a bit, but at the same time
it does not bother to implement general tail call optimization,
which we do not want anyway.

Fixes #5627.

R=ken2
CC=golang-dev
https://golang.org/cl/10057044
2013-06-11 09:41:49 -04:00
Shenghou Ma
faef52c214 all: fix typos
R=golang-dev, bradfitz, khr, r
CC=golang-dev
https://golang.org/cl/7461046
2013-06-09 21:50:24 +08:00
Carl Shapiro
acd887ba57 cmd/5g, cmd/6g, cmd/8g: remove prototypes for proglist
Each of the backends has two prototypes for this function but
no corresponding definition.

R=golang-dev, bradfitz, khr
CC=golang-dev
https://golang.org/cl/9930045
2013-06-04 16:22:59 -07:00
Carl Shapiro
31be5deae4 cmd/5g, cmd/6g, cmd/8g: provide embedded trampolines with argument size information
An embedded trampoline is a function that exists to marshal
a receiver of type *S to a receiver of type *T when T is an
embedded field in S.

Embedded trampolines are generated by a special path through
the compiler and are not subject to the general analysis and
annotation done to functions.  Their effects must be provided
explicitly.

R=golang-dev, r, daniel.morsing, minux.ma
CC=golang-dev
https://golang.org/cl/9874043
2013-05-31 13:34:57 -07:00
Carl Shapiro
4e0a51c210 cmd/5l, cmd/6l, cmd/8l, cmd/gc, runtime: generate and use bitmaps of argument pointer locations
With this change the compiler emits a bitmap for each function
covering its stack frame arguments area.  If an argument word
is known to contain a pointer, a bit is set.  The garbage
collector reads this information when scanning the stack by
frames and uses it to ignores locations known to not contain a
pointer.

R=golang-dev, bradfitz, daniel.morsing, dvyukov, khr, khr, iant, cshapiro
CC=golang-dev
https://golang.org/cl/9223046
2013-05-28 17:59:10 -07:00
Rob Pike
4dcb13bb44 cmd/gc: fix some overflows in the compiler
Some 64-bit fields were run through 32-bit words, some counts were
not checked for overflow, and relocations must fit in 32 bits.
Tests to follow.

R=golang-dev, dsymonds
CC=golang-dev
https://golang.org/cl/9033043
2013-04-29 22:44:40 -07:00
Ian Lance Taylor
578dc3a96c cmd/5g, cmd/6g, cmd/8g: more nil ptr to large struct checks
R=r, ken, khr, daniel.morsing
CC=dsymonds, golang-dev, rickyz
https://golang.org/cl/8925043
2013-04-24 08:13:01 -07:00
David du Colombier
3a14b42edf cmd/6g: fix warnings on Plan 9
src/cmd/6g/peep.c:471 set and not used: r
src/cmd/6g/peep.c:560 overspecified class: regconsttyp GLOBL STATIC
src/cmd/6g/peep.c:761 more arguments than format IND STRUCT Prog
src/cmd/6g/reg.c:185 set and not used: r1
src/cmd/6g/reg.c:786 format mismatch d VLONG, arg 3
src/cmd/6g/reg.c:1064 format mismatch d VLONG, arg 5

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/8197044
2013-03-30 09:31:49 -07:00
Russ Cox
d3c758d7d2 cmd/gc: implement method values
R=ken2, ken
CC=golang-dev
https://golang.org/cl/7546052
2013-03-20 17:11:09 -04:00
Russ Cox
aa3efb28f0 cmd/gc: can stop tracking gotype in regopt
Now that the type information is in TYPE instructions
that are not rewritten by the optimization passes,
we don't have to try to preserve the type information
(no longer) attached to MOV instructions.

R=ken2
CC=golang-dev
https://golang.org/cl/7402054
2013-02-25 16:11:34 -05:00
Russ Cox
1d5dc4fd48 cmd/gc: emit explicit type information for local variables
The type information is (and for years has been) included
as an extra field in the address chunk of an instruction.
Unfortunately, suppose there is a string at a+24(FP) and
we have an instruction reading its length. It will say:

        MOVQ x+32(FP), AX

and the type of *that* argument is int (not slice), because
it is the length being read. This confuses the picture seen
by debuggers and now, worse, by the garbage collector.

Instead of attaching the type information to all uses,
emit an explicit list of TYPE instructions with the information.
The TYPE instructions are no-ops whose only role is to
provide an address to attach type information to.

For example, this function:

        func f(x, y, z int) (a, b string) {
                return
        }

now compiles into:

        --- prog list "f" ---
        0000 (/Users/rsc/x.go:3) TEXT    f+0(SB),$0-56
        0001 (/Users/rsc/x.go:3) LOCALS  ,
        0002 (/Users/rsc/x.go:3) TYPE    x+0(FP){int},$8
        0003 (/Users/rsc/x.go:3) TYPE    y+8(FP){int},$8
        0004 (/Users/rsc/x.go:3) TYPE    z+16(FP){int},$8
        0005 (/Users/rsc/x.go:3) TYPE    a+24(FP){string},$16
        0006 (/Users/rsc/x.go:3) TYPE    b+40(FP){string},$16
        0007 (/Users/rsc/x.go:3) MOVQ    $0,b+40(FP)
        0008 (/Users/rsc/x.go:3) MOVQ    $0,b+48(FP)
        0009 (/Users/rsc/x.go:3) MOVQ    $0,a+24(FP)
        0010 (/Users/rsc/x.go:3) MOVQ    $0,a+32(FP)
        0011 (/Users/rsc/x.go:4) RET     ,

The { } show the formerly hidden type information.
The { } syntax is used when printing from within the gc compiler.
It is not accepted by the assemblers.

The same type information is now included on global variables:

0055 (/Users/rsc/x.go:15) GLOBL   slice+0(SB){[]string},$24(AL*0)

This more accurate type information fixes a bug in the
garbage collector's precise heap collection.

The linker only cares about globals right now, but having the
local information should make things a little nicer for Carl
in the future.

Fixes #4907.

R=ken2
CC=golang-dev
https://golang.org/cl/7395056
2013-02-25 12:13:47 -05:00
Russ Cox
9f647288ef cmd/gc: avoid runtime code generation for closures
Change ARM context register to R7, to get out of the way
of the register allocator during the compilation of the
prologue statements (it wants to use R0 as a temporary).

Step 2 of http://golang.org/s/go11func.

R=ken2
CC=golang-dev
https://golang.org/cl/7369048
2013-02-22 14:25:50 -05:00
Russ Cox
6066fdcf38 cmd/6g, cmd/8g: switch to DX for indirect call block
runtime: add context argument to gogocall

Too many other things use AX, and at least one
(stack zeroing) cannot be moved onto a different
register. Use the less special DX instead.

Preparation for step 2 of http://golang.org/s/go11func.
Nothing interesting here, just split out so that we can
see it's correct before moving on.

R=ken2
CC=golang-dev
https://golang.org/cl/7395050
2013-02-22 10:47:54 -05:00
Russ Cox
1903ad7189 cmd/gc, reflect, runtime: switch to indirect func value representation
Step 1 of http://golang.org/s/go11func.

R=golang-dev, r, daniel.morsing, remyoudompheng
CC=golang-dev
https://golang.org/cl/7393045
2013-02-21 17:01:13 -05:00
Russ Cox
0cdc0b3b78 cmd/5g, cmd/6g: fix node dump formats
lvd changed the old %N to %+hN and these
never got updated.

R=ken2
CC=golang-dev
https://golang.org/cl/7391045
2013-02-21 12:53:25 -05:00
Robert Griesemer
3ee87d02b0 cmd/godoc: use go/build to determine package and example files
Also:
- faster code for example extraction
- simplify handling of command documentation:
  all "main" packages are treated as commands
- various minor cleanups along the way

For commands written in Go, any doc.go file containing
documentation must now be part of package main (rather
then package documentation), otherwise the documentation
won't show up in godoc (it will still build, though).

For commands written in C, documentation may still be
in doc.go files defining package documentation, but the
recommended way is to explicitly ignore those files with
a +build ignore constraint to define package main.

Fixes #4806.

R=adg, rsc, dave, bradfitz
CC=golang-dev
https://golang.org/cl/7333046
2013-02-19 11:19:58 -08:00
Shenghou Ma
255fb521da cmd/dist: add -Wstrict-prototypes to CFLAGS and fix all the compiler errors
Plan 9 compilers insist this but as we don't have Plan 9
builders, we'd better let gcc check the prototypes.

Inspired by CL 7289050.

R=golang-dev, seed, dave, rsc, lucio.dere
CC=akumar, golang-dev
https://golang.org/cl/7288056
2013-02-05 21:43:04 +08:00
Russ Cox
2c09d6992f cmd/gc: slightly better code generation
* Avoid treating CALL fn(SB) as justification for introducing
and tracking a registerized variable for fn(SB).

* Remove USED(n) after declaration and zeroing of n.
It was left over from when the compiler emitted more
aggressive set and not used errors, and it was keeping
the optimizer from removing a redundant zeroing of n
when n was a pointer or integer variable.

Update #597.

R=ken2
CC=golang-dev
https://golang.org/cl/7277048
2013-02-03 14:51:21 -05:00
Elias Naur
fe14ee52cc cmd/6c, cmd/6g: add flag to support large-model code generation
Added the -pic flag to 6c and 6g to avoid assembler instructions that
cannot use RIP-relative adressing. This is needed to support the -shared mode
in 6l.

See also:
https://golang.org/cl/6926049
https://golang.org/cl/6822078

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/7064048
2013-02-01 08:35:33 -08:00
Rémy Oudompheng
d127ed5378 cmd/gc, cmd/6g: fix error on large stacks.
Fixes #4666.

R=golang-dev, daniel.morsing, rsc
CC=golang-dev
https://golang.org/cl/7141047
2013-01-18 22:36:43 +01:00
Matthew Dempsky
41ec481a53 cmd/6c: Improve peep hole optimization of rotate and shift instructions.
Update #4629.

$ cat shift2.c
unsigned int
shift(unsigned int x, unsigned int y)
{
        x = (x << 3);
        y = (y << 5);
        x = (x << 7);
        y = (y << 9);
        return x ^ y;
}

## BEFORE
$ go tool 6c -S shift2.c
(shift2.c:2)	TEXT	shift+0(SB),$0-8
(shift2.c:4)	MOVL	x+0(FP),!!AX
(shift2.c:4)	SALL	$3,!!AX
(shift2.c:4)	MOVL	AX,!!DX
(shift2.c:5)	MOVL	y+4(FP),!!AX
(shift2.c:5)	SALL	$5,!!AX
(shift2.c:5)	MOVL	AX,!!CX
(shift2.c:6)	MOVL	DX,!!AX
(shift2.c:6)	SALL	$7,!!AX
(shift2.c:6)	MOVL	AX,!!DX
(shift2.c:7)	MOVL	CX,!!AX
(shift2.c:7)	SALL	$9,!!AX
(shift2.c:7)	MOVL	AX,!!CX
(shift2.c:8)	MOVL	DX,!!AX
(shift2.c:8)	XORL	CX,!!AX
(shift2.c:8)	RET	,!!
(shift2.c:8)	RET	,!!
(shift2.c:8)	END	,!!

## AFTER
$ go tool 6c -S shift2.c
(shift2.c:2)	TEXT	shift+0(SB),$0-8
(shift2.c:4)	MOVL	x+0(FP),!!AX
(shift2.c:4)	SALL	$3,!!AX
(shift2.c:5)	MOVL	y+4(FP),!!CX
(shift2.c:5)	SALL	$5,!!CX
(shift2.c:6)	SALL	$7,!!AX
(shift2.c:7)	SALL	$9,!!CX
(shift2.c:8)	XORL	CX,!!AX
(shift2.c:8)	RET	,!!
(shift2.c:8)	RET	,!!
(shift2.c:8)	END	,!!

R=rsc, minux.ma, dave, nigeltao
CC=golang-dev
https://golang.org/cl/7066055
2013-01-18 16:33:25 -05:00
Daniel Morsing
b73a1a8e32 cmd/6g, cmd/8g: Allow optimization of return registers.
The peephole optimizer would keep hands off AX and X0 during returns, even though go doesn't return through registers.

R=dave, rsc
CC=golang-dev
https://golang.org/cl/7030046
2013-01-11 15:44:42 +01:00
Daniel Morsing
f1e4ee3f49 cmd/5g, cmd/6g, cmd/8g: flush return parameters in case of panic.
Fixes #4066.

R=rsc, minux.ma
CC=golang-dev
https://golang.org/cl/7040044
2013-01-04 17:07:21 +01:00
Russ Cox
e431398e09 undo CL 6938073 / 1542912cf09d
remove zerostack compiler experiment; will do at link time instead

««« original CL description
cmd/gc: add GOEXPERIMENT=zerostack to clear stack on function entry

This is expensive but it might be useful in cases where
people are suffering from false positives during garbage
collection and are willing to trade the CPU time for getting
rid of the false positives.

On the other hand it only eliminates false positives caused
by other function calls, not false positives caused by dead
temporaries stored in the current function call.

The 5g/6g/8g changes were pulled out of the history, from
the last time we needed to do this (to work around a goto bug).
The code in go.h, lex.c, pgen.c is new but tiny.

R=ken2
CC=golang-dev
https://golang.org/cl/6938073
»»»

R=ken2
CC=golang-dev
https://golang.org/cl/7002051
2012-12-22 11:18:04 -05:00
Rémy Oudompheng
81b46f1bcd cmd/6g: fix componentgen for funarg structs.
Fixes #4518.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6932045
2012-12-17 22:29:43 +01:00
Russ Cox
b7603cfc2c cmd/gc: add GOEXPERIMENT=zerostack to clear stack on function entry
This is expensive but it might be useful in cases where
people are suffering from false positives during garbage
collection and are willing to trade the CPU time for getting
rid of the false positives.

On the other hand it only eliminates false positives caused
by other function calls, not false positives caused by dead
temporaries stored in the current function call.

The 5g/6g/8g changes were pulled out of the history, from
the last time we needed to do this (to work around a goto bug).
The code in go.h, lex.c, pgen.c is new but tiny.

R=ken2
CC=golang-dev
https://golang.org/cl/6938073
2012-12-17 14:32:26 -05:00
Dave Cheney
b2797f2ae0 cmd/{5,6,8}g: reduce size of Prog and Addr
5g: Prog went from 128 bytes to 88 bytes
6g: Prog went from 174 bytes to 144 bytes
8g: Prog went from 124 bytes to 92 bytes

There may be a little more that can be squeezed out of Addr, but alignment will be a factor.

All: remove the unused pun field from Addr

R=rsc, minux.ma
CC=golang-dev
https://golang.org/cl/6922048
2012-12-14 06:20:24 +11:00
Rémy Oudompheng
a617d06252 cmd/6g, cmd/8g: simplify integer division code.
Change suggested by iant. The compiler generates
special code for a/b when a is -0x80...0 and b = -1.
A single instruction can cover the case where b is -1,
so only one comparison is needed.

Fixes #3551.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6922049
2012-12-12 08:35:08 +01:00
Rémy Oudompheng
45fe306ac8 cmd/[568]g: recycle ONAME nodes used in regopt to denote registers.
The reported decrease in memory usage is about 5%.

R=golang-dev, dave, rsc
CC=golang-dev
https://golang.org/cl/6902064
2012-12-09 19:10:52 +01:00
Rémy Oudompheng
4cc9de9147 cmd/gc: add division rewrite to walk pass.
This allows 5g and 8g to benefit from the rewrite as shifts
or magic multiplies. The 64-bit arithmetic is not handled there,
and left in 6g.

Update #2230.

R=golang-dev, dave, mtj, iant, rsc
CC=golang-dev
https://golang.org/cl/6819123
2012-11-26 23:45:22 +01:00
Rémy Oudompheng
16072c7497 cmd/6g: extend componentgen to small arrays and structs.
Fixes #4092.

R=golang-dev, dave, rsc
CC=golang-dev
https://golang.org/cl/6819083
2012-11-13 00:08:04 +01:00
Rémy Oudompheng
eb4f4d16ae cmd/5g, cmd/6g: pass the full torture test.
The patch adds more cases to agenr to allocate registers later,
and makes 6g generate addresses for sgen in something else than
SI and DI. It avoids a complex save/restore sequence that
amounts to allocate a register before descending in subtrees.

Fixes #4207.

R=golang-dev, dave, rsc
CC=golang-dev
https://golang.org/cl/6817080
2012-11-12 23:56:11 +01:00
Rémy Oudompheng
7c0cbbfa18 cmd/6g, cmd/8g: mark used registers in indirect addressing.
Fixes #4094.
Fixes #4353.

R=golang-dev, dave, rsc
CC=golang-dev
https://golang.org/cl/6810090
2012-11-07 21:36:15 +01:00
Rémy Oudompheng
1e233ad075 cmd/6g: fix use of large integers as indexes or array sizes.
A check for smallintconst was missing before generating the
comparisons.

Fixes #4348.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6815088
2012-11-06 22:53:57 +01:00
Rémy Oudompheng
0b2353edcb cmd/5g, cmd/6g: fix out of registers with array indexing.
Compiling expressions like:
    s[s[s[s[s[s[s[s[s[s[s[s[i]]]]]]]]]]]]
make 5g and 6g run out of registers. Such expressions can arise
if a slice is used to represent a permutation and the user wants
to iterate it.

This is due to the usual problem of allocating registers before
going down the expression tree, instead of allocating them in a
postfix way.

The functions cgenr and agenr (that generate a value to a newly
allocated register instead of an existing location), are either
introduced or modified when they already existed to allocate
the new register as late as possible, and sudoaddable is disabled
for OINDEX nodes so that igen/agenr is used instead.

Update #4207.

R=dave, daniel.morsing, rsc
CC=golang-dev
https://golang.org/cl/6733055
2012-11-02 07:50:59 +01:00
Russ Cox
3d40062c68 cmd/gc, cmd/ld: struct field tracking
This is an experiment in static analysis of Go programs
to understand which struct fields a program might use.
It is not part of the Go language specification, it must
be enabled explicitly when building the toolchain,
and it may be removed at any time.

After building the toolchain with GOEXPERIMENT=fieldtrack,
a specific field can be marked for tracking by including
`go:"track"` in the field tag:

        package pkg

        type T struct {
                F int `go:"track"`
                G int // untracked
        }

To simplify usage, only named struct types can have
tracked fields, and only exported fields can be tracked.

The implementation works by making each function begin
with a sequence of no-op USEFIELD instructions declaring
which tracked fields are accessed by a specific function.
After the linker's dead code elimination removes unused
functions, the fields referred to by the remaining
USEFIELD instructions are the ones reported as used by
the binary.

The -k option to the linker specifies the fully qualified
symbol name (such as my/pkg.list) of a string variable that
should be initialized with the field tracking information
for the program. The field tracking string is a sequence
of lines, each terminated by a \n and describing a single
tracked field referred to by the program. Each line is made
up of one or more tab-separated fields. The first field is
the name of the tracked field, fully qualified, as in
"my/pkg.T.F". Subsequent fields give a shortest path of
reverse references from that field to a global variable or
function, corresponding to one way in which the program
might reach that field.

A common source of false positives in field tracking is
types with large method sets, because a reference to the
type descriptor carries with it references to all methods.
To address this problem, the CL also introduces a comment
annotation

        //go:nointerface

that marks an upcoming method declaration as unavailable
for use in satisfying interfaces, both statically and
dynamically. Such a method is also invisible to package
reflect.

Again, all of this is disabled by default. It only turns on
if you have GOEXPERIMENT=fieldtrack set during make.bash.

R=iant, ken
CC=golang-dev
https://golang.org/cl/6749064
2012-11-02 00:17:21 -04:00
Rémy Oudompheng
022b361ae2 cmd/5g, cmd/6g, cmd/8g: remove width check for componentgen.
The move to 64-bit ints in 6g made componentgen ineffective.
In componentgen, the code already selects which values it can handle.

On amd64:
benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    9477970000   9582314000   +1.10%
BenchmarkFannkuch11      5928750000   5255080000  -11.36%
BenchmarkGobDecode         37103040     31451120  -15.23%
BenchmarkGobEncode         16042490     16844730   +5.00%
BenchmarkGzip             811337400    741373600   -8.62%
BenchmarkGunzip           197928700    192844500   -2.57%
BenchmarkJSONEncode       224164100    140064200  -37.52%
BenchmarkJSONDecode       258346800    231829000  -10.26%
BenchmarkMandelbrot200      7561780      7601615   +0.53%
BenchmarkParse             12970340     11624360  -10.38%
BenchmarkRevcomp         1969917000   1699137000  -13.75%
BenchmarkTemplate         296182000    263117400  -11.16%

R=nigeltao, dave, daniel.morsing
CC=golang-dev
https://golang.org/cl/6821052
2012-11-01 14:36:08 +01:00
Rémy Oudompheng
335eef85c3 cmd/6g: fix crash in cgen_bmul.
Used to print:
../test/torture.go:116: internal compiler error: bad width: 0463 (../test/torture.go:116) MOVB    ,BX (0, 8)

R=nigeltao, rsc
CC=golang-dev
https://golang.org/cl/6736068
2012-10-26 00:29:44 +02:00
Rémy Oudompheng
7e144bcab0 cmd/5g, cmd/6g, cmd/8g: fix out of registers.
This patch is enough to fix compilation of
exp/types tests but only passes a stripped down
version of the appripriate torture test.

Update #4207.

R=dave, nigeltao, rsc, golang-dev
CC=golang-dev
https://golang.org/cl/6621061
2012-10-16 07:22:33 +02:00
Rémy Oudompheng
46fcfdaa7d cmd/6g: fix out of registers when chaining integer divisions.
Fixes #4201.

R=golang-dev, rsc
CC=golang-dev, remy
https://golang.org/cl/6622055
2012-10-07 00:30:29 +02:00
Rémy Oudompheng
617b7cf166 cmd/[568]g: header cleanup.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6573059
2012-09-27 08:34:00 +02:00
Rémy Oudompheng
6feb61325a cmd/6g, cmd/8g: fix two "out of fixed registers" cases.
In two cases, registers were allocated too early resulting
in exhausting of available registers when nesting these
operations.

The case of method calls was due to missing cases in igen,
which only makes calls but doesn't allocate a register for
the result.

The case of 8-bit multiplication was due to a wrong order
in register allocation when Ullman numbers were bigger on the
RHS.

Fixes #3907.
Fixes #4156.

R=rsc
CC=golang-dev, remy
https://golang.org/cl/6560054
2012-09-26 21:17:11 +02:00
Russ Cox
10ea6519e4 build: make int 64 bits on amd64
The assembly offsets were converted mechanically using
code.google.com/p/rsc/cmd/asmlint. The instruction
changes were done by hand.

Fixes #2188.

R=iant, r, bradfitz, remyoudompheng
CC=golang-dev
https://golang.org/cl/6550058
2012-09-24 20:57:01 -04:00
Rémy Oudompheng
33cceb09e2 cmd/{5g,6g,8g,6c}: remove unused macro, use %E to print etype.
R=golang-dev, rsc, dave
CC=golang-dev
https://golang.org/cl/6569044
2012-09-24 23:44:00 +02:00
Rémy Oudompheng
f4e76d5e02 cmd/6g, cmd/8g: add OINDREG, ODOT, ODOTPTR cases to igen.
Apart from reducing the number of LEAL/LEAQ instructions by about
30%, it gives 8g easier registerization in several cases,
for example in strconv. Performance with 6g is not affected.

Before (386):
src/pkg/strconv/decimal.go:22   TEXT  (*decimal).String+0(SB),$240-12
src/pkg/strconv/extfloat.go:540 TEXT  (*extFloat).ShortestDecimal+0(SB),$584-20

After (386):
src/pkg/strconv/decimal.go:22   TEXT  (*decimal).String+0(SB),$196-12
src/pkg/strconv/extfloat.go:540 TEXT  (*extFloat).ShortestDecimal+0(SB),$420-20

Benchmarks with GOARCH=386 (on a Core 2).

benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    7110191000   7079644000   -0.43%
BenchmarkFannkuch11      7769274000   7766514000   -0.04%
BenchmarkGobDecode         33454820     34755400   +3.89%
BenchmarkGobEncode         11675710     11007050   -5.73%
BenchmarkGzip            2013519000   1593855000  -20.84%
BenchmarkGunzip           253368200    242667600   -4.22%
BenchmarkJSONEncode       152443900    120763400  -20.78%
BenchmarkJSONDecode       304112800    247461800  -18.63%
BenchmarkMandelbrot200     29245520     29240490   -0.02%
BenchmarkParse              8484105      8088660   -4.66%
BenchmarkRevcomp         2695688000   2841263000   +5.40%
BenchmarkTemplate         363759800    277271200  -23.78%

benchmark                       old ns/op    new ns/op    delta
BenchmarkAtof64Decimal                127          129   +1.57%
BenchmarkAtof64Float                  166          164   -1.20%
BenchmarkAtof64FloatExp               308          300   -2.60%
BenchmarkAtof64Big                    584          571   -2.23%
BenchmarkAppendFloatDecimal           440          430   -2.27%
BenchmarkAppendFloat                  995          776  -22.01%
BenchmarkAppendFloatExp               897          746  -16.83%
BenchmarkAppendFloatNegExp            900          752  -16.44%
BenchmarkAppendFloatBig              1528         1228  -19.63%
BenchmarkAppendFloat32Integer         443          453   +2.26%
BenchmarkAppendFloat32ExactFraction   812          661  -18.60%
BenchmarkAppendFloat32Point          1002          773  -22.85%
BenchmarkAppendFloat32Exp             858          725  -15.50%
BenchmarkAppendFloat32NegExp          848          728  -14.15%
BenchmarkAppendFloat64Fixed1          447          431   -3.58%
BenchmarkAppendFloat64Fixed2          480          462   -3.75%
BenchmarkAppendFloat64Fixed3          461          457   -0.87%
BenchmarkAppendFloat64Fixed4          509          484   -4.91%

Update #1914.

R=rsc, nigeltao
CC=golang-dev, remy
https://golang.org/cl/6494107
2012-09-24 23:07:44 +02:00
Russ Cox
650160e36a cmd/gc: prepare for 64-bit ints
This CL makes the compiler understand that the type of
the len or cap of a map, slice, or string is 'int', not 'int32'.
It does not change the meaning of int, but it should make
the eventual change of the meaning of int in 6g a bit smoother.

Update #2188.

R=ken, dave, remyoudompheng
CC=golang-dev
https://golang.org/cl/6542059
2012-09-24 14:59:44 -04:00
Rémy Oudompheng
5e3fb887a3 cmd/[568]g: explain the purpose of various Reg fields.
R=golang-dev, rsc
CC=golang-dev, remy
https://golang.org/cl/6554062
2012-09-24 20:55:11 +02:00
Rémy Oudompheng
36df358a30 cmd/6g: fix internal error with SSE registers.
Revision 63f7abcae015 introduced a bug caused by
code assuming registers started at X5, not X0.

Fixes #4138.

R=rsc
CC=golang-dev, remy
https://golang.org/cl/6558043
2012-09-23 18:22:03 +02:00
Russ Cox
05ac300830 cmd/gc: fix use of nil interface, slice
Fixes #3670.

R=ken2
CC=golang-dev
https://golang.org/cl/6542058
2012-09-22 20:42:11 -04:00
Russ Cox
658482d70f cmd/5g: fix register opt bug
The width was not being set on the address, which meant
that the optimizer could not find variables that overlapped
with it and mark them as having had their address taken.
This let to the compiler believing variables had been set
but never used and then optimizing away the set.

Fixes #4129.

R=ken2
CC=golang-dev
https://golang.org/cl/6552059
2012-09-22 10:01:35 -04:00
Rémy Oudompheng
413fbed341 cmd/6g: cosmetic improvements to regopt debugging.
R=rsc, golang-dev
CC=golang-dev
https://golang.org/cl/6528044
2012-09-21 20:20:26 +02:00
Russ Cox
57ad05db15 cmd/6g: use all 16 float registers, optimize float moves
Fixes #2446.

R=ken2
CC=golang-dev
https://golang.org/cl/6557044
2012-09-21 13:39:09 -04:00
Nigel Tao
a9a675ec35 cmd/6g, cmd/8g: clean up unnecessary switch code in componentgen.
Code higher up in the function already catches these cases.

R=remyoudompheng, rsc
CC=golang-dev
https://golang.org/cl/6496106
2012-09-12 21:47:05 +10:00
Rémy Oudompheng
b45b6fd1c7 cmd/6g, cmd/8g: do not LEA[LQ] interfaces when calling methods.
It is enough to load directly the data word and the itab word
from memory, so we save a LEA instruction for each method call,
and allow elimination of some extra temporaries.

Update #1914.

R=daniel.morsing, rsc
CC=golang-dev, remy
https://golang.org/cl/6501110
2012-09-11 08:45:23 +02:00
Rémy Oudompheng
ff642e290f cmd/6g, cmd/8g: eliminate extra agen for nil comparisons.
Removes an extra LEAL/LEAQ instructions there and usually saves
a useless temporary in the idiom
    if err := foo(); err != nil {...}

Generated code is also less involved:
    MOVQ err+n(SP), AX
    CMPQ AX, $0
(potentially CMPQ n(SP), $0) instead of
    LEAQ err+n(SP), AX
    CMPQ (AX), $0

Update #1914.

R=daniel.morsing, nigeltao, rsc
CC=golang-dev, remy
https://golang.org/cl/6493099
2012-09-11 08:08:40 +02:00
Nigel Tao
5d7ece6f44 6g: delete unnecessary OXXX initialization.
No longer necessary after https://golang.org/cl/6497073/
removed the `if(n5.op != OXXX) { regfree(&n5); }`.

R=remy, r
CC=golang-dev, rsc
https://golang.org/cl/6498101
2012-09-10 11:24:34 +10:00
Rémy Oudompheng
acbe6c94d7 cmd/6g: avoid taking the address of slices unnecessarily.
The main case where it happens is when evaluating &s[i] without
bounds checking, which usually happens during range loops (i=0).

This allows registerization of the corresponding variables,
saving 16 bytes of stack frame for each such range loop and a
LEAQ instruction.

R=golang-dev, rsc, dave
CC=golang-dev, remy
https://golang.org/cl/6497073
2012-09-07 06:54:42 +02:00
Nigel Tao
481e5c6ad0 cmd/gc: re-order some OFOO constants. Rename ORRC to ORROTC to be
consistent with OLROT. Delete unused OBAD, OLRC.

R=rsc, dave
CC=golang-dev
https://golang.org/cl/6489082
2012-09-06 10:47:25 +10:00
Rémy Oudompheng
8f3c2055bd cmd/6g, cmd/8g: eliminate short integer arithmetic when possible.
Fixes #3909.
Fixes #3910.

R=rsc, nigeltao
CC=golang-dev
https://golang.org/cl/6442114
2012-09-01 16:40:54 +02:00
Shenghou Ma
e80f6a4de1 cmd/6g: fix float32/64->uint64 conversion
CVTSS2SQ's rounding mode is controlled by the RC field of MXCSR;
as we specifically need truncate semantic, we should use CVTTSS2SQ.

    Fixes #3804.

R=rsc, r
CC=golang-dev
https://golang.org/cl/6352079
2012-08-23 14:35:26 +08:00
Nigel Tao
18e86644a3 cmd/gc: cache itab lookup in convT2I.
There may be further savings if convT2I can avoid the function call
if the cache is good and T is uintptr-shaped, a la convT2E, but that
will be a follow-up CL.

src/pkg/runtime:
benchmark                  old ns/op    new ns/op    delta
BenchmarkConvT2ISmall             43           15  -64.01%
BenchmarkConvT2IUintptr           45           14  -67.48%
BenchmarkConvT2ILarge            130          101  -22.31%

test/bench/go1:
benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    8588997000   8499058000   -1.05%
BenchmarkFannkuch11      5300392000   5358093000   +1.09%
BenchmarkGobDecode         30295580     31040190   +2.46%
BenchmarkGobEncode         18102070     17675650   -2.36%
BenchmarkGzip             774191400    771591400   -0.34%
BenchmarkGunzip           245915100    247464100   +0.63%
BenchmarkJSONEncode       123577000    121423050   -1.74%
BenchmarkJSONDecode       451969800    596256200  +31.92%
BenchmarkMandelbrot200     10060050     10072880   +0.13%
BenchmarkParse             10989840     11037710   +0.44%
BenchmarkRevcomp         1782666000   1716864000   -3.69%
BenchmarkTemplate         798286600    723234400   -9.40%

R=rsc, bradfitz, go.peter.90, daniel.morsing, dave, uriel
CC=golang-dev
https://golang.org/cl/6337058
2012-07-03 09:09:05 +10:00
Nigel Tao
8f84328fdc cmd/gc: inline convT2E when T is uintptr-shaped.
GOARCH=amd64 benchmarks

src/pkg/runtime
benchmark                  old ns/op    new ns/op    delta
BenchmarkConvT2ESmall             10           10   +1.00%
BenchmarkConvT2EUintptr            9            0  -92.07%
BenchmarkConvT2EBig               74           74   -0.27%
BenchmarkConvT2I                  27           26   -3.62%
BenchmarkConvI2E                   4            4   -7.05%
BenchmarkConvI2I                  20           19   -2.99%

test/bench/go1
benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    5930908000   5937260000   +0.11%
BenchmarkFannkuch11      3927057000   3933556000   +0.17%
BenchmarkGobDecode         21998090     21870620   -0.58%
BenchmarkGobEncode         12725310     12734480   +0.07%
BenchmarkGzip             567617600    567892800   +0.05%
BenchmarkGunzip           178284100    178706900   +0.24%
BenchmarkJSONEncode        87693550     86794300   -1.03%
BenchmarkJSONDecode       314212600    324115000   +3.15%
BenchmarkMandelbrot200      7016640      7073766   +0.81%
BenchmarkParse              7852100      7892085   +0.51%
BenchmarkRevcomp         1285663000   1286147000   +0.04%
BenchmarkTemplate         566823800    567606200   +0.14%

I'm not entirely sure why the JSON* numbers have changed, but
eyeballing the profile suggests that it could be spending less
and more time in runtime.{new,old}stack, so it could simply be
stack-split boundary noise.

R=rsc, dave, bsiegert, dsymonds
CC=golang-dev
https://golang.org/cl/6280049
2012-06-14 10:43:20 +10:00
Russ Cox
b185de82a4 cmd/gc: limit data disassembly to -SS
This makes -S useful again.

R=ken2
CC=golang-dev
https://golang.org/cl/6302054
2012-06-07 12:05:34 -04:00
Rémy Oudompheng
a7059cc793 cmd/[568]g: correct freeing of allocated Regs.
R=golang-dev, rsc
CC=golang-dev, remy
https://golang.org/cl/6281050
2012-06-05 06:43:15 +02:00
Luuk van Dijk
40af78c19e cmd/gc: inline slice[arr,str] in the frontend (mostly).
R=rsc, ality, rogpeppe, minux.ma, dave
CC=golang-dev
https://golang.org/cl/5966075
2012-06-02 22:50:57 -04:00
Russ Cox
96b0594833 cmd/5g, cmd/6g, cmd/8g: delete clearstk
Dreg from https://golang.org/cl/4629042

R=ken2
CC=golang-dev
https://golang.org/cl/6259057
2012-06-01 10:10:59 -04:00
Russ Cox
001b75c942 cmd/gc: contiguous loop layout
Drop expecttaken function in favor of extra argument
to gbranch and bgen. Mark loop condition as likely to
be true, so that loops are generated inline.

The main benefit here is contiguous code when trying
to read the generated assembly. It has only minor effects
on the timing, and they mostly cancel the minor effects
that aligning function entry points had.  One exception:
both changes made Fannkuch faster.

Compared to before CL 6244066 (before aligned functions)
benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    4222117400   4201958800   -0.48%
BenchmarkFannkuch11      3462631800   3215908600   -7.13%
BenchmarkGobDecode         20887622     20899164   +0.06%
BenchmarkGobEncode          9548772      9439083   -1.15%
BenchmarkGzip                151687       152060   +0.25%
BenchmarkGunzip                8742         8711   -0.35%
BenchmarkJSONEncode        62730560     62686700   -0.07%
BenchmarkJSONDecode       252569180    252368960   -0.08%
BenchmarkMandelbrot200      5267599      5252531   -0.29%
BenchmarkRevcomp25M       980813500    985248400   +0.45%
BenchmarkTemplate         361259100    357414680   -1.06%

Compared to tip (aligned functions):
benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    4140739800   4201958800   +1.48%
BenchmarkFannkuch11      3259914400   3215908600   -1.35%
BenchmarkGobDecode         20620222     20899164   +1.35%
BenchmarkGobEncode          9384886      9439083   +0.58%
BenchmarkGzip                150333       152060   +1.15%
BenchmarkGunzip                8741         8711   -0.34%
BenchmarkJSONEncode        65210990     62686700   -3.87%
BenchmarkJSONDecode       249394860    252368960   +1.19%
BenchmarkMandelbrot200      5273394      5252531   -0.40%
BenchmarkRevcomp25M       996013800    985248400   -1.08%
BenchmarkTemplate         360620840    357414680   -0.89%

R=ken2
CC=golang-dev
https://golang.org/cl/6245069
2012-05-30 18:07:39 -04:00
Russ Cox
a768de8347 cmd/6g: avoid MOVSD between registers
MOVSD only copies the low half of the packed register pair,
while MOVAPD copies both halves.  I assume the internal
register renaming works better with the latter, since it makes
our code run 25% faster.

Before:
mandelbrot 16000
        gcc -O2 mandelbrot.c	28.44u 0.00s 28.45r
        gc mandelbrot	44.12u 0.00s 44.13r
        gc_B mandelbrot	44.17u 0.01s 44.19r

After:
mandelbrot 16000
        gcc -O2 mandelbrot.c	28.22u 0.00s 28.23r
        gc mandelbrot	32.81u 0.00s 32.82r
        gc_B mandelbrot	32.82u 0.00s 32.83r

R=ken2
CC=golang-dev
https://golang.org/cl/6248068
2012-05-30 14:41:19 -04:00
Russ Cox
de96df1b02 cmd/6g: change sbop swap logic
I added the nl->op == OLITERAL case during the recent
performance round, and while it helps for small integer constants,
it hurts for floating point constants.  In the Mandelbrot benchmark
it causes 2*Zr*Zi to compile like Zr*2*Zi:

        0x000000000042663d <+249>:	movsd  %xmm6,%xmm0
        0x0000000000426641 <+253>:	movsd  $2,%xmm1
        0x000000000042664a <+262>:	mulsd  %xmm1,%xmm0
        0x000000000042664e <+266>:	mulsd  %xmm5,%xmm0

instead of:

        0x0000000000426835 <+276>:	movsd  $2,%xmm0
        0x000000000042683e <+285>:	mulsd  %xmm6,%xmm0
        0x0000000000426842 <+289>:	mulsd  %xmm5,%xmm0

It is unclear why that has such a dramatic performance effect
in a tight loop, but it's obviously slightly better code, so go with it.

benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    5957470000   5973924000   +0.28%
BenchmarkFannkuch11      3811295000   3869128000   +1.52%
BenchmarkGobDecode         26001900     25670500   -1.27%
BenchmarkGobEncode         12051430     11948590   -0.85%
BenchmarkGzip                177432       174821   -1.47%
BenchmarkGunzip               10967        10756   -1.92%
BenchmarkJSONEncode        78924750     79746900   +1.04%
BenchmarkJSONDecode       313606400    307081600   -2.08%
BenchmarkMandelbrot200     13670860      8200725  -40.01%  !!!
BenchmarkRevcomp25M      1179194000   1206539000   +2.32%
BenchmarkTemplate         447931200    443948200   -0.89%
BenchmarkMD5Hash1K             2856         2873   +0.60%
BenchmarkMD5Hash8K            22083        22029   -0.24%

benchmark                  old MB/s     new MB/s  speedup
BenchmarkGobDecode            29.52        29.90    1.01x
BenchmarkGobEncode            63.69        64.24    1.01x
BenchmarkJSONEncode           24.59        24.33    0.99x
BenchmarkJSONDecode            6.19         6.32    1.02x
BenchmarkRevcomp25M          215.54       210.66    0.98x
BenchmarkTemplate              4.33         4.37    1.01x
BenchmarkMD5Hash1K           358.54       356.31    0.99x
BenchmarkMD5Hash8K           370.95       371.86    1.00x

R=ken2
CC=golang-dev
https://golang.org/cl/6261051
2012-05-30 10:22:33 -04:00
Russ Cox
fefae6eed1 cmd/6g, cmd/8g: move panicindex calls out of line
The old code generated for a bounds check was
                CMP
                JLT ok
                CALL panicindex
        ok:
                ...

The new code is (once the linker finishes with it):
                CMP
                JGE panic
                ...
        panic:
                CALL panicindex

which moves the calls out of line, putting more useful
code in each cache line.  This matters especially in tight
loops, such as in Fannkuch.  The benefit is more modest
elsewhere, but real.

From test/bench/go1, amd64:

benchmark                old ns/op    new ns/op    delta
BenchmarkBinaryTree17   6096092000   6088808000   -0.12%
BenchmarkFannkuch11     6151404000   4020463000  -34.64%
BenchmarkGobDecode        28990050     28894630   -0.33%
BenchmarkGobEncode        12406310     12136730   -2.17%
BenchmarkGzip               179923       179903   -0.01%
BenchmarkGunzip              11219        11130   -0.79%
BenchmarkJSONEncode       86429350     86515900   +0.10%
BenchmarkJSONDecode      334593800    315728400   -5.64%
BenchmarkRevcomp25M     1219763000   1180767000   -3.20%
BenchmarkTemplate        492947600    483646800   -1.89%

And 386:

benchmark                old ns/op    new ns/op    delta
BenchmarkBinaryTree17   6354902000   6243000000   -1.76%
BenchmarkFannkuch11     8043769000   7326965000   -8.91%
BenchmarkGobDecode        19010800     18941230   -0.37%
BenchmarkGobEncode        14077500     13792460   -2.02%
BenchmarkGzip               194087       193619   -0.24%
BenchmarkGunzip              12495        12457   -0.30%
BenchmarkJSONEncode      125636400    125451400   -0.15%
BenchmarkJSONDecode      696648600    685032800   -1.67%
BenchmarkRevcomp25M     2058088000   2052545000   -0.27%
BenchmarkTemplate        602140000    589876800   -2.04%

To implement this, two new instruction forms:

        JLT target      // same as always
        JLT $0, target  // branch expected not taken
        JLT $1, target  // branch expected taken

The linker could also emit the prediction prefixes, but it
does not: expected taken branches are reversed so that the
expected case is not taken (as in example above), and
the default expectaton for such a jump is not taken
already.

R=golang-dev, gri, r, dave
CC=golang-dev
https://golang.org/cl/6248049
2012-05-29 12:09:27 -04:00
Russ Cox
c6ce44822c cmd/gc: faster code, mainly for rotate
* Eliminate bounds check on known small shifts.
* Rewrite x<<s | x>>(32-s) as a rotate (constant s).
* More aggressive (but still minimal) range analysis.

R=ken, dave, iant
CC=golang-dev
https://golang.org/cl/6209077
2012-05-24 17:20:07 -04:00
Russ Cox
3d3b4906f9 cmd/6g: peephole fixes/additions
* Shift/rotate by constant doesn't have to stop subprop. (also in 8g)
* Remove redundant MOVLQZX instructions.
* An attempt at issuing loads early.
  Good for 0.5% on a good day, might not be worth keeping.
  Need to understand more about whether the x86
  looks ahead to what loads might be coming up.

R=ken2, ken
CC=golang-dev
https://golang.org/cl/6203091
2012-05-24 12:11:32 -04:00
Russ Cox
8f8640a057 cmd/6g: allow use of R14, R15 now
We stopped reserving them in 2009 or so.

R=ken
CC=golang-dev
https://golang.org/cl/6215061
2012-05-21 12:59:26 -04:00
Rémy Oudompheng
061061e77c cmd/6g: restore magic multiply for /=, %=.
Also enables turning /= 2 in a right shift.

Part of issue 2230.

R=rsc
CC=golang-dev, remy
https://golang.org/cl/6012049
2012-04-13 10:12:31 +02:00
Russ Cox
e530d6a1e0 6c, 6g, 6l: add MOVQL to make truncation explicit
Without an explicit signal for a truncation, copy propagation
will sometimes propagate a 32-bit truncation and end up
overwriting uses of the original 64-bit value.

The case that arose in practice is in C but I believe
that the same could plausibly happen in Go.
The main reason we didn't run into the same in Go
is that I (perhaps incorrectly?) drop MOVL AX, AX
during gins, so the truncation was never generated, so
it didn't confuse the optimizer.

Fixes #1315.
Fixes #3488.

R=ken2
CC=golang-dev
https://golang.org/cl/6002043
2012-04-10 12:51:59 -04:00
Rémy Oudompheng
f2ad374ae6 cmd/gc: don't believe that variables mentioned 256 times are unused.
Such variables would be put at 0(SP), leading to serious
corruptions at zero initialization.
Fixes #3084.

R=golang-dev, r
CC=golang-dev, remy
https://golang.org/cl/5683052
2012-02-21 16:38:01 +11:00
Russ Cox
8998835543 5g, 6g, 8g: flush modified globals aggressively
The alternative is to record enough information that the
trap handler know which registers contain cached globals
and can flush the registers back to their original locations.
That's significantly more work.

This only affects globals that have been written to.
Code that reads from a global should continue to registerize
as well as before.

Fixes #1304.

R=ken2
CC=golang-dev
https://golang.org/cl/5687046
2012-02-20 13:41:44 -05:00
Russ Cox
4e3f8e915f gc, ld: tag data as no-pointers and allocate in separate section
The garbage collector can avoid scanning this section, with
reduces collection time as well as the number of false positives.
Helps a little bit with issue 909, but certainly does not solve it.

R=ken2
CC=golang-dev
https://golang.org/cl/5671099
2012-02-19 03:19:52 -05:00
Shenghou Ma
6ed2b6c47d 5c, 6c, 8c, 6g, 8g: correct boundary checking
CL 5666043 fixed the same checking for 5g.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5666045
2012-02-15 08:59:03 -05:00
Russ Cox
f91cc3bdbb gc: optimize interface ==, !=
If the values being compared have different concrete types,
then they're clearly unequal without needing to invoke the
actual interface compare routine.  This speeds tests for
specific values, like if err == io.EOF, by about 3x.

benchmark                  old ns/op    new ns/op    delta
BenchmarkIfaceCmp100             843          287  -65.95%
BenchmarkIfaceCmpNil100          184          182   -1.09%

Fixes #2591.

R=ken2
CC=golang-dev
https://golang.org/cl/5651073
2012-02-11 00:19:24 -05:00
Russ Cox
ca5da31f83 6g: fix out of registers bug
Fix it twice: reuse registers more aggressively in cgen abop,
and also release R14 and R15, which are no longer m and g.

Fixes #2669.

R=ken2
CC=golang-dev
https://golang.org/cl/5655056
2012-02-10 22:19:34 -05:00
Jamie Gennis
fff732ea2c 6g,8g: make constant propagation inlining-friendly.
This changes makes constant propagation compare 'from' values using node
pointers rather than symbol names when checking to see whether a set
operation is redundant. When a function is inlined multiple times in a
calling function its arguments will share symbol names even though the values
are different. Prior to this fix the bug409 test would hit a case with 6g
where an LEAQ instruction was incorrectly eliminated from the second inlined
function call. 8g appears to have had the same bug, but the test did not fail
there.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5646044
2012-02-08 10:25:13 -05:00
Russ Cox
fec7fa8b9d build: delete make paraphernalia
As a convenience to people working on the tools,
leave Makefiles that invoke the go dist tool appropriately.
They are not used during the build.

R=golang-dev, bradfitz, n13m3y3r, gustavo
CC=golang-dev
https://golang.org/cl/5636050
2012-02-06 13:34:25 -05:00
Anthony Martin
e280035fc1 gc, cc: avoid using the wrong library when building the compilers
This can happen on Plan 9 if we we're building
with the 32-bit and 64-bit host compilers, one
after the other.

R=rsc
CC=golang-dev
https://golang.org/cl/5599053
2012-02-01 04:14:37 -08:00
Anthony Martin
6273d6e713 build: move the "-c" flag into HOST_CFLAGS
On Plan 9 this flag is used to discover
constant expressions in "if" statements.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5601060
2012-01-31 19:31:30 -08:00
Rob Pike
91cb3489ab go: move compilers into the go-tool directory
Also delete gotest, since it's messy to fix and slated for deletion anyway.
A couple of things outside src can't be tested any more. "go test" will be
fixed and these tests will be re-enabled. They're noisy for now.

Fixes #284.

R=rsc
CC=golang-dev
https://golang.org/cl/5598049
2012-01-30 14:46:31 -08:00
Luuk van Dijk
a6c49098bc gc: Nicer errors before miscompiling.
This fixes issue 2444.

A big cleanup of all 31/32bit size boundaries i'll leave for another cl though.  (see also issue 1700).

R=rsc
CC=golang-dev
https://golang.org/cl/5484058
2012-01-10 11:19:22 +01:00