With this change the compiler emits a bitmap for each function
covering its stack frame arguments area. If an argument word
is known to contain a pointer, a bit is set. The garbage
collector reads this information when scanning the stack by
frames and uses it to ignores locations known to not contain a
pointer.
R=golang-dev, bradfitz, daniel.morsing, dvyukov, khr, khr, iant, cshapiro
CC=golang-dev
https://golang.org/cl/9223046
Now that the type information is in TYPE instructions
that are not rewritten by the optimization passes,
we don't have to try to preserve the type information
(no longer) attached to MOV instructions.
R=ken2
CC=golang-dev
https://golang.org/cl/7402054
The type information is (and for years has been) included
as an extra field in the address chunk of an instruction.
Unfortunately, suppose there is a string at a+24(FP) and
we have an instruction reading its length. It will say:
MOVQ x+32(FP), AX
and the type of *that* argument is int (not slice), because
it is the length being read. This confuses the picture seen
by debuggers and now, worse, by the garbage collector.
Instead of attaching the type information to all uses,
emit an explicit list of TYPE instructions with the information.
The TYPE instructions are no-ops whose only role is to
provide an address to attach type information to.
For example, this function:
func f(x, y, z int) (a, b string) {
return
}
now compiles into:
--- prog list "f" ---
0000 (/Users/rsc/x.go:3) TEXT f+0(SB),$0-56
0001 (/Users/rsc/x.go:3) LOCALS ,
0002 (/Users/rsc/x.go:3) TYPE x+0(FP){int},$8
0003 (/Users/rsc/x.go:3) TYPE y+8(FP){int},$8
0004 (/Users/rsc/x.go:3) TYPE z+16(FP){int},$8
0005 (/Users/rsc/x.go:3) TYPE a+24(FP){string},$16
0006 (/Users/rsc/x.go:3) TYPE b+40(FP){string},$16
0007 (/Users/rsc/x.go:3) MOVQ $0,b+40(FP)
0008 (/Users/rsc/x.go:3) MOVQ $0,b+48(FP)
0009 (/Users/rsc/x.go:3) MOVQ $0,a+24(FP)
0010 (/Users/rsc/x.go:3) MOVQ $0,a+32(FP)
0011 (/Users/rsc/x.go:4) RET ,
The { } show the formerly hidden type information.
The { } syntax is used when printing from within the gc compiler.
It is not accepted by the assemblers.
The same type information is now included on global variables:
0055 (/Users/rsc/x.go:15) GLOBL slice+0(SB){[]string},$24(AL*0)
This more accurate type information fixes a bug in the
garbage collector's precise heap collection.
The linker only cares about globals right now, but having the
local information should make things a little nicer for Carl
in the future.
Fixes#4907.
R=ken2
CC=golang-dev
https://golang.org/cl/7395056
* Avoid treating CALL fn(SB) as justification for introducing
and tracking a registerized variable for fn(SB).
* Remove USED(n) after declaration and zeroing of n.
It was left over from when the compiler emitted more
aggressive set and not used errors, and it was keeping
the optimizer from removing a redundant zeroing of n
when n was a pointer or integer variable.
Update #597.
R=ken2
CC=golang-dev
https://golang.org/cl/7277048
A few USED(xxx) additions and a couple of deletions of variable
initialisations that go unused. One questionable correction,
mirrored in 8l/asm.c, where the result of invocation of a function
shouldn't be used.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6736054
5g: Prog went from 128 bytes to 88 bytes
6g: Prog went from 174 bytes to 144 bytes
8g: Prog went from 124 bytes to 92 bytes
There may be a little more that can be squeezed out of Addr, but alignment will be a factor.
All: remove the unused pun field from Addr
R=rsc, minux.ma
CC=golang-dev
https://golang.org/cl/6922048
The width was not being set on the address, which meant
that the optimizer could not find variables that overlapped
with it and mark them as having had their address taken.
This let to the compiler believing variables had been set
but never used and then optimizing away the set.
Fixes#4129.
R=ken2
CC=golang-dev
https://golang.org/cl/6552059
The alternative is to record enough information that the
trap handler know which registers contain cached globals
and can flush the registers back to their original locations.
That's significantly more work.
This only affects globals that have been written to.
Code that reads from a global should continue to registerize
as well as before.
Fixes#1304.
R=ken2
CC=golang-dev
https://golang.org/cl/5687046
The loop recognizer uses the standard dominance
frontiers but gets confused by dead code, which
has a (not explicitly set) rpo number of 0, meaning it
looks like the head of the function, so it dominates
everything. If the loop recognizer encounters dead
code while tracking backward through the graph
it fails to recognize where it started as a loop, and
then the optimizer does not registerize values loaded
inside that loop. Fix by checking rpo against rpo2r.
Separately, run a quick pass over the generated
code to squash JMPs to JMP instructions, which
are convenient to emit during code generation but
difficult to read when debugging the -S output.
A side effect of this pass is to eliminate dead code,
so the output files may be slightly smaller and the
optimizer may have less work to do.
There is no semantic effect, because the linkers
flatten JMP chains and delete dead instructions
when laying out the final code. Doing it here too
just makes the -S output easier to read and more
like what the final binary will contain.
The "dead code breaks loop finding" bug is thus
fixed twice over. It seemed prudent to fix loopit
separately just in case dead code ever sneaks back
in for one reason or another.
R=ken2
CC=golang-dev
https://golang.org/cl/5190043
My previous CL:
changeset: 9645:ce2e5f44b310
user: Russ Cox <rsc@golang.org>
date: Tue Sep 06 10:24:21 2011 -0400
summary: gc: unify stack frame layout
introduced a bug wherein no variables were
being registerized, making Go programs 2-3x
slower than they had been before.
This CL fixes that bug (along with some others
it was hiding) and adds a test that optimization
makes at least one test case faster.
R=ken2
CC=golang-dev
https://golang.org/cl/5174045
allocparams + tempname + compactframe
all knew about how to place stack variables.
Now only compactframe, renamed to allocauto,
does the work. Until the last minute, each PAUTO
variable is in its own space and has xoffset == 0.
This might break 5g. I get failures in concurrent
code running under qemu and I can't tell whether
it's 5g's fault or qemu's. We'll see what the real
ARM builders say.
R=ken2
CC=golang-dev
https://golang.org/cl/4973057
5g/cgen.c:
. USED(n4) as it is only mentioned in unreachable code later;
. dropped unused assignments;
. commented out unreachable code;
5g/cgen64.c:
5g/ggen.c:
. dropped unused assignments of function return value;
5g/gg.h:
. added varargck pragmas;
5g/peep.c:
. USED(p1) used only in unreacheable code;
. commented out unreachable code;
5g/reg.c:
. dropped unused assignment;
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4953048
#include "go.h" (or "gg.h")
becomes
#include <u.h>
#include <libc.h>
#include "go.h"
so that go.y can #include <stdio.h>
after <u.h> but before "go.h".
This is necessary on Plan 9.
R=ken2
CC=golang-dev
https://golang.org/cl/4971041
After allocparams and walk, remove unused auto variables
and re-layout the remaining in reverse alignment order.
R=rsc
CC=golang-dev
https://golang.org/cl/4568068
Input code like
0000 (x.go:2) TEXT main+0(SB),$36-0
0001 (x.go:3) MOVL $5,i+-8(SP)
0002 (x.go:3) MOVL $0,i+-4(SP)
0003 (x.go:4) MOVL $1,BX
0004 (x.go:4) MOVL i+-8(SP),AX
0005 (x.go:4) MOVL i+-4(SP),DX
0006 (x.go:4) MOVL AX,autotmp_0000+-20(SP)
0007 (x.go:4) MOVL DX,autotmp_0000+-16(SP)
0008 (x.go:4) MOVL autotmp_0000+-20(SP),CX
0009 (x.go:4) CMPL autotmp_0000+-16(SP),$0
0010 (x.go:4) JNE ,13
0011 (x.go:4) CMPL CX,$32
0012 (x.go:4) JCS ,14
0013 (x.go:4) MOVL $0,BX
0014 (x.go:4) SHLL CX,BX
0015 (x.go:4) MOVL BX,x+-12(SP)
0016 (x.go:5) MOVL x+-12(SP),AX
0017 (x.go:5) CDQ ,
0018 (x.go:5) MOVL AX,autotmp_0001+-28(SP)
0019 (x.go:5) MOVL DX,autotmp_0001+-24(SP)
0020 (x.go:5) MOVL autotmp_0001+-28(SP),AX
0021 (x.go:5) MOVL autotmp_0001+-24(SP),DX
0022 (x.go:5) MOVL AX,(SP)
0023 (x.go:5) MOVL DX,4(SP)
0024 (x.go:5) CALL ,runtime.printint+0(SB)
0025 (x.go:5) CALL ,runtime.printnl+0(SB)
0026 (x.go:6) RET ,
is problematic because the liveness range for
autotmp_0000 (0006-0009) is nested completely
inside a span where BX holds a live value (0003-0015).
Because the register allocator only looks at 0006-0009
to see which registers are used, it misses the fact that
BX is unavailable and uses it anyway.
The n->pun = anyregalloc() check in tempname is
a workaround for this bug, but I hit it again because
I did the tempname call before allocating BX, even
though I then used the temporary after storing in BX.
This should fix the real bug, and then we can remove
the workaround in tempname.
The code creates pseudo-variables for each register
and includes that information in the liveness propagation.
Then the regu fields can be populated using that more
complete information. With that approach, BX is marked
as in use on every line in the whole span 0003-0015,
so that the decision about autotmp_0000
(using only 0006-0009) still has all the information
it needs.
This is not specific to the 386, but it only happens in
generated code of the form
load R1
...
load var into R2
...
store R2 back into var
...
use R1
and for the most part the other compilers generate
the loads for a given compiled line before any of
the stores. Even so, this may not be the case everywhere,
so the change is worth making in all three.
R=ken2, ken, ken
CC=golang-dev
https://golang.org/cl/4529106
same as in issue below, never fixed on ARM
changeset: 5498:3fa1372ca694
user: Ken Thompson <ken@golang.org>
date: Thu May 20 17:31:28 2010 -0700
description:
fix issue 798
cannot allocate an audomatic temp
while real registers are allocated.
there is a chance that the automatic
will be allocated to one of the
allocated registers. the fix is to
not registerize such variables.
R=rsc
CC=golang-dev
https://golang.org/cl/1202042
R=ken2
CC=golang-dev
https://golang.org/cl/4226042