1
0
mirror of https://github.com/golang/go synced 2024-10-04 12:31:21 -06:00
Commit Graph

291 Commits

Author SHA1 Message Date
Russ Cox
f5184d3437 cmd/gc: correct handling of globals, func args, results
Globals, function arguments, and results are special cases in
registerization.

Globals must be flushed aggressively, because nearly any
operation can cause a panic, and the recovery code must see
the latest values. Globals also must be loaded aggressively,
because nearly any store through a pointer might be updating a
global: the compiler cannot see all the "address of"
operations on globals, especially exported globals. To
accomplish this, mark all globals as having their address
taken, which effectively disables registerization.

If a function contains a defer statement, the function results
must be flushed aggressively, because nearly any operation can
cause a panic, and the deferred code may call recover, causing
the original function to return the current values of its
function results. To accomplish this, mark all function
results as having their address taken if the function contains
any defer statements. This causes not just aggressive flushing
but also aggressive loading. The aggressive loading is
overkill but the best we can do in the current code.

Function arguments must be considered live at all safe points
in a function, because garbage collection always preserves
them: they must be up-to-date in order to be preserved
correctly. Accomplish this by marking them live at all call
sites. An earlier attempt at this marked function arguments as
having their address taken, which disabled registerization
completely, making programs slower. This CL's solution allows
registerization while preserving safety. The benchmark speedup
is caused by being able to registerize again (the earlier CL
lost the same amount).

benchmark                old ns/op     new ns/op     delta
BenchmarkEqualPort32     61.4          56.0          -8.79%

benchmark                old MB/s     new MB/s     speedup
BenchmarkEqualPort32     521.56       570.97       1.09x

Fixes #1304. (again)
Fixes #7944. (again)
Fixes #7984.
Fixes #7995.

LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, iant, r
https://golang.org/cl/97500044
2014-05-15 15:34:53 -04:00
Russ Cox
26ad5d4ff0 cmd/gc: fix liveness vs regopt mismatch for input variables
The inputs to a function are marked live at all times in the
liveness bitmaps, so that the garbage collector will not free
the things they point at and reuse the pointers, so that the
pointers shown in stack traces are guaranteed not to have
been recycled.

Unfortunately, no one told the register optimizer that the
inputs need to be preserved at all call sites. If a function
is done with a particular input value, the optimizer will stop
preserving it across calls. For single-word values this just
means that the value recorded might be stale. For multi-word
values like slices, the value recorded could be only partially stale:
it can happen that, say, the cap was updated but not the len,
or that the len was updated but not the base pointer.
Either of these possibilities (and others) would make the
garbage collector misinterpret memory, leading to memory
corruption.

This came up in a real program, in which the garbage collector's
'slice len ≤ slice cap' check caught the inconsistency.

Fixes #7944.

LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, khr
https://golang.org/cl/100370045
2014-05-12 17:19:02 -04:00
Josh Bleecher Snyder
03c0f3fea9 cmd/gc: alias more variables during register allocation
This is joint work with Daniel Morsing.

In order for the register allocator to alias two variables, they must have the same width, stack offset, and etype. Code generation was altering a variable's etype in a few places. This prevented the variable from being moved to a register, which in turn prevented peephole optimization. This failure to alias was very common, with almost 23,000 instances just running make.bash.

This phenomenon was not visible in the register allocation debug output because the variables that failed to alias had the same name. The debugging-only change to bits.c fixes this by printing the variable number with its name.

This CL fixes the source of all etype mismatches for 6g, all but one case for 8g, and depressingly few cases for 5g. (I believe that extending CL 6819083 to 5g is a prerequisite.) Fixing the remaining cases in 8g and 5g is work for the future.

The etype mismatch fixes are:

* [gc] Slicing changed the type of the base pointer into a uintptr in order to perform arithmetic on it. Instead, support addition directly on pointers.

* [*g] OSPTR was giving type uintptr to slice base pointers; undo that. This arose, for example, while compiling copy(dst, src).

* [8g] 64 bit float conversion was assigning int64 type during codegen, overwriting the existing uint64 type.

Note that some etype mismatches are appropriate, such as a struct with a single field or an array with a single element.

With these fixes, the number of registerizations that occur while running make.bash for 6g increases ~10%. Hello world binary size shrinks ~1.5%. Running all benchmarks in the standard library show performance improvements ranging from nominal to substantive (>10%); a full comparison using 6g on my laptop is available at https://gist.github.com/josharian/8f9b5beb46667c272064. The microbenchmarks must be taken with a grain of salt; see issue 7920. The few benchmarks that show real regressions are likely due to issue 7920. I manually examined the generated code for the top few regressions and none had any assembly output changes. The few benchmarks that show extraordinary improvements are likely also due to issue 7920.

Performance results from 8g appear similar to 6g.

5g shows no performance improvements. This is not surprising, given the discussion above.

Update #7316

LGTM=rsc
R=rsc, daniel.morsing, bradfitz
CC=dave, golang-codereviews
https://golang.org/cl/91850043
2014-05-12 17:10:36 -04:00
Josh Bleecher Snyder
1848d71445 cmd/gc: don't give credit for NOPs during register allocation
The register allocator decides which variables should be placed into registers by charging for each load/store and crediting for each use, and then selecting an allocation with minimal cost. NOPs will be eliminated, however, so using a variable in a NOP should not generate credit.

Issue 7867 arises from attempted registerization of multi-word variables because they are used in NOPs. By not crediting for that use, they will no longer be considered for registerization.

This fix could theoretically lead to better register allocation, but NOPs are rare relative to other instructions.

Fixes #7867.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/94810044
2014-05-09 09:55:17 -07:00
Russ Cox
0a8a719ded cmd/5g, cmd/6g, cmd/8g: preserve wide values in large functions
In large functions with many variables, the register optimizer
may give up and choose not to track certain variables at all.
In this case, the "nextinnode" information linking together
all the words from a given variable will be incomplete, and
the result may be that only some of a multiword value is
preserved across a call. That confuses the garbage collector,
so don't do that. Instead, mark those variables as having
their address taken, so that they will be preserved at all
calls. It's overkill, but correct.

Tested by hand using the 6g -S output to see that it does fix
the buggy generated code leading to the issue 7726 failure.

There is no automated test because I managed to break the
compiler while writing a test (see issue 7727). I will check
in a test along with the fix to issue 7727.

Fixes #7726.

LGTM=khr
R=khr, bradfitz, dave
CC=golang-codereviews
https://golang.org/cl/85200043
2014-04-16 13:59:42 -04:00
Russ Cox
844ec6bbe3 cmd/8g: fix liveness for 387 build (including plan9)
TBR=khr
CC=golang-codereviews
https://golang.org/cl/84570045
2014-04-06 10:30:02 -04:00
Rémy Oudompheng
c6a41a3559 cmd/6g, cmd/8g: disable Duff's device on NaCl.
Native Client forbids jumps/calls to arbitrary locations and
enforces a particular alignement, which makes the Duff's device
ineffective.

LGTM=khr
R=rsc, dave, khr
CC=golang-codereviews
https://golang.org/cl/84400043
2014-04-04 08:42:35 +02:00
David du Colombier
9f9c9abb7e cmd/8g, cmd/gc: fix warnings on Plan 9
warning: src/cmd/8g/ggen.c:35 non-interruptable temporary
warning: src/cmd/gc/walk.c:656 set and not used: l
warning: src/cmd/gc/walk.c:658 set and not used: l

LGTM=minux.ma
R=golang-codereviews, minux.ma
CC=golang-codereviews
https://golang.org/cl/83660043
2014-04-02 21:33:50 +02:00
Keith Randall
47acf16709 cmd/gc: Don't zero more than we need.
Don't merge with the zero range, we may
end up zeroing more than we need.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/83430044
2014-04-02 09:17:42 -07:00
Keith Randall
383963b506 runtime: zero at start of frame more efficiently.
Use Duff's device for zeroing.  Combine adjacent regions.

Update #7680
Update #7624

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/83200045
2014-04-01 19:44:07 -07:00
Russ Cox
9c8f11ff96 cmd/5g, cmd/8g: fix build
Botched during CL 83090046.

TBR=khr
CC=golang-codereviews
https://golang.org/cl/83070046
2014-04-01 20:24:53 -04:00
Russ Cox
daca06f2e3 cmd/gc: shorten more temporary lifetimes
1. In functions with heap-allocated result variables or with
defer statements, the return sequence requires more than
just a single RET instruction. There is an optimization that
arranges for all returns to jump to a single copy of the return
epilogue in this case. Unfortunately, that optimization is
fundamentally incompatible with PC-based liveness information:
it takes PCs at many different points in the function and makes
them all land at one PC, making the combined liveness information
at that target PC a mess. Disable this optimization, so that each
return site gets its own copy of the 'call deferreturn' and the
copying of result variables back from the heap.
This removes quite a few spurious 'ambiguously live' variables.

2. Let orderexpr allocate temporaries that are passed by address
to a function call and then die on return, so that we can arrange
an appropriate VARKILL.

2a. Do this for ... slices.

2b. Do this for closure structs.

2c. Do this for runtime.concatstring, which is the implementation
of large string additions. Change representation of OADDSTR to
an explicit list in typecheck to avoid reconstructing list in both
walk and order.

3. Let orderexpr allocate the temporary variable copies used for
range loops, so that they can be killed when the loop is over.
Similarly, let it allocate the temporary holding the map iterator.

CL 81940043 reduced the number of ambiguously live temps
in the godoc binary from 860 to 711.

This CL reduces the number to 121. Still more to do, but another
good checkpoint.

Update #7345

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83090046
2014-04-01 20:02:54 -04:00
Keith Randall
6c7cbf086c runtime: get rid of most uses of REP for copying/zeroing.
REP MOVSQ and REP STOSQ have a really high startup overhead.
Use a Duff's device to do the repetition instead.

benchmark                 old ns/op     new ns/op     delta
BenchmarkClearFat32       7.20          1.60          -77.78%
BenchmarkCopyFat32        6.88          2.38          -65.41%
BenchmarkClearFat64       7.15          3.20          -55.24%
BenchmarkCopyFat64        6.88          3.44          -50.00%
BenchmarkClearFat128      9.53          5.34          -43.97%
BenchmarkCopyFat128       9.27          5.56          -40.02%
BenchmarkClearFat256      13.8          9.53          -30.94%
BenchmarkCopyFat256       13.5          10.3          -23.70%
BenchmarkClearFat512      22.3          18.0          -19.28%
BenchmarkCopyFat512       22.0          19.7          -10.45%
BenchmarkCopyFat1024      36.5          38.4          +5.21%
BenchmarkClearFat1024     35.1          35.0          -0.28%

TODO: use for stack frame zeroing
TODO: REP prefixes are still used for "reverse" copying when src/dst
regions overlap.  Might be worth fixing.

LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews, r
https://golang.org/cl/81370046
2014-04-01 12:51:02 -07:00
Russ Cox
b700cb4974 cmd/gc: shorten temporary lifetimes when possible
The new channel and map runtime routines take pointers
to values, typically temporaries. Without help, the compiler
cannot tell when those temporaries stop being needed,
because it isn't sure what happened to the pointer.
Arrange to insert explicit VARKILL instructions for these
temporaries so that the liveness analysis can avoid seeing
them as "ambiguously live".

The change is made in order.c, which was already in charge of
introducing temporaries to preserve the order-of-evaluation
guarantees. Now its job has expanded to include introducing
temporaries as needed by runtime routines, and then also
inserting the VARKILL annotations for all these temporaries,
so that their lifetimes can be shortened.

In order to do its job for the map runtime routines, order.c arranges
that all map lookups or map assignments have the form:

        x = m[k]
        x, y = m[k]
        m[k] = x

where x, y, and k are simple variables (often temporaries).
Likewise, receiving from a channel is now always:

        x = <-c

In order to provide the map guarantee, order.c is responsible for
rewriting x op= y into x = x op y, so that m[k] += z becomes

        t = m[k]
        t2 = t + z
        m[k] = t2

While here, fix a few bugs in order.c's traversal: it was failing to
walk into select and switch case bodies, so order of evaluation
guarantees were not preserved in those situations.
Added tests to test/reorder2.go.

Fixes #7671.

In gc/popt's temporary-merging optimization, allow merging
of temporaries with their address taken as long as the liveness
ranges do not intersect. (There is a good chance of that now
that we have VARKILL annotations to limit the liveness range.)

Explicitly killing temporaries cuts the number of ambiguously
live temporaries that must be zeroed in the godoc binary from
860 to 711, or -17%. There is more work to be done, but this
is a good checkpoint.

Update #7345

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/81940043
2014-04-01 13:31:38 -04:00
Russ Cox
6722d45631 cmd/gc: liveness-related bug fixes
1. On entry to a function, only zero the ambiguously live stack variables.
Before, we were zeroing all stack variables containing pointers.
The zeroing is pretty inefficient right now (issue 7624), but there are also
too many stack variables detected as ambiguously live (issue 7345),
and that must be addressed before deciding how to improve the zeroing code.
(Changes in 5g/ggen.c, 6g/ggen.c, 8g/ggen.c, gc/pgen.c)

Fixes #7647.

2. Make the regopt word-based liveness analysis preserve the
whole-variable liveness property expected by the garbage collection
bitmap liveness analysis. That is, if the regopt liveness decides that
one word in a struct needs to be preserved, make sure it preserves
the entire struct. This is particularly important for multiword values
such as strings, slices, and interfaces, in which all the words need
to be present in order to understand the meaning.
(Changes in 5g/reg.c, 6g/reg.c, 8g/reg.c.)

Fixes #7591.

3. Make the regopt word-based liveness analysis treat a variable
as having its address taken - which makes it preserved across
all future calls - whenever n->addrtaken is set, for consistency
with the gc bitmap liveness analysis, even if there is no machine
instruction actually taking the address. In this case n->addrtaken
is incorrect (a nicer way to put it is overconservative), and ideally
there would be no such cases, but they can happen and the two
analyses need to agree.
(Changes in 5g/reg.c, 6g/reg.c, 8g/reg.c; test in bug484.go.)

Fixes crashes found by turning off "zero everything" in step 1.

4. Remove spurious VARDEF annotations. As the comment in
gc/pgen.c explains, the VARDEF must immediately precede
the initialization. It cannot be too early, and it cannot be too late.
In particular, if a function call sits between the VARDEF and the
actual machine instructions doing the initialization, the variable
will be treated as live during that function call even though it is
uninitialized, leading to problems.
(Changes in gc/gen.c; test in live.go.)

Fixes crashes found by turning off "zero everything" in step 1.

5. Do not treat loading the address of a wide value as a signal
that the value must be initialized. Instead depend on the existence
of a VARDEF or the first actual read/write of a word in the value.
If the load is in order to pass the address to a function that does
the actual initialization, treating the load as an implicit VARDEF
causes the same problems as described in step 4.
The alternative is to arrange to zero every such value before
passing it to the real initialization function, but this is a much
easier and more efficient change.
(Changes in gc/plive.c.)

Fixes crashes found by turning off "zero everything" in step 1.

6. Treat wide input parameters with their address taken as
initialized on entry to the function. Otherwise they look
"ambiguously live" and we will try to emit code to zero them.
(Changes in gc/plive.c.)

Fixes crashes found by turning off "zero everything" in step 1.

7. An array of length 0 has no pointers, even if the element type does.
Without this change, the zeroing code complains when asked to
clear a 0-length array.
(Changes in gc/reflect.c.)

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/80160044
2014-03-27 14:05:57 -04:00
Rémy Oudompheng
0285d2b96b cmd/6g, cmd/8g: skip CONVNOP nodes in bgen.
Revision 3ae4607a43ff introduced CONVNOP layers
to fix type checking issues arising from comparisons.
The added complexity made 8g run out of registers
when compiling an equality function in go.net/ipv6.

A similar issue occurred in test/sizeof.go on
amd64p32 with 6g.

Fixes #7405.

LGTM=khr
R=rsc, dave, iant, khr
CC=golang-codereviews
https://golang.org/cl/78100044
2014-03-20 22:22:37 +01:00
Keith Randall
14664050b9 cmd/6g: do small zeroings with straightline code.
Removes most uses of the REP prefix, which has a high startup cost.

LGTM=iant
R=golang-codereviews, iant, khr
CC=golang-codereviews
https://golang.org/cl/77920043
2014-03-19 15:41:34 -07:00
Dave Cheney
e31a1ce109 cmd/gc, cmd/5g, cmd/6g, cmd/8g: introduce linkarchinit and add amd64p32 support
Replaces CL 70000043.

Introduce linkarchinit() from cmd/ld.

For cmd/6g, switch to the amd64p32 linker model if we are building under nacl/amd64p32.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/71330045
2014-03-07 15:33:44 +11:00
Russ Cox
c2dd33a46f cmd/ld: clear unused ctxt before morestack
For non-closure functions, the context register is uninitialized
on entry and will not be used, but morestack saves it and then the
garbage collector treats it as live. This can be a source of memory
leaks if the context register points at otherwise dead memory.
Avoid this by introducing a parallel set of morestack functions
that clear the context register, and use those for the non-closure functions.

I hope this will help with some of the finalizer flakiness, but it probably won't.

Fixes #7244.

LGTM=dvyukov
R=khr, dvyukov
CC=golang-codereviews
https://golang.org/cl/71030044
2014-03-04 13:53:08 -05:00
Ian Lance Taylor
fa6375ea47 cmd/6g, cmd/8g: simplify calls to gvardef
The gvardef function does nothing if n->class == PEXTERN, so
we don't need to test for that before calling it.  This makes
the 6g/8g code more like the 5g code and clarifies that the
cases that do not test for n->class != PEXTERN are not buggy.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/68900044
2014-02-26 07:37:10 -08:00
Josh Bleecher Snyder
d97d37188a 5g, 8g: remove dead code
maxstksize is superfluous and appears to be vestigial. 6g does not use it.

c >= 4 cannot occur; c = w % 4.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/68750043
2014-02-25 14:43:53 -08:00
Dave Cheney
7c8280c9ef all: merge NaCl branch (part 1)
See golang.org/s/go13nacl for design overview.

This CL is the mostly mechanical changes from rsc's Go 1.2 based NaCl branch, specifically 39cb35750369 to 500771b477cf from https://code.google.com/r/rsc-go13nacl. This CL does not include working NaCl support, there are probably two or three more large merges to come.

CL 15750044 is not included as it involves more invasive changes to the linker which will need to be merged separately.

The exact change lists included are

15050047: syscall: support for Native Client
15360044: syscall: unzip implementation for Native Client
15370044: syscall: Native Client SRPC implementation
15400047: cmd/dist, cmd/go, go/build, test: support for Native Client
15410048: runtime: support for Native Client
15410049: syscall: file descriptor table for Native Client
15410050: syscall: in-memory file system for Native Client
15440048: all: update +build lines for Native Client port
15540045: cmd/6g, cmd/8g, cmd/gc: support for Native Client
15570045: os: support for Native Client
15680044: crypto/..., hash/crc32, reflect, sync/atomic: support for amd64p32
15690044: net: support for Native Client
15690048: runtime: support for fake time like on Go Playground
15690051: build: disable various tests on Native Client

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/68150047
2014-02-25 09:47:42 -05:00
Russ Cox
1ca1cbea65 cmd/5g, cmd/8g: zero ambiguously live values on entry
The code here is being restored after its deletion in CL 14430048.

I restored the copy in cmd/6g in CL 56430043 but neglected the
other two.

This is the reason that enabling precisestack only worked on amd64.

LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/66170043
2014-02-19 17:08:55 -05:00
Russ Cox
7a7c0ffb47 cmd/gc: correct liveness for fat variables
The VARDEF placement must be before the initialization
but after any final use. If you have something like s = ... using s ...
the rhs must be evaluated, then the VARDEF, then the lhs
assigned.

There is a large comment in pgen.c on gvardef explaining
this in more detail.

This CL also includes Ian's suggestions from earlier CLs,
namely commenting the use of mode in link.h and fixing
the precedence of the ~r check in dcl.c.

This CL enables the check that if liveness analysis decides
a variable is live on entry to the function, that variable must
be a function parameter (not a result, and not a local variable).
If this check fails, it indicates a bug in the liveness analysis or
in the generated code being analyzed.

The race detector generates invalid code for append(x, y...).
The code declares a temporary t and then uses cap(t) before
initializing t. The new liveness check catches this bug and
stops the compiler from writing out the buggy code.
Consequently, this CL disables the race detector tests in
run.bash until the race detector bug can be fixed
(golang.org/issue/7334).

Except for the race detector bug, the liveness analysis check
does not detect any problems (this CL and the previous CLs
fixed all the detected problems).

The net test still fails with GOGC=0 but the rest of the tests
now pass or time out (because GOGC=0 is so slow).

TBR=iant
CC=golang-codereviews
https://golang.org/cl/64170043
2014-02-15 10:58:55 -05:00
Russ Cox
91b1f7cb15 cmd/gc: handle variable initialization by block move in liveness
Any initialization of a variable by a block copy or block zeroing
or by multiple assignments (componentwise copying or zeroing
of a multiword variable) needs to emit a VARDEF. These cases were not.

Fixes #7205.

TBR=iant
CC=golang-codereviews
https://golang.org/cl/63650044
2014-02-13 22:45:16 -05:00
Russ Cox
7addda685d cmd/5g, cmd/8g: fix build
The test added in CL 63630043 fails on 5g and 8g because they
were not emitting the VARDEF instruction when clearing a fat
value by clearing the components. 6g had the call in the right place.

Hooray tests.

TBR=iant
CC=golang-codereviews
https://golang.org/cl/63660043
2014-02-13 22:30:35 -05:00
Russ Cox
801e40a0a4 cmd/gc: rename AFATVARDEF to AVARDEF
The "fat" referred to being used for multiword values only.
We're going to use it for non-fat values sometimes too.

No change other than the renaming.

TBR=iant
CC=golang-codereviews
https://golang.org/cl/63650043
2014-02-13 22:17:22 -05:00
Shenghou Ma
ca6186aa26 cmd/6c, cmd/8c, cmd/8g: fix print of pc (which is vlong).
While we're at it, fix a wrong for statement in cmd/8g.

LGTM=rsc
R=rsc, golang-codereviews
CC=golang-codereviews
https://golang.org/cl/62700044
2014-02-13 03:09:03 -05:00
Russ Cox
684332f47c cmd/5g: fix regopt bug in copyprop
copyau1 was assuming that it could deduce the type of the
middle register p->reg from the type of the left or right
argument: in CMPF F1, F2, the p->reg==2 must be a D_FREG
because p->from is F1, and in CMP R1, R2, the p->reg==2 must
be a D_REG because p->from is R1.

This heuristic fails for CMP $0, R2, which was causing copyau1
not to recognize p->reg==2 as a reference to R2, which was
keeping it from properly renaming the register use when
substituting registers.

cmd/5c has the right approach: look at the opcode p->as to
decide the kind of register. It is unclear where 5g's copyau1
came from; perhaps it was an attempt to avoid expanding 5c's
a2type to include new instructions used only by 5g.

Copy a2type from cmd/5c, expand to include additional instructions,
and make it crash the compiler if asked about an instruction
it does not understand (avoid silent bugs in the future if new
instructions are added).

Should fix current arm build breakage.

While we're here, fix the print statements dumping the pred and
succ info in the asm listing to pass an int arg to %.4ud
(Prog.pc is a vlong now, due to the liblink merge).

TBR=ken2
CC=golang-codereviews
https://golang.org/cl/62730043
2014-02-13 03:54:55 +00:00
Anthony Martin
2cae0591cd cmd/cc, cmd/gc, cmd/ld: consolidate print format routines
We now use the %A, %D, %P, and %R routines from liblink
across the board.

Fixes #7178.
Fixes #7055.

LGTM=iant
R=golang-codereviews, gobot, rsc, dave, iant, remyoudompheng
CC=golang-codereviews
https://golang.org/cl/49170043
2014-02-12 14:29:11 -05:00
Daniel Morsing
85e4cb2f10 cmd/6g, cmd/8g, cmd/5g: make the undefined instruction have no successors
The UNDEF instruction was listed in the instruction data as having the next instruction in the stream as its successor. This confused the optimizer into adding a load where it wasn't needed, in turn confusing the liveness analysis pass for GC bitmaps into thinking that the variable was live.

Fixes #7229.

LGTM=iant, rsc
R=golang-codereviews, bradfitz, iant, dave, rsc
CC=golang-codereviews
https://golang.org/cl/56910045
2014-02-11 20:25:40 +00:00
Rob Pike
289d46399b cmd/8g: don't crash if Prog->u.branch is nil
The code is copied from cmd/6g.
Empirically, all branch targets are nil in this code so
something is still wrong, but at least this stops 8g -S
from crashing.

Update #7178

LGTM=dave, iant
R=iant, dave
CC=golang-codereviews
https://golang.org/cl/58400043
2014-01-29 16:14:45 -08:00
Russ Cox
4acb70d377 cmd/gc: bypass DATA instruction for data initialized to integer constant
Eventually we will want to bypass DATA for everything,
but the relocations are not standardized well enough across
architectures to make that possible.

This did not help as much as I expected, but it is definitely better.
It shaves maybe 1-2% off all.bash depending on how much you
trust the timings of a single run:

Before: 241.139r 362.702u 112.967s
After:  234.339r 359.623u 111.045s

R=golang-codereviews, gobot, r, iant
CC=golang-codereviews
https://golang.org/cl/44650043
2013-12-20 14:24:39 -05:00
Russ Cox
870e821ded cmd/cc, cmd/gc: update compilers, assemblers for liblink changes
- add buffered stdout to all tools and provide to link ctxt.
- avoid extra \n before ! in .6 files written by assemblers
  (makes them match the C compilers).
- use linkwriteobj instead of linkouthist+linkwritefuncs.
- in assemblers and C compilers, record pc explicitly in Prog,
  for use by liblink.
- in C compilers, preserve jump target links.
- in Go compilers (gsubr.c) attach gotype directly to
  corresponding LSym* instead of rederiving from instruction stream.
- in Go compilers, emit just one definition for runtime.zerovalue
  from each compilation.

This CL consists entirely of small adjustments.
The heavy lifting is in CL 39680043.
Each depends on the other.

R=golang-dev, dave, iant
CC=golang-dev
https://golang.org/cl/37030045
2013-12-16 12:51:38 -05:00
David du Colombier
58005207d2 cmd/8c, cmd/8g, cmd/8l: fix Plan 9 warnings
warning: src/cmd/8c/list.c:124 format mismatch d VLONG, arg 3
warning: src/cmd/8c/list.c:134 format mismatch d VLONG, arg 3
warning: src/cmd/8c/list.c:142 format mismatch d VLONG, arg 3
warning: src/cmd/8c/list.c:152 format mismatch d VLONG, arg 3
warning: src/cmd/8c/list.c:156 format mismatch d VLONG, arg 4
warning: src/cmd/8c/list.c:160 format mismatch d VLONG, arg 4
warning: src/cmd/8c/list.c:165 format mismatch d VLONG, arg 4
warning: src/cmd/8c/list.c:167 format mismatch d VLONG, arg 3
warning: src/cmd/8c/list.c:172 format mismatch d VLONG, arg 4
warning: src/cmd/8c/list.c:174 format mismatch d VLONG, arg 3
warning: src/cmd/8c/list.c:178 format mismatch d VLONG, arg 3
warning: src/cmd/8c/list.c:184 format mismatch d VLONG, arg 3

warning: src/cmd/8g/list.c:91 format mismatch d VLONG, arg 4
warning: src/cmd/8g/list.c:100 format mismatch d VLONG, arg 4
warning: src/cmd/8g/list.c:114 format mismatch d VLONG, arg 5
warning: src/cmd/8g/list.c:118 format mismatch d VLONG, arg 5
warning: src/cmd/8g/list.c:122 format mismatch d VLONG, arg 5
warning: src/cmd/8g/list.c:126 format mismatch d VLONG, arg 5
warning: src/cmd/8g/list.c:136 format mismatch d VLONG, arg 4

warning: src/cmd/8l/list.c:107 format mismatch d VLONG, arg 4
warning: src/cmd/8l/list.c:125 format mismatch ux VLONG, arg 4
warning: src/cmd/8l/list.c:128 format mismatch ux VLONG, arg 4
warning: src/cmd/8l/list.c:130 format mismatch d VLONG, arg 4
warning: src/cmd/8l/list.c:134 format mismatch d VLONG, arg 5
warning: src/cmd/8l/list.c:138 format mismatch d VLONG, arg 6
warning: src/cmd/8l/list.c:143 format mismatch d VLONG, arg 5
warning: src/cmd/8l/list.c:148 format mismatch d VLONG, arg 5
warning: src/cmd/8l/list.c:150 format mismatch d VLONG, arg 4
warning: src/cmd/8l/list.c:154 format mismatch d VLONG, arg 4
warning: src/cmd/8l/list.c:158 format mismatch d VLONG, arg 4
warning: src/cmd/8l/obj.c:132 format mismatch ux VLONG, arg 2

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/39710043
2013-12-09 18:47:22 -05:00
Russ Cox
f606c1be80 cmd/5g, cmd/6g, cmd/8g: use liblink
Preparation for golang.org/s/go13linker work.

This CL does not build by itself. It depends on 35740044
and 35790044 and will be submitted at the same time.

R=iant
CC=golang-dev
https://golang.org/cl/34590045
2013-12-08 22:51:55 -05:00
Carl Shapiro
f056daf075 cmd/5g, cmd/5l, cmd/6g, cmd/6l, cmd/8g, cmd/8l, cmd/gc, runtime: generate pointer maps by liveness analysis
This change allows the garbage collector to examine stack
slots that are determined as live and containing a pointer
value by the garbage collector.  This results in a mean
reduction of 65% in the number of stack slots scanned during
an invocation of "GOGC=1 all.bash".

Unfortunately, this does not yet allow garbage collection to
be precise for the stack slots computed as live.  Pointers
confound the determination of what definitions reach a given
instruction.  In general, this problem is not solvable without
runtime cost but some advanced cooperation from the compiler
might mitigate common cases.

R=golang-dev, rsc, cshapiro
CC=golang-dev
https://golang.org/cl/14430048
2013-12-05 17:35:22 -08:00
Russ Cox
aa0439ba65 cmd/gc: eliminate redundant &x.Field nil checks
This eliminates ~75% of the nil checks being emitted,
on all architectures. We can do better, but we need
a bit more general support from the compiler, and
I don't want to do that so close to Go 1.2.
What's here is simple but effective and safe.

A few small code generation cleanups were required
to make the analysis consistent on all systems about
which nil checks are omitted, at least in the test.

Fixes #6019.

R=ken2
CC=golang-dev
https://golang.org/cl/13334052
2013-09-17 16:54:22 -04:00
Rémy Oudompheng
ff416a3f19 cmd/gc: inline copy in frontend to call memmove directly.
A new node type OSPTR is added to refer to the data pointer of
strings and slices in a simple way during walk(). It will be
useful for future work on simplification of slice arithmetic.

benchmark                  old ns/op    new ns/op    delta
BenchmarkCopy1Byte                 9            8  -13.98%
BenchmarkCopy2Byte                14            8  -40.49%
BenchmarkCopy4Byte                13            8  -35.04%
BenchmarkCopy8Byte                13            8  -37.10%
BenchmarkCopy12Byte               14           12  -15.38%
BenchmarkCopy16Byte               14           12  -17.24%
BenchmarkCopy32Byte               19           14  -27.32%
BenchmarkCopy128Byte              31           26  -15.29%
BenchmarkCopy1024Byte            100           92   -7.50%
BenchmarkCopy1String              10            7  -28.99%
BenchmarkCopy2String              10            7  -28.06%
BenchmarkCopy4String              10            8  -22.69%
BenchmarkCopy8String              10            8  -23.30%
BenchmarkCopy12String             11           11   -5.88%
BenchmarkCopy16String             11           11   -5.08%
BenchmarkCopy32String             15           14   -6.58%
BenchmarkCopy128String            28           25  -10.60%
BenchmarkCopy1024String           95           95   +0.53%

R=golang-dev, bradfitz, cshapiro, dave, daniel.morsing, rsc, khr, khr
CC=golang-dev
https://golang.org/cl/9101048
2013-09-12 00:15:28 +02:00
Russ Cox
6d47de2f40 cmd/5g, cmd/6g, cmd/8g: remove O(n) reset loop in copyprop
Simpler version of CL 13084043.

R=ken2
CC=golang-dev
https://golang.org/cl/13602045
2013-09-11 15:22:11 -04:00
Russ Cox
a0bc379d46 undo CL 13084043 / ef4ee02a5853
There is a cleaner, simpler way.

««« original CL description
cmd/5g, cmd/6g, cmd/8g: faster compilation
Replace linked list walk with memset.
This reduces CPU time taken by 'go install -a std' by ~10%.
Before:
real		user		sys
0m23.561s	0m16.625s	0m5.848s
0m23.766s	0m16.624s	0m5.846s
0m23.742s	0m16.621s	0m5.868s
after:
0m22.714s	0m14.858s	0m6.138s
0m22.644s	0m14.875s	0m6.120s
0m22.604s	0m14.854s	0m6.081s

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/13084043
»»»

TBR=dvyukov
CC=golang-dev
https://golang.org/cl/13352049
2013-09-11 15:14:11 -04:00
Russ Cox
af2a3193af cmd/5g, cmd/6g, cmd/8g: simplify for loop in bitmap generation
Lucio De Re reports that the more complex
loop miscompiles on Plan 9.

R=ken2
CC=golang-dev
https://golang.org/cl/13602043
2013-09-06 16:49:11 -04:00
Dmitriy Vyukov
79dca0327e libbio, all cmd: consistently use BGETC/BPUTC instead of Bgetc/Bputc
Also introduce BGET2/4, BPUT2/4 as they are widely used.
Slightly improve BGETC/BPUTC implementation.
This gives ~5% CPU time improvement on go install -a -p1 std.
Before:
real		user		sys
0m23.561s	0m16.625s	0m5.848s
0m23.766s	0m16.624s	0m5.846s
0m23.742s	0m16.621s	0m5.868s
after:
0m22.999s	0m15.841s	0m5.889s
0m22.845s	0m15.808s	0m5.850s
0m22.889s	0m15.832s	0m5.848s

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/12745047
2013-08-30 15:46:12 +04:00
Lucio De Re
bf58d20999 cmd/8g: add descriptions for some missing instructions.
These instructions are emitted when GO386=387 or the target
i386 CPU does not have SSE2 capabilities.

Fixes #6215.

R=golang-dev, remyoudompheng
CC=golang-dev
https://golang.org/cl/12812045
2013-08-29 14:41:01 +02:00
Dmitriy Vyukov
06e686def6 cmd/5g, cmd/6g, cmd/8g: faster compilation
Replace linked list walk with memset.
This reduces CPU time taken by 'go install -a std' by ~10%.
Before:
real		user		sys
0m23.561s	0m16.625s	0m5.848s
0m23.766s	0m16.624s	0m5.846s
0m23.742s	0m16.621s	0m5.868s
after:
0m22.714s	0m14.858s	0m6.138s
0m22.644s	0m14.875s	0m6.120s
0m22.604s	0m14.854s	0m6.081s

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/13084043
2013-08-21 14:20:28 +04:00
Russ Cox
3b4d792606 cmd/gc: separate "has pointers" from "needs zeroing" in stack frame
When the new call site-specific frame bitmaps are available,
we can cut the zeroing to just those values that need it due
to scope escaping.

R=cshapiro, cshapiro
CC=golang-dev
https://golang.org/cl/13045043
2013-08-16 21:45:59 -04:00
Carl Shapiro
d3b04f46b5 cmd/5g, cmd/6g, cmd/8g: update frame zeroing for new bitmap format
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12740046
2013-08-16 01:15:04 -04:00
Russ Cox
999a36f9af cmd/gc: &x panics if x does
See golang.org/s/go12nil.

This CL is about getting all the right checks inserted.
A followup CL will add an optimization pass to
remove redundant checks.

R=ken2
CC=golang-dev
https://golang.org/cl/12970043
2013-08-15 14:38:32 -04:00
Rémy Oudompheng
0d9db86f74 cmd/5g, cmd/6g, cmd/8g: restore occurrences of R replaced by nil in comments.
R=golang-dev, bradfitz, rsc
CC=golang-dev
https://golang.org/cl/12842043
2013-08-14 21:24:48 +02:00
Russ Cox
fa72679f07 cmd/gc: add temporary-merging optimization pass
The compilers assume they can generate temporary variables
as needed to preserve the right semantics or simplify code
generation and the back end will still generate good code.
This turns out not to be true. The back ends will only
track the first 128 variables per function and give up
on the remainder. That needs to be fixed too, in a later CL.

This CL merges temporary variables with equal types and
non-overlapping lifetimes using the greedy algorithm in
Poletto and Sarkar, "Linear Scan Register Allocation",
ACM TOPLAS 1999.

The result can be striking in the right functions.

Top 20 frame size changes in a 6g godoc binary by bytes saved:

5464 1984 (-3480, -63.7%) go/build.(*Context).Import
4456 1824 (-2632, -59.1%) go/printer.(*printer).expr1
2560   80 (-2480, -96.9%) time.nextStdChunk
3496 1608 (-1888, -54.0%) go/printer.(*printer).stmt
1896  272 (-1624, -85.7%) net/http.init
2688 1400 (-1288, -47.9%) fmt.(*pp).printReflectValue
2800 1512 (-1288, -46.0%) main.main
3296 2016 (-1280, -38.8%) crypto/tls.(*Conn).clientHandshake
1664  488 (-1176, -70.7%) time.loadZoneZip
1760  608 (-1152, -65.5%) time.parse
4104 3072 (-1032, -25.1%) runtime/pprof.writeHeap
1680  712 ( -968, -57.6%) go/ast.Walk
2488 1560 ( -928, -37.3%) crypto/x509.parseCertificate
1128  392 ( -736, -65.2%) math/big.nat.divLarge
1528  864 ( -664, -43.5%) go/printer.(*printer).fieldList
1360  712 ( -648, -47.6%) regexp/syntax.(*parser).factor
2104 1528 ( -576, -27.4%) encoding/asn1.parseField
1064  504 ( -560, -52.6%) encoding/xml.(*Decoder).text
 584   48 ( -536, -91.8%) html.init
1400  864 ( -536, -38.3%) go/doc.playExample

In the same godoc build, cuts the number of functions with
too many vars from 83 to 32.

R=ken2
CC=golang-dev
https://golang.org/cl/12829043
2013-08-13 00:09:31 -04:00