Drop expecttaken function in favor of extra argument
to gbranch and bgen. Mark loop condition as likely to
be true, so that loops are generated inline.
The main benefit here is contiguous code when trying
to read the generated assembly. It has only minor effects
on the timing, and they mostly cancel the minor effects
that aligning function entry points had. One exception:
both changes made Fannkuch faster.
Compared to before CL 6244066 (before aligned functions)
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 4222117400 4201958800 -0.48%
BenchmarkFannkuch11 3462631800 3215908600 -7.13%
BenchmarkGobDecode 20887622 20899164 +0.06%
BenchmarkGobEncode 9548772 9439083 -1.15%
BenchmarkGzip 151687 152060 +0.25%
BenchmarkGunzip 8742 8711 -0.35%
BenchmarkJSONEncode 62730560 62686700 -0.07%
BenchmarkJSONDecode 252569180 252368960 -0.08%
BenchmarkMandelbrot200 5267599 5252531 -0.29%
BenchmarkRevcomp25M 980813500 985248400 +0.45%
BenchmarkTemplate 361259100 357414680 -1.06%
Compared to tip (aligned functions):
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 4140739800 4201958800 +1.48%
BenchmarkFannkuch11 3259914400 3215908600 -1.35%
BenchmarkGobDecode 20620222 20899164 +1.35%
BenchmarkGobEncode 9384886 9439083 +0.58%
BenchmarkGzip 150333 152060 +1.15%
BenchmarkGunzip 8741 8711 -0.34%
BenchmarkJSONEncode 65210990 62686700 -3.87%
BenchmarkJSONDecode 249394860 252368960 +1.19%
BenchmarkMandelbrot200 5273394 5252531 -0.40%
BenchmarkRevcomp25M 996013800 985248400 -1.08%
BenchmarkTemplate 360620840 357414680 -0.89%
R=ken2
CC=golang-dev
https://golang.org/cl/6245069
Pulling function calls out to happen before the
expression being evaluated was causing illegal
reorderings even without inlining; with inlining
it got worse. This CL adds a separate ordering pass
to move things with a fixed order out of expressions
and into the statement sequence, where they will
not be reordered by walk.
Replaces lvd's CL 5534079.
Fixes#2740.
R=lvd
CC=golang-dev
https://golang.org/cl/5569062
Got rid of all the magic mystery globals. Now
for %N, %T, and %S, the flags +,- and # set a sticky
debug, sym and export mode, only visible in the new fmt.c.
Default is error mode. Handle h and l flags consistently with
the least side effects, so we can now change
things without worrying about unrelated things
breaking.
fixes#2361
R=rsc
CC=golang-dev
https://golang.org/cl/5316043
string literals used as package qualifiers are now prefixed with '@'
which obviates the need for the extra ':' before tags.
R=rsc, gri, lvd
CC=golang-dev
https://golang.org/cl/5129057
My previous CL:
changeset: 9645:ce2e5f44b310
user: Russ Cox <rsc@golang.org>
date: Tue Sep 06 10:24:21 2011 -0400
summary: gc: unify stack frame layout
introduced a bug wherein no variables were
being registerized, making Go programs 2-3x
slower than they had been before.
This CL fixes that bug (along with some others
it was hiding) and adds a test that optimization
makes at least one test case faster.
R=ken2
CC=golang-dev
https://golang.org/cl/5174045
allocparams + tempname + compactframe
all knew about how to place stack variables.
Now only compactframe, renamed to allocauto,
does the work. Until the last minute, each PAUTO
variable is in its own space and has xoffset == 0.
This might break 5g. I get failures in concurrent
code running under qemu and I can't tell whether
it's 5g's fault or qemu's. We'll see what the real
ARM builders say.
R=ken2
CC=golang-dev
https://golang.org/cl/4973057
Was keeping a pointer to the labeled statement in n->right,
which meant that generic traversals of the tree visited it twice.
That combined with aggressive flattening of the block
structure when possible during parsing meant that
the kinds of label: code label: code label: code sequences
generated by yacc were giving the recursion 2ⁿ paths
through the program.
Fixes#2212.
R=lvd
CC=golang-dev
https://golang.org/cl/4960050
gc/bits.c
. improved format with associated cast;
gc/closure.c
gc/dcl.c
gc/range.c
gc/reflect.c
gc/sinit.c
. dropped unnecessary assignments;
gc/gen.c
. dropped unnecessary assignment;
. added static qualifier to local function definition;
gc/go.h
. added varargck pragmas;
gc/lex.c
. used {} instead of ; in if statement to suppress warning;
. replaced exit(0) with exits(0);
. added compilation conditions for SIGBUS/SIGSEGV;
. dropped unnecessary assignment;
gc/mparith2.c
. dropped four unnecessary assignments/initialisations;
gc/obj.c
. added type cast to local pointer;
gc/pgen.c
. added cast and related print format;
gc/subr.c
. replaced exit(1) with exits("error");
. replaced unlink() with remove();
. renamed local cistrmp() as ucistrmp() to remove conflict with
Plan 9 function by the same name;
gc/swt.c
. added braces instead of ; as empty statment;
gc/typecheck.c
. added static qualifier to local function definition;
. dropped unnecessary assignments;
gc/walk.c
. dropped unnecessary assignments;
. added static qualifier to local function definitions;
R=rsc
CC=golang-dev
https://golang.org/cl/4964046
-s now means *disable* escape analysis.
Fix escape leaks for struct/slice/map literals.
Add ... tracking.
Rewrite new(T) and slice literal into stack allocation when safe.
Add annotations to reflect.
Reflect is too chummy with the compiler,
so changes like these affect it more than they should.
R=lvd, dave, gustavo
CC=golang-dev
https://golang.org/cl/4954043
#include "go.h" (or "gg.h")
becomes
#include <u.h>
#include <libc.h>
#include "go.h"
so that go.y can #include <stdio.h>
after <u.h> but before "go.h".
This is necessary on Plan 9.
R=ken2
CC=golang-dev
https://golang.org/cl/4971041
For now it's switch-on-and-offable with -s, and the effects can be inspected
with -m. Defaults are the old codepaths.
R=rsc
CC=golang-dev
https://golang.org/cl/4634073
After allocparams and walk, remove unused auto variables
and re-layout the remaining in reverse alignment order.
R=rsc
CC=golang-dev
https://golang.org/cl/4568068
Input code like
0000 (x.go:2) TEXT main+0(SB),$36-0
0001 (x.go:3) MOVL $5,i+-8(SP)
0002 (x.go:3) MOVL $0,i+-4(SP)
0003 (x.go:4) MOVL $1,BX
0004 (x.go:4) MOVL i+-8(SP),AX
0005 (x.go:4) MOVL i+-4(SP),DX
0006 (x.go:4) MOVL AX,autotmp_0000+-20(SP)
0007 (x.go:4) MOVL DX,autotmp_0000+-16(SP)
0008 (x.go:4) MOVL autotmp_0000+-20(SP),CX
0009 (x.go:4) CMPL autotmp_0000+-16(SP),$0
0010 (x.go:4) JNE ,13
0011 (x.go:4) CMPL CX,$32
0012 (x.go:4) JCS ,14
0013 (x.go:4) MOVL $0,BX
0014 (x.go:4) SHLL CX,BX
0015 (x.go:4) MOVL BX,x+-12(SP)
0016 (x.go:5) MOVL x+-12(SP),AX
0017 (x.go:5) CDQ ,
0018 (x.go:5) MOVL AX,autotmp_0001+-28(SP)
0019 (x.go:5) MOVL DX,autotmp_0001+-24(SP)
0020 (x.go:5) MOVL autotmp_0001+-28(SP),AX
0021 (x.go:5) MOVL autotmp_0001+-24(SP),DX
0022 (x.go:5) MOVL AX,(SP)
0023 (x.go:5) MOVL DX,4(SP)
0024 (x.go:5) CALL ,runtime.printint+0(SB)
0025 (x.go:5) CALL ,runtime.printnl+0(SB)
0026 (x.go:6) RET ,
is problematic because the liveness range for
autotmp_0000 (0006-0009) is nested completely
inside a span where BX holds a live value (0003-0015).
Because the register allocator only looks at 0006-0009
to see which registers are used, it misses the fact that
BX is unavailable and uses it anyway.
The n->pun = anyregalloc() check in tempname is
a workaround for this bug, but I hit it again because
I did the tempname call before allocating BX, even
though I then used the temporary after storing in BX.
This should fix the real bug, and then we can remove
the workaround in tempname.
The code creates pseudo-variables for each register
and includes that information in the liveness propagation.
Then the regu fields can be populated using that more
complete information. With that approach, BX is marked
as in use on every line in the whole span 0003-0015,
so that the decision about autotmp_0000
(using only 0006-0009) still has all the information
it needs.
This is not specific to the 386, but it only happens in
generated code of the form
load R1
...
load var into R2
...
store R2 back into var
...
use R1
and for the most part the other compilers generate
the loads for a given compiled line before any of
the stores. Even so, this may not be the case everywhere,
so the change is worth making in all three.
R=ken2, ken, ken
CC=golang-dev
https://golang.org/cl/4529106
Makes all.bash work after echo 4 >/proc/cpu/alignment,
which means kill the process on an unaligned access.
The default behavior on DreamPlug/GuruPlug/SheevaPlug
is to simulate an ARMv3 and just let the unaligned accesses
stop at the word boundary, resulting in all kinds of surprises.
Fixes#1240.
R=ken2
CC=golang-dev
https://golang.org/cl/4551064
No semantic changes here, but working
toward being able to align structs based
on the maximum alignment of the fields
inside instead of having a fixed alignment
for all structs (issue 482).
R=ken2
CC=golang-dev
https://golang.org/cl/3617041
cannot allocate an audomatic temp
while real registers are allocated.
there is a chance that the automatic
will be allocated to one of the
allocated registers. the fix is to
not registerize such variables.
R=rsc
CC=golang-dev
https://golang.org/cl/1202042
5g/6g/8g: add import statements to export metadata, mapping package path to package name.
recognize "" as the path of the package in export metadata.
use "" as the path of the package in object symbol names.
5c/6c/8c, 5a/6a/8a: rewrite leading . to "". so that ·Sin means Sin in this package.
5l/6l/8l: rewrite "" in symbol names as object files are read.
gotest: handle new symbol names.
gopack: handle new import lines in export metadata.
Collectively, these changes eliminate the assumption of a global
name space in the object file formats. Higher level pieces such as
reflect and the computation of type hashes still depend on the
assumption; we're not done yet.
R=ken2, r, ken3
CC=golang-dev
https://golang.org/cl/186263
because they are in package runtime.
another step to enforcing package boundaries.
R=r
DELTA=732 (114 added, 93 deleted, 525 changed)
OCL=35811
CL=35824