1
0
mirror of https://github.com/golang/go synced 2024-10-05 16:41:21 -06:00
Commit Graph

318 Commits

Author SHA1 Message Date
Russ Cox
7a7c0ffb47 cmd/gc: correct liveness for fat variables
The VARDEF placement must be before the initialization
but after any final use. If you have something like s = ... using s ...
the rhs must be evaluated, then the VARDEF, then the lhs
assigned.

There is a large comment in pgen.c on gvardef explaining
this in more detail.

This CL also includes Ian's suggestions from earlier CLs,
namely commenting the use of mode in link.h and fixing
the precedence of the ~r check in dcl.c.

This CL enables the check that if liveness analysis decides
a variable is live on entry to the function, that variable must
be a function parameter (not a result, and not a local variable).
If this check fails, it indicates a bug in the liveness analysis or
in the generated code being analyzed.

The race detector generates invalid code for append(x, y...).
The code declares a temporary t and then uses cap(t) before
initializing t. The new liveness check catches this bug and
stops the compiler from writing out the buggy code.
Consequently, this CL disables the race detector tests in
run.bash until the race detector bug can be fixed
(golang.org/issue/7334).

Except for the race detector bug, the liveness analysis check
does not detect any problems (this CL and the previous CLs
fixed all the detected problems).

The net test still fails with GOGC=0 but the rest of the tests
now pass or time out (because GOGC=0 is so slow).

TBR=iant
CC=golang-codereviews
https://golang.org/cl/64170043
2014-02-15 10:58:55 -05:00
Russ Cox
91b1f7cb15 cmd/gc: handle variable initialization by block move in liveness
Any initialization of a variable by a block copy or block zeroing
or by multiple assignments (componentwise copying or zeroing
of a multiword variable) needs to emit a VARDEF. These cases were not.

Fixes #7205.

TBR=iant
CC=golang-codereviews
https://golang.org/cl/63650044
2014-02-13 22:45:16 -05:00
Russ Cox
7addda685d cmd/5g, cmd/8g: fix build
The test added in CL 63630043 fails on 5g and 8g because they
were not emitting the VARDEF instruction when clearing a fat
value by clearing the components. 6g had the call in the right place.

Hooray tests.

TBR=iant
CC=golang-codereviews
https://golang.org/cl/63660043
2014-02-13 22:30:35 -05:00
Russ Cox
801e40a0a4 cmd/gc: rename AFATVARDEF to AVARDEF
The "fat" referred to being used for multiword values only.
We're going to use it for non-fat values sometimes too.

No change other than the renaming.

TBR=iant
CC=golang-codereviews
https://golang.org/cl/63650043
2014-02-13 22:17:22 -05:00
Shenghou Ma
ca6186aa26 cmd/6c, cmd/8c, cmd/8g: fix print of pc (which is vlong).
While we're at it, fix a wrong for statement in cmd/8g.

LGTM=rsc
R=rsc, golang-codereviews
CC=golang-codereviews
https://golang.org/cl/62700044
2014-02-13 03:09:03 -05:00
Russ Cox
684332f47c cmd/5g: fix regopt bug in copyprop
copyau1 was assuming that it could deduce the type of the
middle register p->reg from the type of the left or right
argument: in CMPF F1, F2, the p->reg==2 must be a D_FREG
because p->from is F1, and in CMP R1, R2, the p->reg==2 must
be a D_REG because p->from is R1.

This heuristic fails for CMP $0, R2, which was causing copyau1
not to recognize p->reg==2 as a reference to R2, which was
keeping it from properly renaming the register use when
substituting registers.

cmd/5c has the right approach: look at the opcode p->as to
decide the kind of register. It is unclear where 5g's copyau1
came from; perhaps it was an attempt to avoid expanding 5c's
a2type to include new instructions used only by 5g.

Copy a2type from cmd/5c, expand to include additional instructions,
and make it crash the compiler if asked about an instruction
it does not understand (avoid silent bugs in the future if new
instructions are added).

Should fix current arm build breakage.

While we're here, fix the print statements dumping the pred and
succ info in the asm listing to pass an int arg to %.4ud
(Prog.pc is a vlong now, due to the liblink merge).

TBR=ken2
CC=golang-codereviews
https://golang.org/cl/62730043
2014-02-13 03:54:55 +00:00
Anthony Martin
2cae0591cd cmd/cc, cmd/gc, cmd/ld: consolidate print format routines
We now use the %A, %D, %P, and %R routines from liblink
across the board.

Fixes #7178.
Fixes #7055.

LGTM=iant
R=golang-codereviews, gobot, rsc, dave, iant, remyoudompheng
CC=golang-codereviews
https://golang.org/cl/49170043
2014-02-12 14:29:11 -05:00
Daniel Morsing
85e4cb2f10 cmd/6g, cmd/8g, cmd/5g: make the undefined instruction have no successors
The UNDEF instruction was listed in the instruction data as having the next instruction in the stream as its successor. This confused the optimizer into adding a load where it wasn't needed, in turn confusing the liveness analysis pass for GC bitmaps into thinking that the variable was live.

Fixes #7229.

LGTM=iant, rsc
R=golang-codereviews, bradfitz, iant, dave, rsc
CC=golang-codereviews
https://golang.org/cl/56910045
2014-02-11 20:25:40 +00:00
Rob Pike
289d46399b cmd/8g: don't crash if Prog->u.branch is nil
The code is copied from cmd/6g.
Empirically, all branch targets are nil in this code so
something is still wrong, but at least this stops 8g -S
from crashing.

Update #7178

LGTM=dave, iant
R=iant, dave
CC=golang-codereviews
https://golang.org/cl/58400043
2014-01-29 16:14:45 -08:00
Russ Cox
4acb70d377 cmd/gc: bypass DATA instruction for data initialized to integer constant
Eventually we will want to bypass DATA for everything,
but the relocations are not standardized well enough across
architectures to make that possible.

This did not help as much as I expected, but it is definitely better.
It shaves maybe 1-2% off all.bash depending on how much you
trust the timings of a single run:

Before: 241.139r 362.702u 112.967s
After:  234.339r 359.623u 111.045s

R=golang-codereviews, gobot, r, iant
CC=golang-codereviews
https://golang.org/cl/44650043
2013-12-20 14:24:39 -05:00
Russ Cox
870e821ded cmd/cc, cmd/gc: update compilers, assemblers for liblink changes
- add buffered stdout to all tools and provide to link ctxt.
- avoid extra \n before ! in .6 files written by assemblers
  (makes them match the C compilers).
- use linkwriteobj instead of linkouthist+linkwritefuncs.
- in assemblers and C compilers, record pc explicitly in Prog,
  for use by liblink.
- in C compilers, preserve jump target links.
- in Go compilers (gsubr.c) attach gotype directly to
  corresponding LSym* instead of rederiving from instruction stream.
- in Go compilers, emit just one definition for runtime.zerovalue
  from each compilation.

This CL consists entirely of small adjustments.
The heavy lifting is in CL 39680043.
Each depends on the other.

R=golang-dev, dave, iant
CC=golang-dev
https://golang.org/cl/37030045
2013-12-16 12:51:38 -05:00
David du Colombier
58005207d2 cmd/8c, cmd/8g, cmd/8l: fix Plan 9 warnings
warning: src/cmd/8c/list.c:124 format mismatch d VLONG, arg 3
warning: src/cmd/8c/list.c:134 format mismatch d VLONG, arg 3
warning: src/cmd/8c/list.c:142 format mismatch d VLONG, arg 3
warning: src/cmd/8c/list.c:152 format mismatch d VLONG, arg 3
warning: src/cmd/8c/list.c:156 format mismatch d VLONG, arg 4
warning: src/cmd/8c/list.c:160 format mismatch d VLONG, arg 4
warning: src/cmd/8c/list.c:165 format mismatch d VLONG, arg 4
warning: src/cmd/8c/list.c:167 format mismatch d VLONG, arg 3
warning: src/cmd/8c/list.c:172 format mismatch d VLONG, arg 4
warning: src/cmd/8c/list.c:174 format mismatch d VLONG, arg 3
warning: src/cmd/8c/list.c:178 format mismatch d VLONG, arg 3
warning: src/cmd/8c/list.c:184 format mismatch d VLONG, arg 3

warning: src/cmd/8g/list.c:91 format mismatch d VLONG, arg 4
warning: src/cmd/8g/list.c:100 format mismatch d VLONG, arg 4
warning: src/cmd/8g/list.c:114 format mismatch d VLONG, arg 5
warning: src/cmd/8g/list.c:118 format mismatch d VLONG, arg 5
warning: src/cmd/8g/list.c:122 format mismatch d VLONG, arg 5
warning: src/cmd/8g/list.c:126 format mismatch d VLONG, arg 5
warning: src/cmd/8g/list.c:136 format mismatch d VLONG, arg 4

warning: src/cmd/8l/list.c:107 format mismatch d VLONG, arg 4
warning: src/cmd/8l/list.c:125 format mismatch ux VLONG, arg 4
warning: src/cmd/8l/list.c:128 format mismatch ux VLONG, arg 4
warning: src/cmd/8l/list.c:130 format mismatch d VLONG, arg 4
warning: src/cmd/8l/list.c:134 format mismatch d VLONG, arg 5
warning: src/cmd/8l/list.c:138 format mismatch d VLONG, arg 6
warning: src/cmd/8l/list.c:143 format mismatch d VLONG, arg 5
warning: src/cmd/8l/list.c:148 format mismatch d VLONG, arg 5
warning: src/cmd/8l/list.c:150 format mismatch d VLONG, arg 4
warning: src/cmd/8l/list.c:154 format mismatch d VLONG, arg 4
warning: src/cmd/8l/list.c:158 format mismatch d VLONG, arg 4
warning: src/cmd/8l/obj.c:132 format mismatch ux VLONG, arg 2

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/39710043
2013-12-09 18:47:22 -05:00
Russ Cox
f606c1be80 cmd/5g, cmd/6g, cmd/8g: use liblink
Preparation for golang.org/s/go13linker work.

This CL does not build by itself. It depends on 35740044
and 35790044 and will be submitted at the same time.

R=iant
CC=golang-dev
https://golang.org/cl/34590045
2013-12-08 22:51:55 -05:00
Carl Shapiro
f056daf075 cmd/5g, cmd/5l, cmd/6g, cmd/6l, cmd/8g, cmd/8l, cmd/gc, runtime: generate pointer maps by liveness analysis
This change allows the garbage collector to examine stack
slots that are determined as live and containing a pointer
value by the garbage collector.  This results in a mean
reduction of 65% in the number of stack slots scanned during
an invocation of "GOGC=1 all.bash".

Unfortunately, this does not yet allow garbage collection to
be precise for the stack slots computed as live.  Pointers
confound the determination of what definitions reach a given
instruction.  In general, this problem is not solvable without
runtime cost but some advanced cooperation from the compiler
might mitigate common cases.

R=golang-dev, rsc, cshapiro
CC=golang-dev
https://golang.org/cl/14430048
2013-12-05 17:35:22 -08:00
Russ Cox
aa0439ba65 cmd/gc: eliminate redundant &x.Field nil checks
This eliminates ~75% of the nil checks being emitted,
on all architectures. We can do better, but we need
a bit more general support from the compiler, and
I don't want to do that so close to Go 1.2.
What's here is simple but effective and safe.

A few small code generation cleanups were required
to make the analysis consistent on all systems about
which nil checks are omitted, at least in the test.

Fixes #6019.

R=ken2
CC=golang-dev
https://golang.org/cl/13334052
2013-09-17 16:54:22 -04:00
Rémy Oudompheng
ff416a3f19 cmd/gc: inline copy in frontend to call memmove directly.
A new node type OSPTR is added to refer to the data pointer of
strings and slices in a simple way during walk(). It will be
useful for future work on simplification of slice arithmetic.

benchmark                  old ns/op    new ns/op    delta
BenchmarkCopy1Byte                 9            8  -13.98%
BenchmarkCopy2Byte                14            8  -40.49%
BenchmarkCopy4Byte                13            8  -35.04%
BenchmarkCopy8Byte                13            8  -37.10%
BenchmarkCopy12Byte               14           12  -15.38%
BenchmarkCopy16Byte               14           12  -17.24%
BenchmarkCopy32Byte               19           14  -27.32%
BenchmarkCopy128Byte              31           26  -15.29%
BenchmarkCopy1024Byte            100           92   -7.50%
BenchmarkCopy1String              10            7  -28.99%
BenchmarkCopy2String              10            7  -28.06%
BenchmarkCopy4String              10            8  -22.69%
BenchmarkCopy8String              10            8  -23.30%
BenchmarkCopy12String             11           11   -5.88%
BenchmarkCopy16String             11           11   -5.08%
BenchmarkCopy32String             15           14   -6.58%
BenchmarkCopy128String            28           25  -10.60%
BenchmarkCopy1024String           95           95   +0.53%

R=golang-dev, bradfitz, cshapiro, dave, daniel.morsing, rsc, khr, khr
CC=golang-dev
https://golang.org/cl/9101048
2013-09-12 00:15:28 +02:00
Russ Cox
6d47de2f40 cmd/5g, cmd/6g, cmd/8g: remove O(n) reset loop in copyprop
Simpler version of CL 13084043.

R=ken2
CC=golang-dev
https://golang.org/cl/13602045
2013-09-11 15:22:11 -04:00
Russ Cox
a0bc379d46 undo CL 13084043 / ef4ee02a5853
There is a cleaner, simpler way.

««« original CL description
cmd/5g, cmd/6g, cmd/8g: faster compilation
Replace linked list walk with memset.
This reduces CPU time taken by 'go install -a std' by ~10%.
Before:
real		user		sys
0m23.561s	0m16.625s	0m5.848s
0m23.766s	0m16.624s	0m5.846s
0m23.742s	0m16.621s	0m5.868s
after:
0m22.714s	0m14.858s	0m6.138s
0m22.644s	0m14.875s	0m6.120s
0m22.604s	0m14.854s	0m6.081s

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/13084043
»»»

TBR=dvyukov
CC=golang-dev
https://golang.org/cl/13352049
2013-09-11 15:14:11 -04:00
Russ Cox
af2a3193af cmd/5g, cmd/6g, cmd/8g: simplify for loop in bitmap generation
Lucio De Re reports that the more complex
loop miscompiles on Plan 9.

R=ken2
CC=golang-dev
https://golang.org/cl/13602043
2013-09-06 16:49:11 -04:00
Dmitriy Vyukov
79dca0327e libbio, all cmd: consistently use BGETC/BPUTC instead of Bgetc/Bputc
Also introduce BGET2/4, BPUT2/4 as they are widely used.
Slightly improve BGETC/BPUTC implementation.
This gives ~5% CPU time improvement on go install -a -p1 std.
Before:
real		user		sys
0m23.561s	0m16.625s	0m5.848s
0m23.766s	0m16.624s	0m5.846s
0m23.742s	0m16.621s	0m5.868s
after:
0m22.999s	0m15.841s	0m5.889s
0m22.845s	0m15.808s	0m5.850s
0m22.889s	0m15.832s	0m5.848s

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/12745047
2013-08-30 15:46:12 +04:00
Lucio De Re
bf58d20999 cmd/8g: add descriptions for some missing instructions.
These instructions are emitted when GO386=387 or the target
i386 CPU does not have SSE2 capabilities.

Fixes #6215.

R=golang-dev, remyoudompheng
CC=golang-dev
https://golang.org/cl/12812045
2013-08-29 14:41:01 +02:00
Dmitriy Vyukov
06e686def6 cmd/5g, cmd/6g, cmd/8g: faster compilation
Replace linked list walk with memset.
This reduces CPU time taken by 'go install -a std' by ~10%.
Before:
real		user		sys
0m23.561s	0m16.625s	0m5.848s
0m23.766s	0m16.624s	0m5.846s
0m23.742s	0m16.621s	0m5.868s
after:
0m22.714s	0m14.858s	0m6.138s
0m22.644s	0m14.875s	0m6.120s
0m22.604s	0m14.854s	0m6.081s

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/13084043
2013-08-21 14:20:28 +04:00
Russ Cox
3b4d792606 cmd/gc: separate "has pointers" from "needs zeroing" in stack frame
When the new call site-specific frame bitmaps are available,
we can cut the zeroing to just those values that need it due
to scope escaping.

R=cshapiro, cshapiro
CC=golang-dev
https://golang.org/cl/13045043
2013-08-16 21:45:59 -04:00
Carl Shapiro
d3b04f46b5 cmd/5g, cmd/6g, cmd/8g: update frame zeroing for new bitmap format
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12740046
2013-08-16 01:15:04 -04:00
Russ Cox
999a36f9af cmd/gc: &x panics if x does
See golang.org/s/go12nil.

This CL is about getting all the right checks inserted.
A followup CL will add an optimization pass to
remove redundant checks.

R=ken2
CC=golang-dev
https://golang.org/cl/12970043
2013-08-15 14:38:32 -04:00
Rémy Oudompheng
0d9db86f74 cmd/5g, cmd/6g, cmd/8g: restore occurrences of R replaced by nil in comments.
R=golang-dev, bradfitz, rsc
CC=golang-dev
https://golang.org/cl/12842043
2013-08-14 21:24:48 +02:00
Russ Cox
fa72679f07 cmd/gc: add temporary-merging optimization pass
The compilers assume they can generate temporary variables
as needed to preserve the right semantics or simplify code
generation and the back end will still generate good code.
This turns out not to be true. The back ends will only
track the first 128 variables per function and give up
on the remainder. That needs to be fixed too, in a later CL.

This CL merges temporary variables with equal types and
non-overlapping lifetimes using the greedy algorithm in
Poletto and Sarkar, "Linear Scan Register Allocation",
ACM TOPLAS 1999.

The result can be striking in the right functions.

Top 20 frame size changes in a 6g godoc binary by bytes saved:

5464 1984 (-3480, -63.7%) go/build.(*Context).Import
4456 1824 (-2632, -59.1%) go/printer.(*printer).expr1
2560   80 (-2480, -96.9%) time.nextStdChunk
3496 1608 (-1888, -54.0%) go/printer.(*printer).stmt
1896  272 (-1624, -85.7%) net/http.init
2688 1400 (-1288, -47.9%) fmt.(*pp).printReflectValue
2800 1512 (-1288, -46.0%) main.main
3296 2016 (-1280, -38.8%) crypto/tls.(*Conn).clientHandshake
1664  488 (-1176, -70.7%) time.loadZoneZip
1760  608 (-1152, -65.5%) time.parse
4104 3072 (-1032, -25.1%) runtime/pprof.writeHeap
1680  712 ( -968, -57.6%) go/ast.Walk
2488 1560 ( -928, -37.3%) crypto/x509.parseCertificate
1128  392 ( -736, -65.2%) math/big.nat.divLarge
1528  864 ( -664, -43.5%) go/printer.(*printer).fieldList
1360  712 ( -648, -47.6%) regexp/syntax.(*parser).factor
2104 1528 ( -576, -27.4%) encoding/asn1.parseField
1064  504 ( -560, -52.6%) encoding/xml.(*Decoder).text
 584   48 ( -536, -91.8%) html.init
1400  864 ( -536, -38.3%) go/doc.playExample

In the same godoc build, cuts the number of functions with
too many vars from 83 to 32.

R=ken2
CC=golang-dev
https://golang.org/cl/12829043
2013-08-13 00:09:31 -04:00
Russ Cox
dbf96addfb cmd/gc: move flow graph into portable opt
Now there's only one copy of the flow graph construction
and dominator computation, and different optimizations
can attach different annotations to the instructions.

R=ken2
CC=golang-dev
https://golang.org/cl/12797045
2013-08-12 22:02:10 -04:00
Russ Cox
9a0a59f171 cmd/6g, cmd/8g: proginfo carry fixes
Bugs pointed out by cshapiro in CL 12637051.

R=cshapiro
CC=golang-dev
https://golang.org/cl/12815043
2013-08-12 21:02:55 -04:00
Russ Cox
b3b87143f2 cmd/gc: support for "portable" optimization logic
Code in gc/popt.c is compiled as part of 5g, 6g, and 8g,
meaning it can use arch-specific headers but there's
just one copy of the code.

This is the same arrangement we use for the portable
code generation logic in gc/pgen.c.

Move fixjmp and noreturn there to get the ball rolling.

R=ken2
CC=golang-dev
https://golang.org/cl/12789043
2013-08-12 19:14:02 -04:00
Russ Cox
ac0df6ce89 cmd/8g: factor out prog information
Like CL 12637051, but for 8g instead of 6g.
Fix a few minor 6g errors too.

R=ken2
CC=golang-dev
https://golang.org/cl/12778043
2013-08-12 13:05:40 -04:00
Russ Cox
7910cd68b5 cmd/gc: zero pointers on entry to function
On entry to a function, zero the results and zero the pointer
section of the local variables.

This is an intermediate step on the way to precise collection
of Go frames.

This can incur a significant (up to 30%) slowdown, but it also ensures
that the garbage collector never looks at a word in a Go frame
and sees a stale pointer value that could cause a space leak.
(C frames and assembly frames are still possibly problematic.)

This CL is required to start making collection of interface values
as precise as collection of pointer values are today.
Since we have to dereference the interface type to understand
whether the value is a pointer, it is critical that the type field be
initialized.

A future CL by Carl will make the garbage collection pointer
bitmaps context-sensitive. At that point it will be possible to
remove most of the zeroing. The only values that will still need
zeroing are values whose addresses escape the block scoping
of the function but do not escape to the heap.

benchmark                         old ns/op    new ns/op    delta
BenchmarkBinaryTree17            4420289180   4331060459   -2.02%
BenchmarkFannkuch11              3442469663   3277706251   -4.79%
BenchmarkFmtFprintfEmpty                100          142  +42.00%
BenchmarkFmtFprintfString               262          310  +18.32%
BenchmarkFmtFprintfInt                  213          281  +31.92%
BenchmarkFmtFprintfIntInt               355          431  +21.41%
BenchmarkFmtFprintfPrefixedInt          321          383  +19.31%
BenchmarkFmtFprintfFloat                444          533  +20.05%
BenchmarkFmtManyArgs                   1380         1559  +12.97%
BenchmarkGobDecode                 10240054     11794915  +15.18%
BenchmarkGobEncode                 17350274     19970478  +15.10%
BenchmarkGzip                     455179460    460699139   +1.21%
BenchmarkGunzip                   114271814    119291574   +4.39%
BenchmarkHTTPClientServer             89051        89894   +0.95%
BenchmarkJSONEncode                40486799     52691558  +30.15%
BenchmarkJSONDecode                94193361    112428781  +19.36%
BenchmarkMandelbrot200              4747060      4748043   +0.02%
BenchmarkGoParse                    6363798      6675098   +4.89%
BenchmarkRegexpMatchEasy0_32            129          171  +32.56%
BenchmarkRegexpMatchEasy0_1K            365          395   +8.22%
BenchmarkRegexpMatchEasy1_32            106          152  +43.40%
BenchmarkRegexpMatchEasy1_1K            952         1245  +30.78%
BenchmarkRegexpMatchMedium_32           198          283  +42.93%
BenchmarkRegexpMatchMedium_1K         79006       101097  +27.96%
BenchmarkRegexpMatchHard_32            3478         5115  +47.07%
BenchmarkRegexpMatchHard_1K          110245       163582  +48.38%
BenchmarkRevcomp                  777384355    793270857   +2.04%
BenchmarkTemplate                 136713089    157093609  +14.91%
BenchmarkTimeParse                     1511         1761  +16.55%
BenchmarkTimeFormat                     535          850  +58.88%

benchmark                          old MB/s     new MB/s  speedup
BenchmarkGobDecode                    74.95        65.07    0.87x
BenchmarkGobEncode                    44.24        38.43    0.87x
BenchmarkGzip                         42.63        42.12    0.99x
BenchmarkGunzip                      169.81       162.67    0.96x
BenchmarkJSONEncode                   47.93        36.83    0.77x
BenchmarkJSONDecode                   20.60        17.26    0.84x
BenchmarkGoParse                       9.10         8.68    0.95x
BenchmarkRegexpMatchEasy0_32         247.24       186.31    0.75x
BenchmarkRegexpMatchEasy0_1K        2799.20      2591.93    0.93x
BenchmarkRegexpMatchEasy1_32         299.31       210.44    0.70x
BenchmarkRegexpMatchEasy1_1K        1074.71       822.45    0.77x
BenchmarkRegexpMatchMedium_32          5.04         3.53    0.70x
BenchmarkRegexpMatchMedium_1K         12.96        10.13    0.78x
BenchmarkRegexpMatchHard_32            9.20         6.26    0.68x
BenchmarkRegexpMatchHard_1K            9.29         6.26    0.67x
BenchmarkRevcomp                     326.95       320.40    0.98x
BenchmarkTemplate                     14.19        12.35    0.87x

R=cshapiro
CC=golang-dev
https://golang.org/cl/12616045
2013-08-09 23:10:58 -04:00
Dmitriy Vyukov
8679d5f2b5 cmd/gc: record argument size for all indirect function calls
This is required to properly unwind reflect.methodValueCall/makeFuncStub.
Fixes #5954.
Stats for 'go install std':
61849 total INSTCALL
24655 currently have ArgSize metadata
27278 have ArgSize metadata with this change
godoc size before: 11351888, after: 11364288

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12163043
2013-07-31 20:00:33 +04:00
Russ Cox
48769bf546 runtime: use funcdata to supply garbage collection information
This CL introduces a FUNCDATA number for runtime-specific
garbage collection metadata, changes the C and Go compilers
to emit that metadata, and changes the runtime to expect it.

The old pseudo-instructions that carried this information
are gone, as is the linker code to process them.

R=golang-dev, dvyukov, cshapiro
CC=golang-dev
https://golang.org/cl/11406044
2013-07-19 16:04:09 -04:00
Daniel Morsing
85a7c090c4 cmd/8g: Make clearfat non-interleaved with pointer calculations.
clearfat (used to zero initialize structures) will use AX for x86 block ops. If we write to AX while calculating the dest pointer, we will fill the structure with incorrect values.
Since 64-bit arithmetic uses AX to synthesize a 64-bit register, getting an adress by indexing with 64-bit ops can clobber the register.

Fixes #5820.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/11383043
2013-07-17 11:04:34 +02:00
Russ Cox
7b3c8b7ac8 cmd/5g, cmd/6g, cmd/8g: insert arg size annotations on runtime calls
If calling a function in package runtime, emit argument size
information around the call in case the call is to a variadic C function.

R=ken2
CC=golang-dev
https://golang.org/cl/11371043
2013-07-16 16:25:10 -04:00
Russ Cox
7e97d39879 cmd/5g, cmd/6g, cmd/8g: fix line number of caller of deferred func
Deferred functions are not run by a call instruction. They are run by
the runtime editing registers to make the call start with a caller PC
returning to a
        CALL deferreturn
instruction.

That instruction has always had the line number of the function's
closing brace, but that instruction's line number is irrelevant.
Stack traces show the line number of the instruction before the
return PC, because normally that's what started the call. Not so here.
The instruction before the CALL deferreturn could be almost anywhere
in the function; it's unrelated and its line number is incorrect to show.

Fix the line number by inserting a true hardware no-op with the right
line number before the returned-to CALL instruction. That is, the deferred
calls now appear to start with a caller PC returning to the second instruction
in this sequence:
        NOP
        CALL deferreturn

The traceback will show the line number of the NOP, which we've set
to be the line number of the function's closing brace.

The NOP here is not the usual pseudo-instruction, which would be
elided by the linker. Instead it is the real hardware instruction:
XCHG AX, AX on 386 and amd64, and AND.EQ R0, R0, R0 on ARM.

Fixes #5856.

R=ken2, ken
CC=golang-dev
https://golang.org/cl/11223043
2013-07-12 13:47:55 -04:00
Daniel Morsing
3c3ce8e7fb cmd/6g, cmd/8g: prevent constant propagation of non-constant LEA.
Fixes #5809.

R=golang-dev, dave, rsc, nigeltao
CC=golang-dev
https://golang.org/cl/10785043
2013-07-05 16:11:22 +02:00
Russ Cox
b4e92cee97 cmd/gc: support x[i:j:k]
Design doc at golang.org/s/go12slice.
This is an experimental feature and may not be included in the release.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/10743046
2013-07-01 20:32:36 -04:00
Russ Cox
0713293374 cmd/5g, cmd/6g, cmd/8g: fix comment
Keeping the string "compactframe" because that's what
I always search for to find this code. But point to the real place too.

TBR=iant
CC=golang-dev
https://golang.org/cl/10676047
2013-06-28 12:06:25 -07:00
Russ Cox
6fa3c89b77 runtime: record proper goroutine state during stack split
Until now, the goroutine state has been scattered during the
execution of newstack and oldstack. It's all there, and those routines
know how to get back to a working goroutine, but other pieces of
the system, like stack traces, do not. If something does interrupt
the newstack or oldstack execution, the rest of the system can't
understand the goroutine. For example, if newstack decides there
is an overflow and calls throw, the stack tracer wouldn't dump the
goroutine correctly.

For newstack to save a useful state snapshot, it needs to be able
to rewind the PC in the function that triggered the split back to
the beginning of the function. (The PC is a few instructions in, just
after the call to morestack.) To make that possible, we change the
prologues to insert a jmp back to the beginning of the function
after the call to morestack. That is, the prologue used to be roughly:

        TEXT myfunc
                check for split
                jmpcond nosplit
                call morestack
        nosplit:
                sub $xxx, sp

Now an extra instruction is inserted after the call:

        TEXT myfunc
        start:
                check for split
                jmpcond nosplit
                call morestack
                jmp start
        nosplit:
                sub $xxx, sp

The jmp is not executed directly. It is decoded and simulated by
runtime.rewindmorestack to discover the beginning of the function,
and then the call to morestack returns directly to the start label
instead of to the jump instruction. So logically the jmp is still
executed, just not by the cpu.

The prologue thus repeats in the case of a function that needs a
stack split, but against the cost of the split itself, the extra few
instructions are noise. The repeated prologue has the nice effect of
making a stack split double-check that the new stack is big enough:
if morestack happens to return on a too-small stack, we'll now notice
before corruption happens.

The ability for newstack to rewind to the beginning of the function
should help preemption too. If newstack decides that it was called
for preemption instead of a stack split, it now has the goroutine state
correctly paused if rescheduling is needed, and when the goroutine
can run again, it can return to the start label on its original stack
and re-execute the split check.

Here is an example of a split stack overflow showing the full
trace, without any special cases in the stack printer.
(This one was triggered by making the split check incorrect.)

runtime: newstack framesize=0x0 argsize=0x18 sp=0x6aebd0 stack=[0x6b0000, 0x6b0fa0]
        morebuf={pc:0x69f5b sp:0x6aebd8 lr:0x0}
        sched={pc:0x68880 sp:0x6aebd0 lr:0x0 ctxt:0x34e700}
runtime: split stack overflow: 0x6aebd0 < 0x6b0000
fatal error: runtime: split stack overflow

goroutine 1 [stack split]:
runtime.mallocgc(0x290, 0x100000000, 0x1)
        /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:21 fp=0x6aebd8
runtime.new()
        /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:682 +0x5b fp=0x6aec08
go/build.(*Context).Import(0x5ae340, 0xc210030c71, 0xa, 0xc2100b4380, 0x1b, ...)
        /Users/rsc/g/go/src/pkg/go/build/build.go:424 +0x3a fp=0x6b00a0
main.loadImport(0xc210030c71, 0xa, 0xc2100b4380, 0x1b, 0xc2100b42c0, ...)
        /Users/rsc/g/go/src/cmd/go/pkg.go:249 +0x371 fp=0x6b01a8
main.(*Package).load(0xc21017c800, 0xc2100b42c0, 0xc2101828c0, 0x0, 0x0, ...)
        /Users/rsc/g/go/src/cmd/go/pkg.go:431 +0x2801 fp=0x6b0c98
main.loadPackage(0x369040, 0x7, 0xc2100b42c0, 0x0)
        /Users/rsc/g/go/src/cmd/go/pkg.go:709 +0x857 fp=0x6b0f80
----- stack segment boundary -----
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc2100e6c00, 0xc2100e5750, ...)
        /Users/rsc/g/go/src/cmd/go/build.go:539 +0x437 fp=0x6b14a0
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc21015b400, 0x2, ...)
        /Users/rsc/g/go/src/cmd/go/build.go:528 +0x1d2 fp=0x6b1658
main.(*builder).test(0xc2100902a0, 0xc210092000, 0x0, 0x0, 0xc21008ff60, ...)
        /Users/rsc/g/go/src/cmd/go/test.go:622 +0x1b53 fp=0x6b1f68
----- stack segment boundary -----
main.runTest(0x5a6b20, 0xc21000a020, 0x2, 0x2)
        /Users/rsc/g/go/src/cmd/go/test.go:366 +0xd09 fp=0x6a5cf0
main.main()
        /Users/rsc/g/go/src/cmd/go/main.go:161 +0x4f9 fp=0x6a5f78
runtime.main()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:183 +0x92 fp=0x6a5fa0
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1266 fp=0x6a5fa8

And here is a seg fault during oldstack:

SIGSEGV: segmentation violation
PC=0x1b2a6

runtime.oldstack()
        /Users/rsc/g/go/src/pkg/runtime/stack.c:159 +0x76
runtime.lessstack()
        /Users/rsc/g/go/src/pkg/runtime/asm_amd64.s:270 +0x22

goroutine 1 [stack unsplit]:
fmt.(*pp).printArg(0x2102e64e0, 0xe5c80, 0x2102c9220, 0x73, 0x0, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:818 +0x3d3 fp=0x221031e6f8
fmt.(*pp).doPrintf(0x2102e64e0, 0x12fb20, 0x2, 0x221031eb98, 0x1, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:1183 +0x15cb fp=0x221031eaf0
fmt.Sprintf(0x12fb20, 0x2, 0x221031eb98, 0x1, 0x1, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:234 +0x67 fp=0x221031eb40
flag.(*stringValue).String(0x2102c9210, 0x1, 0x0)
        /Users/rsc/g/go/src/pkg/flag/flag.go:180 +0xb3 fp=0x221031ebb0
flag.(*FlagSet).Var(0x2102f6000, 0x293d38, 0x2102c9210, 0x143490, 0xa, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:633 +0x40 fp=0x221031eca0
flag.(*FlagSet).StringVar(0x2102f6000, 0x2102c9210, 0x143490, 0xa, 0x12fa60, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:550 +0x91 fp=0x221031ece8
flag.(*FlagSet).String(0x2102f6000, 0x143490, 0xa, 0x12fa60, 0x0, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:563 +0x87 fp=0x221031ed38
flag.String(0x143490, 0xa, 0x12fa60, 0x0, 0x161950, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:570 +0x6b fp=0x221031ed80
testing.init()
        /Users/rsc/g/go/src/pkg/testing/testing.go:-531 +0xbb fp=0x221031edc0
strings_test.init()
        /Users/rsc/g/go/src/pkg/strings/strings_test.go:1115 +0x62 fp=0x221031ef70
main.init()
        strings/_test/_testmain.go:90 +0x3d fp=0x221031ef78
runtime.main()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:180 +0x8a fp=0x221031efa0
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1269 fp=0x221031efa8

goroutine 2 [runnable]:
runtime.MHeap_Scavenger()
        /Users/rsc/g/go/src/pkg/runtime/mheap.c:438
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1269
created by runtime.main
        /Users/rsc/g/go/src/pkg/runtime/proc.c:166

rax     0x23ccc0
rbx     0x23ccc0
rcx     0x0
rdx     0x38
rdi     0x2102c0170
rsi     0x221032cfe0
rbp     0x221032cfa0
rsp     0x7fff5fbff5b0
r8      0x2102c0120
r9      0x221032cfa0
r10     0x221032c000
r11     0x104ce8
r12     0xe5c80
r13     0x1be82baac718
r14     0x13091135f7d69200
r15     0x0
rip     0x1b2a6
rflags  0x10246
cs      0x2b
fs      0x0
gs      0x0

Fixes #5723.

R=r, dvyukov, go.peter.90, dave, iant
CC=golang-dev
https://golang.org/cl/10360048
2013-06-27 11:32:01 -04:00
Russ Cox
1f51d27922 cmd/gc: move genembedtramp into portable code
Requires adding new linker instruction
        RET	f(SB)
meaning return but then immediately call f.
This is what you'd use to implement a tail call after
fiddling with the arguments, but the compiler only
uses it in genwrapper.

This CL eliminates the copy-and-paste genembedtramp
functions from 5g/8g/6g and makes the code run on ARM
for the first time. It removes a small special case for function
generation, which should help Carl a bit, but at the same time
it does not bother to implement general tail call optimization,
which we do not want anyway.

Fixes #5627.

R=ken2
CC=golang-dev
https://golang.org/cl/10057044
2013-06-11 09:41:49 -04:00
Shenghou Ma
faef52c214 all: fix typos
R=golang-dev, bradfitz, khr, r
CC=golang-dev
https://golang.org/cl/7461046
2013-06-09 21:50:24 +08:00
Carl Shapiro
acd887ba57 cmd/5g, cmd/6g, cmd/8g: remove prototypes for proglist
Each of the backends has two prototypes for this function but
no corresponding definition.

R=golang-dev, bradfitz, khr
CC=golang-dev
https://golang.org/cl/9930045
2013-06-04 16:22:59 -07:00
Carl Shapiro
31be5deae4 cmd/5g, cmd/6g, cmd/8g: provide embedded trampolines with argument size information
An embedded trampoline is a function that exists to marshal
a receiver of type *S to a receiver of type *T when T is an
embedded field in S.

Embedded trampolines are generated by a special path through
the compiler and are not subject to the general analysis and
annotation done to functions.  Their effects must be provided
explicitly.

R=golang-dev, r, daniel.morsing, minux.ma
CC=golang-dev
https://golang.org/cl/9874043
2013-05-31 13:34:57 -07:00
Carl Shapiro
4e0a51c210 cmd/5l, cmd/6l, cmd/8l, cmd/gc, runtime: generate and use bitmaps of argument pointer locations
With this change the compiler emits a bitmap for each function
covering its stack frame arguments area.  If an argument word
is known to contain a pointer, a bit is set.  The garbage
collector reads this information when scanning the stack by
frames and uses it to ignores locations known to not contain a
pointer.

R=golang-dev, bradfitz, daniel.morsing, dvyukov, khr, khr, iant, cshapiro
CC=golang-dev
https://golang.org/cl/9223046
2013-05-28 17:59:10 -07:00
Rob Pike
4dcb13bb44 cmd/gc: fix some overflows in the compiler
Some 64-bit fields were run through 32-bit words, some counts were
not checked for overflow, and relocations must fit in 32 bits.
Tests to follow.

R=golang-dev, dsymonds
CC=golang-dev
https://golang.org/cl/9033043
2013-04-29 22:44:40 -07:00
Ian Lance Taylor
578dc3a96c cmd/5g, cmd/6g, cmd/8g: more nil ptr to large struct checks
R=r, ken, khr, daniel.morsing
CC=dsymonds, golang-dev, rickyz
https://golang.org/cl/8925043
2013-04-24 08:13:01 -07:00
Russ Cox
d3c758d7d2 cmd/gc: implement method values
R=ken2, ken
CC=golang-dev
https://golang.org/cl/7546052
2013-03-20 17:11:09 -04:00
Rémy Oudompheng
4c203172a2 cmd/8g: fix code generation of int64(0) == int64(0).
The code would violate the contract of cmp64.

Fixes #5002.

R=rsc, golang-dev
CC=golang-dev
https://golang.org/cl/7593043
2013-03-07 21:47:45 +01:00
Russ Cox
aa3efb28f0 cmd/gc: can stop tracking gotype in regopt
Now that the type information is in TYPE instructions
that are not rewritten by the optimization passes,
we don't have to try to preserve the type information
(no longer) attached to MOV instructions.

R=ken2
CC=golang-dev
https://golang.org/cl/7402054
2013-02-25 16:11:34 -05:00
Russ Cox
1d5dc4fd48 cmd/gc: emit explicit type information for local variables
The type information is (and for years has been) included
as an extra field in the address chunk of an instruction.
Unfortunately, suppose there is a string at a+24(FP) and
we have an instruction reading its length. It will say:

        MOVQ x+32(FP), AX

and the type of *that* argument is int (not slice), because
it is the length being read. This confuses the picture seen
by debuggers and now, worse, by the garbage collector.

Instead of attaching the type information to all uses,
emit an explicit list of TYPE instructions with the information.
The TYPE instructions are no-ops whose only role is to
provide an address to attach type information to.

For example, this function:

        func f(x, y, z int) (a, b string) {
                return
        }

now compiles into:

        --- prog list "f" ---
        0000 (/Users/rsc/x.go:3) TEXT    f+0(SB),$0-56
        0001 (/Users/rsc/x.go:3) LOCALS  ,
        0002 (/Users/rsc/x.go:3) TYPE    x+0(FP){int},$8
        0003 (/Users/rsc/x.go:3) TYPE    y+8(FP){int},$8
        0004 (/Users/rsc/x.go:3) TYPE    z+16(FP){int},$8
        0005 (/Users/rsc/x.go:3) TYPE    a+24(FP){string},$16
        0006 (/Users/rsc/x.go:3) TYPE    b+40(FP){string},$16
        0007 (/Users/rsc/x.go:3) MOVQ    $0,b+40(FP)
        0008 (/Users/rsc/x.go:3) MOVQ    $0,b+48(FP)
        0009 (/Users/rsc/x.go:3) MOVQ    $0,a+24(FP)
        0010 (/Users/rsc/x.go:3) MOVQ    $0,a+32(FP)
        0011 (/Users/rsc/x.go:4) RET     ,

The { } show the formerly hidden type information.
The { } syntax is used when printing from within the gc compiler.
It is not accepted by the assemblers.

The same type information is now included on global variables:

0055 (/Users/rsc/x.go:15) GLOBL   slice+0(SB){[]string},$24(AL*0)

This more accurate type information fixes a bug in the
garbage collector's precise heap collection.

The linker only cares about globals right now, but having the
local information should make things a little nicer for Carl
in the future.

Fixes #4907.

R=ken2
CC=golang-dev
https://golang.org/cl/7395056
2013-02-25 12:13:47 -05:00
Russ Cox
9f647288ef cmd/gc: avoid runtime code generation for closures
Change ARM context register to R7, to get out of the way
of the register allocator during the compilation of the
prologue statements (it wants to use R0 as a temporary).

Step 2 of http://golang.org/s/go11func.

R=ken2
CC=golang-dev
https://golang.org/cl/7369048
2013-02-22 14:25:50 -05:00
Russ Cox
6066fdcf38 cmd/6g, cmd/8g: switch to DX for indirect call block
runtime: add context argument to gogocall

Too many other things use AX, and at least one
(stack zeroing) cannot be moved onto a different
register. Use the less special DX instead.

Preparation for step 2 of http://golang.org/s/go11func.
Nothing interesting here, just split out so that we can
see it's correct before moving on.

R=ken2
CC=golang-dev
https://golang.org/cl/7395050
2013-02-22 10:47:54 -05:00
Russ Cox
1903ad7189 cmd/gc, reflect, runtime: switch to indirect func value representation
Step 1 of http://golang.org/s/go11func.

R=golang-dev, r, daniel.morsing, remyoudompheng
CC=golang-dev
https://golang.org/cl/7393045
2013-02-21 17:01:13 -05:00
Robert Griesemer
3ee87d02b0 cmd/godoc: use go/build to determine package and example files
Also:
- faster code for example extraction
- simplify handling of command documentation:
  all "main" packages are treated as commands
- various minor cleanups along the way

For commands written in Go, any doc.go file containing
documentation must now be part of package main (rather
then package documentation), otherwise the documentation
won't show up in godoc (it will still build, though).

For commands written in C, documentation may still be
in doc.go files defining package documentation, but the
recommended way is to explicitly ignore those files with
a +build ignore constraint to define package main.

Fixes #4806.

R=adg, rsc, dave, bradfitz
CC=golang-dev
https://golang.org/cl/7333046
2013-02-19 11:19:58 -08:00
Russ Cox
ac1015e7f3 cmd/8g: fix sse2 compare code gen
Fixes #4785.

R=ken2
CC=golang-dev
https://golang.org/cl/7300109
2013-02-14 14:49:04 -05:00
Russ Cox
7594440ef1 cmd/8g: add a few missing splitclean
Fixes #887.

R=ken2
CC=golang-dev
https://golang.org/cl/7303061
2013-02-07 17:55:25 -05:00
Shenghou Ma
255fb521da cmd/dist: add -Wstrict-prototypes to CFLAGS and fix all the compiler errors
Plan 9 compilers insist this but as we don't have Plan 9
builders, we'd better let gcc check the prototypes.

Inspired by CL 7289050.

R=golang-dev, seed, dave, rsc, lucio.dere
CC=akumar, golang-dev
https://golang.org/cl/7288056
2013-02-05 21:43:04 +08:00
Russ Cox
2c09d6992f cmd/gc: slightly better code generation
* Avoid treating CALL fn(SB) as justification for introducing
and tracking a registerized variable for fn(SB).

* Remove USED(n) after declaration and zeroing of n.
It was left over from when the compiler emitted more
aggressive set and not used errors, and it was keeping
the optimizer from removing a redundant zeroing of n
when n was a pointer or integer variable.

Update #597.

R=ken2
CC=golang-dev
https://golang.org/cl/7277048
2013-02-03 14:51:21 -05:00
Anthony Martin
2bddbf5e8f cmd/8g, cmd/dist, cmd/gc: fix warnings on Plan 9
cmd/8g/gsubr.c: unreachable code
cmd/8g/reg.c: overspecifed class
cmd/dist/plan9.c: unused parameter
cmd/gc/fmt.c: stkdelta is now a vlong
cmd/gc/racewalk.c: used but not set

R=golang-dev, seed, rsc
CC=golang-dev
https://golang.org/cl/7067052
2013-01-18 19:08:00 -08:00
Daniel Morsing
b73a1a8e32 cmd/6g, cmd/8g: Allow optimization of return registers.
The peephole optimizer would keep hands off AX and X0 during returns, even though go doesn't return through registers.

R=dave, rsc
CC=golang-dev
https://golang.org/cl/7030046
2013-01-11 15:44:42 +01:00
Daniel Morsing
f1e4ee3f49 cmd/5g, cmd/6g, cmd/8g: flush return parameters in case of panic.
Fixes #4066.

R=rsc, minux.ma
CC=golang-dev
https://golang.org/cl/7040044
2013-01-04 17:07:21 +01:00
Rémy Oudompheng
cf77dd37e7 cmd/8g: extend elimination of temporaries to SSE2 code.
Before:
(erf.go:188)    TEXT     Erf+0(SB),$220
(erf.go:265)    TEXT     Erfc+0(SB),$204
(lgamma.go:174) TEXT     Lgamma+0(SB),$948

After:
(erf.go:188)    TEXT     Erf+0(SB),$84
(erf.go:265)    TEXT     Erfc+0(SB),$84
(lgamma.go:174) TEXT     Lgamma+0(SB),$44

SSE before vs. SSE after:

benchmark             old ns/op    new ns/op    delta
BenchmarkAcosh               81           49  -39.14%
BenchmarkAsinh              109          109   +0.00%
BenchmarkAtanh               73           74   +0.68%
BenchmarkLgamma             138           42  -69.20%
BenchmarkModf                24           15  -36.95%
BenchmarkSqrtGo             565          556   -1.59%

R=rsc
CC=golang-dev
https://golang.org/cl/7028048
2013-01-03 00:44:31 +01:00
Rémy Oudompheng
15f2c01f44 cmd/8g: fix possibly uninitialized variable in foptoas.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/7045043
2013-01-02 23:20:52 +01:00
Rémy Oudompheng
9afb34b42e cmd/dist, cmd/8g: implement GO386=387/sse to choose FPU flavour.
A new environment variable GO386 is introduced to choose between
code generation targeting 387 or SSE2. No auto-detection is
performed and the setting defaults to 387 to preserve previous
behaviour.

The patch is a reorganization of CL6549052 by rsc.

Fixes #3912.

R=minux.ma, rsc
CC=golang-dev
https://golang.org/cl/6962043
2013-01-02 22:55:23 +01:00
Russ Cox
e431398e09 undo CL 6938073 / 1542912cf09d
remove zerostack compiler experiment; will do at link time instead

««« original CL description
cmd/gc: add GOEXPERIMENT=zerostack to clear stack on function entry

This is expensive but it might be useful in cases where
people are suffering from false positives during garbage
collection and are willing to trade the CPU time for getting
rid of the false positives.

On the other hand it only eliminates false positives caused
by other function calls, not false positives caused by dead
temporaries stored in the current function call.

The 5g/6g/8g changes were pulled out of the history, from
the last time we needed to do this (to work around a goto bug).
The code in go.h, lex.c, pgen.c is new but tiny.

R=ken2
CC=golang-dev
https://golang.org/cl/6938073
»»»

R=ken2
CC=golang-dev
https://golang.org/cl/7002051
2012-12-22 11:18:04 -05:00
Rémy Oudompheng
4d3cbfdefa cmd/8g: introduce temporaries in byte multiplication.
Also restore the smallintconst case for binary ops.

Fixes #3835.

R=daniel.morsing, rsc
CC=golang-dev
https://golang.org/cl/6999043
2012-12-21 23:46:16 +01:00
Russ Cox
b7603cfc2c cmd/gc: add GOEXPERIMENT=zerostack to clear stack on function entry
This is expensive but it might be useful in cases where
people are suffering from false positives during garbage
collection and are willing to trade the CPU time for getting
rid of the false positives.

On the other hand it only eliminates false positives caused
by other function calls, not false positives caused by dead
temporaries stored in the current function call.

The 5g/6g/8g changes were pulled out of the history, from
the last time we needed to do this (to work around a goto bug).
The code in go.h, lex.c, pgen.c is new but tiny.

R=ken2
CC=golang-dev
https://golang.org/cl/6938073
2012-12-17 14:32:26 -05:00
Dave Cheney
b2797f2ae0 cmd/{5,6,8}g: reduce size of Prog and Addr
5g: Prog went from 128 bytes to 88 bytes
6g: Prog went from 174 bytes to 144 bytes
8g: Prog went from 124 bytes to 92 bytes

There may be a little more that can be squeezed out of Addr, but alignment will be a factor.

All: remove the unused pun field from Addr

R=rsc, minux.ma
CC=golang-dev
https://golang.org/cl/6922048
2012-12-14 06:20:24 +11:00
Rémy Oudompheng
a617d06252 cmd/6g, cmd/8g: simplify integer division code.
Change suggested by iant. The compiler generates
special code for a/b when a is -0x80...0 and b = -1.
A single instruction can cover the case where b is -1,
so only one comparison is needed.

Fixes #3551.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6922049
2012-12-12 08:35:08 +01:00
Rémy Oudompheng
45fe306ac8 cmd/[568]g: recycle ONAME nodes used in regopt to denote registers.
The reported decrease in memory usage is about 5%.

R=golang-dev, dave, rsc
CC=golang-dev
https://golang.org/cl/6902064
2012-12-09 19:10:52 +01:00
Rémy Oudompheng
f134742f24 cmd/5g, cmd/8g: fix internal error on 64-bit indices statically bounded
Fixes #4448.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6855100
2012-11-27 21:37:38 +01:00
Rémy Oudompheng
4cc9de9147 cmd/gc: add division rewrite to walk pass.
This allows 5g and 8g to benefit from the rewrite as shifts
or magic multiplies. The 64-bit arithmetic is not handled there,
and left in 6g.

Update #2230.

R=golang-dev, dave, mtj, iant, rsc
CC=golang-dev
https://golang.org/cl/6819123
2012-11-26 23:45:22 +01:00
Rémy Oudompheng
1bd4a7dbcb cmd/8g: fix erroneous LEAL nil.
Fixes #4399.

R=golang-dev, nigeltao
CC=golang-dev
https://golang.org/cl/6845053
2012-11-21 08:39:45 +01:00
Rémy Oudompheng
fa316ba4d8 cmd/8g: eliminate obviously useless temps before regopt.
This patch introduces a sort of pre-regopt peephole optimization.
When a temporary is introduced that just holds a value for the
duration of the next instruction and is otherwise unused, we
elide it to make the job of regopt easier.

Since x86 has very few registers, this situation happens very
often. The result is large savings in stack variables for
arithmetic-heavy functions.

crypto/aes

benchmark                 old ns/op    new ns/op    delta
BenchmarkEncrypt               1301          392  -69.87%
BenchmarkDecrypt               1309          368  -71.89%
BenchmarkExpand                2913         1036  -64.44%
benchmark                  old MB/s     new MB/s  speedup
BenchmarkEncrypt              12.29        40.74    3.31x
BenchmarkDecrypt              12.21        43.37    3.55x

crypto/md5

benchmark                 old ns/op    new ns/op    delta
BenchmarkHash8Bytes            1761          914  -48.10%
BenchmarkHash1K               16912         5570  -67.06%
BenchmarkHash8K              123895        38286  -69.10%
benchmark                  old MB/s     new MB/s  speedup
BenchmarkHash8Bytes            4.54         8.75    1.93x
BenchmarkHash1K               60.55       183.83    3.04x
BenchmarkHash8K               66.12       213.97    3.24x

bench/go1

benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    8364835000   8303154000   -0.74%
BenchmarkFannkuch11      7511723000   6381729000  -15.04%
BenchmarkGobDecode         27764090     27103270   -2.38%
BenchmarkGobEncode         11240880     11184370   -0.50%
BenchmarkGzip            1470224000    856668400  -41.73%
BenchmarkGunzip           240660800    201697300  -16.19%
BenchmarkJSONEncode       155225800    185571900  +19.55%
BenchmarkJSONDecode       243347900    282123000  +15.93%
BenchmarkMandelbrot200     12240970     12201880   -0.32%
BenchmarkParse              8837445      8765210   -0.82%
BenchmarkRevcomp         2556310000   1868566000  -26.90%
BenchmarkTemplate         389298000    379792000   -2.44%
benchmark                  old MB/s     new MB/s  speedup
BenchmarkGobDecode            27.64        28.32    1.02x
BenchmarkGobEncode            68.28        68.63    1.01x
BenchmarkGzip                 13.20        22.65    1.72x
BenchmarkGunzip               80.63        96.21    1.19x
BenchmarkJSONEncode           12.50        10.46    0.84x
BenchmarkJSONDecode            7.97         6.88    0.86x
BenchmarkParse                 6.55         6.61    1.01x
BenchmarkRevcomp              99.43       136.02    1.37x
BenchmarkTemplate              4.98         5.11    1.03x

Fixes #4035.

R=golang-dev, minux.ma, rsc
CC=golang-dev
https://golang.org/cl/6828056
2012-11-13 07:39:18 +01:00
Rémy Oudompheng
7c0cbbfa18 cmd/6g, cmd/8g: mark used registers in indirect addressing.
Fixes #4094.
Fixes #4353.

R=golang-dev, dave, rsc
CC=golang-dev
https://golang.org/cl/6810090
2012-11-07 21:36:15 +01:00
Russ Cox
3d40062c68 cmd/gc, cmd/ld: struct field tracking
This is an experiment in static analysis of Go programs
to understand which struct fields a program might use.
It is not part of the Go language specification, it must
be enabled explicitly when building the toolchain,
and it may be removed at any time.

After building the toolchain with GOEXPERIMENT=fieldtrack,
a specific field can be marked for tracking by including
`go:"track"` in the field tag:

        package pkg

        type T struct {
                F int `go:"track"`
                G int // untracked
        }

To simplify usage, only named struct types can have
tracked fields, and only exported fields can be tracked.

The implementation works by making each function begin
with a sequence of no-op USEFIELD instructions declaring
which tracked fields are accessed by a specific function.
After the linker's dead code elimination removes unused
functions, the fields referred to by the remaining
USEFIELD instructions are the ones reported as used by
the binary.

The -k option to the linker specifies the fully qualified
symbol name (such as my/pkg.list) of a string variable that
should be initialized with the field tracking information
for the program. The field tracking string is a sequence
of lines, each terminated by a \n and describing a single
tracked field referred to by the program. Each line is made
up of one or more tab-separated fields. The first field is
the name of the tracked field, fully qualified, as in
"my/pkg.T.F". Subsequent fields give a shortest path of
reverse references from that field to a global variable or
function, corresponding to one way in which the program
might reach that field.

A common source of false positives in field tracking is
types with large method sets, because a reference to the
type descriptor carries with it references to all methods.
To address this problem, the CL also introduces a comment
annotation

        //go:nointerface

that marks an upcoming method declaration as unavailable
for use in satisfying interfaces, both statically and
dynamically. Such a method is also invisible to package
reflect.

Again, all of this is disabled by default. It only turns on
if you have GOEXPERIMENT=fieldtrack set during make.bash.

R=iant, ken
CC=golang-dev
https://golang.org/cl/6749064
2012-11-02 00:17:21 -04:00
Rémy Oudompheng
022b361ae2 cmd/5g, cmd/6g, cmd/8g: remove width check for componentgen.
The move to 64-bit ints in 6g made componentgen ineffective.
In componentgen, the code already selects which values it can handle.

On amd64:
benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    9477970000   9582314000   +1.10%
BenchmarkFannkuch11      5928750000   5255080000  -11.36%
BenchmarkGobDecode         37103040     31451120  -15.23%
BenchmarkGobEncode         16042490     16844730   +5.00%
BenchmarkGzip             811337400    741373600   -8.62%
BenchmarkGunzip           197928700    192844500   -2.57%
BenchmarkJSONEncode       224164100    140064200  -37.52%
BenchmarkJSONDecode       258346800    231829000  -10.26%
BenchmarkMandelbrot200      7561780      7601615   +0.53%
BenchmarkParse             12970340     11624360  -10.38%
BenchmarkRevcomp         1969917000   1699137000  -13.75%
BenchmarkTemplate         296182000    263117400  -11.16%

R=nigeltao, dave, daniel.morsing
CC=golang-dev
https://golang.org/cl/6821052
2012-11-01 14:36:08 +01:00
Rémy Oudompheng
7e144bcab0 cmd/5g, cmd/6g, cmd/8g: fix out of registers.
This patch is enough to fix compilation of
exp/types tests but only passes a stripped down
version of the appripriate torture test.

Update #4207.

R=dave, nigeltao, rsc, golang-dev
CC=golang-dev
https://golang.org/cl/6621061
2012-10-16 07:22:33 +02:00
Rémy Oudompheng
2de064b63c cmd/8g: do not take the address of string/slice for &s[i]
A similar change was made in 6g recently.

LEALs in cmd/go: 31440 before, 27867 after.

benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    7065794000   6723617000   -4.84%
BenchmarkFannkuch11      7767395000   7477945000   -3.73%
BenchmarkGobDecode         34708140     34857820   +0.43%
BenchmarkGobEncode         10998780     10960060   -0.35%
BenchmarkGzip            1603630000   1471052000   -8.27%
BenchmarkGunzip           242573900    240650400   -0.79%
BenchmarkJSONEncode       120842200    117966100   -2.38%
BenchmarkJSONDecode       247254900    249103100   +0.75%
BenchmarkMandelbrot200     29237330     29241790   +0.02%
BenchmarkParse              8111320      8096865   -0.18%
BenchmarkRevcomp         2595780000   2694153000   +3.79%
BenchmarkTemplate         276679600    264497000   -4.40%

benchmark                              old ns/op    new ns/op    delta
BenchmarkAppendFloatDecimal                  429          416   -3.03%
BenchmarkAppendFloat                         780          740   -5.13%
BenchmarkAppendFloatExp                      746          700   -6.17%
BenchmarkAppendFloatNegExp                   752          694   -7.71%
BenchmarkAppendFloatBig                     1228         1108   -9.77%
BenchmarkAppendFloat32Integer                457          416   -8.97%
BenchmarkAppendFloat32ExactFraction          662          631   -4.68%
BenchmarkAppendFloat32Point                  771          735   -4.67%
BenchmarkAppendFloat32Exp                    722          672   -6.93%
BenchmarkAppendFloat32NegExp                 724          659   -8.98%
BenchmarkAppendFloat64Fixed1                 429          400   -6.76%
BenchmarkAppendFloat64Fixed2                 463          442   -4.54%

Update #1914.

R=golang-dev, daniel.morsing, rsc
CC=golang-dev
https://golang.org/cl/6574043
2012-10-02 08:19:27 +02:00
Rémy Oudompheng
617b7cf166 cmd/[568]g: header cleanup.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6573059
2012-09-27 08:34:00 +02:00
Rémy Oudompheng
6feb61325a cmd/6g, cmd/8g: fix two "out of fixed registers" cases.
In two cases, registers were allocated too early resulting
in exhausting of available registers when nesting these
operations.

The case of method calls was due to missing cases in igen,
which only makes calls but doesn't allocate a register for
the result.

The case of 8-bit multiplication was due to a wrong order
in register allocation when Ullman numbers were bigger on the
RHS.

Fixes #3907.
Fixes #4156.

R=rsc
CC=golang-dev, remy
https://golang.org/cl/6560054
2012-09-26 21:17:11 +02:00
Rémy Oudompheng
33cceb09e2 cmd/{5g,6g,8g,6c}: remove unused macro, use %E to print etype.
R=golang-dev, rsc, dave
CC=golang-dev
https://golang.org/cl/6569044
2012-09-24 23:44:00 +02:00
Rémy Oudompheng
f4e76d5e02 cmd/6g, cmd/8g: add OINDREG, ODOT, ODOTPTR cases to igen.
Apart from reducing the number of LEAL/LEAQ instructions by about
30%, it gives 8g easier registerization in several cases,
for example in strconv. Performance with 6g is not affected.

Before (386):
src/pkg/strconv/decimal.go:22   TEXT  (*decimal).String+0(SB),$240-12
src/pkg/strconv/extfloat.go:540 TEXT  (*extFloat).ShortestDecimal+0(SB),$584-20

After (386):
src/pkg/strconv/decimal.go:22   TEXT  (*decimal).String+0(SB),$196-12
src/pkg/strconv/extfloat.go:540 TEXT  (*extFloat).ShortestDecimal+0(SB),$420-20

Benchmarks with GOARCH=386 (on a Core 2).

benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    7110191000   7079644000   -0.43%
BenchmarkFannkuch11      7769274000   7766514000   -0.04%
BenchmarkGobDecode         33454820     34755400   +3.89%
BenchmarkGobEncode         11675710     11007050   -5.73%
BenchmarkGzip            2013519000   1593855000  -20.84%
BenchmarkGunzip           253368200    242667600   -4.22%
BenchmarkJSONEncode       152443900    120763400  -20.78%
BenchmarkJSONDecode       304112800    247461800  -18.63%
BenchmarkMandelbrot200     29245520     29240490   -0.02%
BenchmarkParse              8484105      8088660   -4.66%
BenchmarkRevcomp         2695688000   2841263000   +5.40%
BenchmarkTemplate         363759800    277271200  -23.78%

benchmark                       old ns/op    new ns/op    delta
BenchmarkAtof64Decimal                127          129   +1.57%
BenchmarkAtof64Float                  166          164   -1.20%
BenchmarkAtof64FloatExp               308          300   -2.60%
BenchmarkAtof64Big                    584          571   -2.23%
BenchmarkAppendFloatDecimal           440          430   -2.27%
BenchmarkAppendFloat                  995          776  -22.01%
BenchmarkAppendFloatExp               897          746  -16.83%
BenchmarkAppendFloatNegExp            900          752  -16.44%
BenchmarkAppendFloatBig              1528         1228  -19.63%
BenchmarkAppendFloat32Integer         443          453   +2.26%
BenchmarkAppendFloat32ExactFraction   812          661  -18.60%
BenchmarkAppendFloat32Point          1002          773  -22.85%
BenchmarkAppendFloat32Exp             858          725  -15.50%
BenchmarkAppendFloat32NegExp          848          728  -14.15%
BenchmarkAppendFloat64Fixed1          447          431   -3.58%
BenchmarkAppendFloat64Fixed2          480          462   -3.75%
BenchmarkAppendFloat64Fixed3          461          457   -0.87%
BenchmarkAppendFloat64Fixed4          509          484   -4.91%

Update #1914.

R=rsc, nigeltao
CC=golang-dev, remy
https://golang.org/cl/6494107
2012-09-24 23:07:44 +02:00
Rémy Oudompheng
14f3276c9d cmd/8g: don't create redundant temporaries in bgen.
Comparisons used to create temporaries for arguments
even if they were already variables or addressable.
Removing the extra ones reduces pressure on regopt.

benchmark                 old ns/op    new ns/op    delta
BenchmarkGobDecode         50787620     49908980   -1.73%
BenchmarkGobEncode         19870190     19473030   -2.00%
BenchmarkGzip            3214321000   3067929000   -4.55%
BenchmarkGunzip           496792800    465828600   -6.23%
BenchmarkJSONEncode       232524800    263864400  +13.48%
BenchmarkJSONDecode       622038400    506600600  -18.56%
BenchmarkMandelbrot200     23937310     45913060  +91.81%
BenchmarkParse             14364450     13997010   -2.56%
BenchmarkRevcomp         6919028000   6480009000   -6.35%
BenchmarkTemplate         594458800    539528200   -9.24%

benchmark                  old MB/s     new MB/s  speedup
BenchmarkGobDecode            15.11        15.38    1.02x
BenchmarkGobEncode            38.63        39.42    1.02x
BenchmarkGzip                  6.04         6.33    1.05x
BenchmarkGunzip               39.06        41.66    1.07x
BenchmarkJSONEncode            8.35         7.35    0.88x
BenchmarkJSONDecode            3.12         3.83    1.23x
BenchmarkParse                 4.03         4.14    1.03x
BenchmarkRevcomp              36.73        39.22    1.07x
BenchmarkTemplate              3.26         3.60    1.10x

R=mtj, daniel.morsing, rsc
CC=golang-dev
https://golang.org/cl/6547064
2012-09-24 21:29:24 +02:00
Russ Cox
650160e36a cmd/gc: prepare for 64-bit ints
This CL makes the compiler understand that the type of
the len or cap of a map, slice, or string is 'int', not 'int32'.
It does not change the meaning of int, but it should make
the eventual change of the meaning of int in 6g a bit smoother.

Update #2188.

R=ken, dave, remyoudompheng
CC=golang-dev
https://golang.org/cl/6542059
2012-09-24 14:59:44 -04:00
Rémy Oudompheng
5e3fb887a3 cmd/[568]g: explain the purpose of various Reg fields.
R=golang-dev, rsc
CC=golang-dev, remy
https://golang.org/cl/6554062
2012-09-24 20:55:11 +02:00
Russ Cox
05ac300830 cmd/gc: fix use of nil interface, slice
Fixes #3670.

R=ken2
CC=golang-dev
https://golang.org/cl/6542058
2012-09-22 20:42:11 -04:00
Russ Cox
658482d70f cmd/5g: fix register opt bug
The width was not being set on the address, which meant
that the optimizer could not find variables that overlapped
with it and mark them as having had their address taken.
This let to the compiler believing variables had been set
but never used and then optimizing away the set.

Fixes #4129.

R=ken2
CC=golang-dev
https://golang.org/cl/6552059
2012-09-22 10:01:35 -04:00
Lucio De Re
b29ed23ab5 build: fix various 'set and not used' for Plan 9
R=dave, minux.ma, rsc
CC=golang-dev
https://golang.org/cl/6501134
2012-09-17 17:25:26 -04:00
Nigel Tao
a9a675ec35 cmd/6g, cmd/8g: clean up unnecessary switch code in componentgen.
Code higher up in the function already catches these cases.

R=remyoudompheng, rsc
CC=golang-dev
https://golang.org/cl/6496106
2012-09-12 21:47:05 +10:00
Rémy Oudompheng
b45b6fd1c7 cmd/6g, cmd/8g: do not LEA[LQ] interfaces when calling methods.
It is enough to load directly the data word and the itab word
from memory, so we save a LEA instruction for each method call,
and allow elimination of some extra temporaries.

Update #1914.

R=daniel.morsing, rsc
CC=golang-dev, remy
https://golang.org/cl/6501110
2012-09-11 08:45:23 +02:00
Rémy Oudompheng
ff642e290f cmd/6g, cmd/8g: eliminate extra agen for nil comparisons.
Removes an extra LEAL/LEAQ instructions there and usually saves
a useless temporary in the idiom
    if err := foo(); err != nil {...}

Generated code is also less involved:
    MOVQ err+n(SP), AX
    CMPQ AX, $0
(potentially CMPQ n(SP), $0) instead of
    LEAQ err+n(SP), AX
    CMPQ (AX), $0

Update #1914.

R=daniel.morsing, nigeltao, rsc
CC=golang-dev, remy
https://golang.org/cl/6493099
2012-09-11 08:08:40 +02:00
Rémy Oudompheng
ae0862c1ec cmd/8g: import componentgen from 6g.
This makes the compilers code more similar and improves
code generation a lot.

The number of LEAL instructions generated for cmd/go drops
by 60%.

% GOARCH=386 go build -gcflags -S -a cmd/go | grep LEAL | wc -l
Before:       89774
After:        47548

benchmark                              old ns/op    new ns/op    delta
BenchmarkAppendFloatDecimal                  540          444  -17.78%
BenchmarkAppendFloat                        1160         1035  -10.78%
BenchmarkAppendFloatExp                     1060          922  -13.02%
BenchmarkAppendFloatNegExp                  1053          920  -12.63%
BenchmarkAppendFloatBig                     1773         1558  -12.13%
BenchmarkFormatInt                         13065        12481   -4.47%
BenchmarkAppendInt                         10981         9900   -9.84%
BenchmarkFormatUint                         3804         3650   -4.05%
BenchmarkAppendUint                         3506         3303   -5.79%
BenchmarkUnquoteEasy                         714          683   -4.34%
BenchmarkUnquoteHard                        5117         2915  -43.03%

Update #1914.

R=nigeltao, rsc, golang-dev
CC=golang-dev, remy
https://golang.org/cl/6489067
2012-09-09 20:30:08 +02:00
Rémy Oudompheng
8f3c2055bd cmd/6g, cmd/8g: eliminate short integer arithmetic when possible.
Fixes #3909.
Fixes #3910.

R=rsc, nigeltao
CC=golang-dev
https://golang.org/cl/6442114
2012-09-01 16:40:54 +02:00
Nigel Tao
251199c430 cmd/8g: roll back the small integer constant optimizations introduced
in 13416:67c0b8c8fb29 "faster code, mainly for rotate" [1]. The codegen
can run out of registers if there are too many small-int arithmetic ops.

An alternative approach is to copy 6g's sbop/abop codegen to 8g, but
this change is less risky.

Fixes #3835.

[1] http://code.google.com/p/go/source/diff?spec=svn67c0b8c8fb29b1b7b6221977af6b89cae787b941&name=67c0b8c8fb29&r=67c0b8c8fb29b1b7b6221977af6b89cae787b941&format=side&path=/src/cmd/8g/cgen.c

R=rsc, remyoudompheng, r
CC=golang-dev
https://golang.org/cl/6450163
2012-08-23 16:17:22 +10:00
Rémy Oudompheng
823962c521 cmd/8g: fix miscompilation due to BADWIDTH.
Fixes #3899.

R=rsc
CC=golang-dev, remy
https://golang.org/cl/6453084
2012-08-03 22:05:51 +02:00
Nigel Tao
18e86644a3 cmd/gc: cache itab lookup in convT2I.
There may be further savings if convT2I can avoid the function call
if the cache is good and T is uintptr-shaped, a la convT2E, but that
will be a follow-up CL.

src/pkg/runtime:
benchmark                  old ns/op    new ns/op    delta
BenchmarkConvT2ISmall             43           15  -64.01%
BenchmarkConvT2IUintptr           45           14  -67.48%
BenchmarkConvT2ILarge            130          101  -22.31%

test/bench/go1:
benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    8588997000   8499058000   -1.05%
BenchmarkFannkuch11      5300392000   5358093000   +1.09%
BenchmarkGobDecode         30295580     31040190   +2.46%
BenchmarkGobEncode         18102070     17675650   -2.36%
BenchmarkGzip             774191400    771591400   -0.34%
BenchmarkGunzip           245915100    247464100   +0.63%
BenchmarkJSONEncode       123577000    121423050   -1.74%
BenchmarkJSONDecode       451969800    596256200  +31.92%
BenchmarkMandelbrot200     10060050     10072880   +0.13%
BenchmarkParse             10989840     11037710   +0.44%
BenchmarkRevcomp         1782666000   1716864000   -3.69%
BenchmarkTemplate         798286600    723234400   -9.40%

R=rsc, bradfitz, go.peter.90, daniel.morsing, dave, uriel
CC=golang-dev
https://golang.org/cl/6337058
2012-07-03 09:09:05 +10:00
Nigel Tao
8f84328fdc cmd/gc: inline convT2E when T is uintptr-shaped.
GOARCH=amd64 benchmarks

src/pkg/runtime
benchmark                  old ns/op    new ns/op    delta
BenchmarkConvT2ESmall             10           10   +1.00%
BenchmarkConvT2EUintptr            9            0  -92.07%
BenchmarkConvT2EBig               74           74   -0.27%
BenchmarkConvT2I                  27           26   -3.62%
BenchmarkConvI2E                   4            4   -7.05%
BenchmarkConvI2I                  20           19   -2.99%

test/bench/go1
benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    5930908000   5937260000   +0.11%
BenchmarkFannkuch11      3927057000   3933556000   +0.17%
BenchmarkGobDecode         21998090     21870620   -0.58%
BenchmarkGobEncode         12725310     12734480   +0.07%
BenchmarkGzip             567617600    567892800   +0.05%
BenchmarkGunzip           178284100    178706900   +0.24%
BenchmarkJSONEncode        87693550     86794300   -1.03%
BenchmarkJSONDecode       314212600    324115000   +3.15%
BenchmarkMandelbrot200      7016640      7073766   +0.81%
BenchmarkParse              7852100      7892085   +0.51%
BenchmarkRevcomp         1285663000   1286147000   +0.04%
BenchmarkTemplate         566823800    567606200   +0.14%

I'm not entirely sure why the JSON* numbers have changed, but
eyeballing the profile suggests that it could be spending less
and more time in runtime.{new,old}stack, so it could simply be
stack-split boundary noise.

R=rsc, dave, bsiegert, dsymonds
CC=golang-dev
https://golang.org/cl/6280049
2012-06-14 10:43:20 +10:00