1
0
mirror of https://github.com/golang/go synced 2024-11-12 12:40:28 -07:00
Commit Graph

310 Commits

Author SHA1 Message Date
Russ Cox
4acb70d377 cmd/gc: bypass DATA instruction for data initialized to integer constant
Eventually we will want to bypass DATA for everything,
but the relocations are not standardized well enough across
architectures to make that possible.

This did not help as much as I expected, but it is definitely better.
It shaves maybe 1-2% off all.bash depending on how much you
trust the timings of a single run:

Before: 241.139r 362.702u 112.967s
After:  234.339r 359.623u 111.045s

R=golang-codereviews, gobot, r, iant
CC=golang-codereviews
https://golang.org/cl/44650043
2013-12-20 14:24:39 -05:00
Russ Cox
870e821ded cmd/cc, cmd/gc: update compilers, assemblers for liblink changes
- add buffered stdout to all tools and provide to link ctxt.
- avoid extra \n before ! in .6 files written by assemblers
  (makes them match the C compilers).
- use linkwriteobj instead of linkouthist+linkwritefuncs.
- in assemblers and C compilers, record pc explicitly in Prog,
  for use by liblink.
- in C compilers, preserve jump target links.
- in Go compilers (gsubr.c) attach gotype directly to
  corresponding LSym* instead of rederiving from instruction stream.
- in Go compilers, emit just one definition for runtime.zerovalue
  from each compilation.

This CL consists entirely of small adjustments.
The heavy lifting is in CL 39680043.
Each depends on the other.

R=golang-dev, dave, iant
CC=golang-dev
https://golang.org/cl/37030045
2013-12-16 12:51:38 -05:00
Russ Cox
f606c1be80 cmd/5g, cmd/6g, cmd/8g: use liblink
Preparation for golang.org/s/go13linker work.

This CL does not build by itself. It depends on 35740044
and 35790044 and will be submitted at the same time.

R=iant
CC=golang-dev
https://golang.org/cl/34590045
2013-12-08 22:51:55 -05:00
Carl Shapiro
f056daf075 cmd/5g, cmd/5l, cmd/6g, cmd/6l, cmd/8g, cmd/8l, cmd/gc, runtime: generate pointer maps by liveness analysis
This change allows the garbage collector to examine stack
slots that are determined as live and containing a pointer
value by the garbage collector.  This results in a mean
reduction of 65% in the number of stack slots scanned during
an invocation of "GOGC=1 all.bash".

Unfortunately, this does not yet allow garbage collection to
be precise for the stack slots computed as live.  Pointers
confound the determination of what definitions reach a given
instruction.  In general, this problem is not solvable without
runtime cost but some advanced cooperation from the compiler
might mitigate common cases.

R=golang-dev, rsc, cshapiro
CC=golang-dev
https://golang.org/cl/14430048
2013-12-05 17:35:22 -08:00
Russ Cox
aa0439ba65 cmd/gc: eliminate redundant &x.Field nil checks
This eliminates ~75% of the nil checks being emitted,
on all architectures. We can do better, but we need
a bit more general support from the compiler, and
I don't want to do that so close to Go 1.2.
What's here is simple but effective and safe.

A few small code generation cleanups were required
to make the analysis consistent on all systems about
which nil checks are omitted, at least in the test.

Fixes #6019.

R=ken2
CC=golang-dev
https://golang.org/cl/13334052
2013-09-17 16:54:22 -04:00
Rémy Oudompheng
ff416a3f19 cmd/gc: inline copy in frontend to call memmove directly.
A new node type OSPTR is added to refer to the data pointer of
strings and slices in a simple way during walk(). It will be
useful for future work on simplification of slice arithmetic.

benchmark                  old ns/op    new ns/op    delta
BenchmarkCopy1Byte                 9            8  -13.98%
BenchmarkCopy2Byte                14            8  -40.49%
BenchmarkCopy4Byte                13            8  -35.04%
BenchmarkCopy8Byte                13            8  -37.10%
BenchmarkCopy12Byte               14           12  -15.38%
BenchmarkCopy16Byte               14           12  -17.24%
BenchmarkCopy32Byte               19           14  -27.32%
BenchmarkCopy128Byte              31           26  -15.29%
BenchmarkCopy1024Byte            100           92   -7.50%
BenchmarkCopy1String              10            7  -28.99%
BenchmarkCopy2String              10            7  -28.06%
BenchmarkCopy4String              10            8  -22.69%
BenchmarkCopy8String              10            8  -23.30%
BenchmarkCopy12String             11           11   -5.88%
BenchmarkCopy16String             11           11   -5.08%
BenchmarkCopy32String             15           14   -6.58%
BenchmarkCopy128String            28           25  -10.60%
BenchmarkCopy1024String           95           95   +0.53%

R=golang-dev, bradfitz, cshapiro, dave, daniel.morsing, rsc, khr, khr
CC=golang-dev
https://golang.org/cl/9101048
2013-09-12 00:15:28 +02:00
Russ Cox
6d47de2f40 cmd/5g, cmd/6g, cmd/8g: remove O(n) reset loop in copyprop
Simpler version of CL 13084043.

R=ken2
CC=golang-dev
https://golang.org/cl/13602045
2013-09-11 15:22:11 -04:00
Russ Cox
a0bc379d46 undo CL 13084043 / ef4ee02a5853
There is a cleaner, simpler way.

««« original CL description
cmd/5g, cmd/6g, cmd/8g: faster compilation
Replace linked list walk with memset.
This reduces CPU time taken by 'go install -a std' by ~10%.
Before:
real		user		sys
0m23.561s	0m16.625s	0m5.848s
0m23.766s	0m16.624s	0m5.846s
0m23.742s	0m16.621s	0m5.868s
after:
0m22.714s	0m14.858s	0m6.138s
0m22.644s	0m14.875s	0m6.120s
0m22.604s	0m14.854s	0m6.081s

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/13084043
»»»

TBR=dvyukov
CC=golang-dev
https://golang.org/cl/13352049
2013-09-11 15:14:11 -04:00
Russ Cox
af2a3193af cmd/5g, cmd/6g, cmd/8g: simplify for loop in bitmap generation
Lucio De Re reports that the more complex
loop miscompiles on Plan 9.

R=ken2
CC=golang-dev
https://golang.org/cl/13602043
2013-09-06 16:49:11 -04:00
Dmitriy Vyukov
79dca0327e libbio, all cmd: consistently use BGETC/BPUTC instead of Bgetc/Bputc
Also introduce BGET2/4, BPUT2/4 as they are widely used.
Slightly improve BGETC/BPUTC implementation.
This gives ~5% CPU time improvement on go install -a -p1 std.
Before:
real		user		sys
0m23.561s	0m16.625s	0m5.848s
0m23.766s	0m16.624s	0m5.846s
0m23.742s	0m16.621s	0m5.868s
after:
0m22.999s	0m15.841s	0m5.889s
0m22.845s	0m15.808s	0m5.850s
0m22.889s	0m15.832s	0m5.848s

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/12745047
2013-08-30 15:46:12 +04:00
Rémy Oudompheng
4fc7ff497d cmd/5g: avoid clash between R13 and F3 registers.
Fixes #6247.

R=golang-dev, lucio.dere, bradfitz
CC=golang-dev
https://golang.org/cl/13216043
2013-08-27 21:09:16 +02:00
Dmitriy Vyukov
06e686def6 cmd/5g, cmd/6g, cmd/8g: faster compilation
Replace linked list walk with memset.
This reduces CPU time taken by 'go install -a std' by ~10%.
Before:
real		user		sys
0m23.561s	0m16.625s	0m5.848s
0m23.766s	0m16.624s	0m5.846s
0m23.742s	0m16.621s	0m5.868s
after:
0m22.714s	0m14.858s	0m6.138s
0m22.644s	0m14.875s	0m6.120s
0m22.604s	0m14.854s	0m6.081s

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/13084043
2013-08-21 14:20:28 +04:00
Russ Cox
3b4d792606 cmd/gc: separate "has pointers" from "needs zeroing" in stack frame
When the new call site-specific frame bitmaps are available,
we can cut the zeroing to just those values that need it due
to scope escaping.

R=cshapiro, cshapiro
CC=golang-dev
https://golang.org/cl/13045043
2013-08-16 21:45:59 -04:00
Carl Shapiro
d3b04f46b5 cmd/5g, cmd/6g, cmd/8g: update frame zeroing for new bitmap format
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12740046
2013-08-16 01:15:04 -04:00
Russ Cox
999a36f9af cmd/gc: &x panics if x does
See golang.org/s/go12nil.

This CL is about getting all the right checks inserted.
A followup CL will add an optimization pass to
remove redundant checks.

R=ken2
CC=golang-dev
https://golang.org/cl/12970043
2013-08-15 14:38:32 -04:00
Rémy Oudompheng
0d9db86f74 cmd/5g, cmd/6g, cmd/8g: restore occurrences of R replaced by nil in comments.
R=golang-dev, bradfitz, rsc
CC=golang-dev
https://golang.org/cl/12842043
2013-08-14 21:24:48 +02:00
Russ Cox
72636eb506 cmd/5g: fix temp-merging on ARM
mkvar was taking care of the "LeftAddr" case,
effectively hiding it from the temp-merging optimization.

Move it into prog.c.

R=ken2
CC=golang-dev
https://golang.org/cl/12884045
2013-08-14 00:34:18 -04:00
Russ Cox
fa72679f07 cmd/gc: add temporary-merging optimization pass
The compilers assume they can generate temporary variables
as needed to preserve the right semantics or simplify code
generation and the back end will still generate good code.
This turns out not to be true. The back ends will only
track the first 128 variables per function and give up
on the remainder. That needs to be fixed too, in a later CL.

This CL merges temporary variables with equal types and
non-overlapping lifetimes using the greedy algorithm in
Poletto and Sarkar, "Linear Scan Register Allocation",
ACM TOPLAS 1999.

The result can be striking in the right functions.

Top 20 frame size changes in a 6g godoc binary by bytes saved:

5464 1984 (-3480, -63.7%) go/build.(*Context).Import
4456 1824 (-2632, -59.1%) go/printer.(*printer).expr1
2560   80 (-2480, -96.9%) time.nextStdChunk
3496 1608 (-1888, -54.0%) go/printer.(*printer).stmt
1896  272 (-1624, -85.7%) net/http.init
2688 1400 (-1288, -47.9%) fmt.(*pp).printReflectValue
2800 1512 (-1288, -46.0%) main.main
3296 2016 (-1280, -38.8%) crypto/tls.(*Conn).clientHandshake
1664  488 (-1176, -70.7%) time.loadZoneZip
1760  608 (-1152, -65.5%) time.parse
4104 3072 (-1032, -25.1%) runtime/pprof.writeHeap
1680  712 ( -968, -57.6%) go/ast.Walk
2488 1560 ( -928, -37.3%) crypto/x509.parseCertificate
1128  392 ( -736, -65.2%) math/big.nat.divLarge
1528  864 ( -664, -43.5%) go/printer.(*printer).fieldList
1360  712 ( -648, -47.6%) regexp/syntax.(*parser).factor
2104 1528 ( -576, -27.4%) encoding/asn1.parseField
1064  504 ( -560, -52.6%) encoding/xml.(*Decoder).text
 584   48 ( -536, -91.8%) html.init
1400  864 ( -536, -38.3%) go/doc.playExample

In the same godoc build, cuts the number of functions with
too many vars from 83 to 32.

R=ken2
CC=golang-dev
https://golang.org/cl/12829043
2013-08-13 00:09:31 -04:00
Russ Cox
dbf96addfb cmd/gc: move flow graph into portable opt
Now there's only one copy of the flow graph construction
and dominator computation, and different optimizations
can attach different annotations to the instructions.

R=ken2
CC=golang-dev
https://golang.org/cl/12797045
2013-08-12 22:02:10 -04:00
Russ Cox
b3b87143f2 cmd/gc: support for "portable" optimization logic
Code in gc/popt.c is compiled as part of 5g, 6g, and 8g,
meaning it can use arch-specific headers but there's
just one copy of the code.

This is the same arrangement we use for the portable
code generation logic in gc/pgen.c.

Move fixjmp and noreturn there to get the ball rolling.

R=ken2
CC=golang-dev
https://golang.org/cl/12789043
2013-08-12 19:14:02 -04:00
Russ Cox
a07218385c cmd/5g: factor out prog information
Like CL 12637051, but for 5g instead of 6g.

R=ken2
CC=golang-dev
https://golang.org/cl/12779043
2013-08-12 13:42:23 -04:00
Russ Cox
7910cd68b5 cmd/gc: zero pointers on entry to function
On entry to a function, zero the results and zero the pointer
section of the local variables.

This is an intermediate step on the way to precise collection
of Go frames.

This can incur a significant (up to 30%) slowdown, but it also ensures
that the garbage collector never looks at a word in a Go frame
and sees a stale pointer value that could cause a space leak.
(C frames and assembly frames are still possibly problematic.)

This CL is required to start making collection of interface values
as precise as collection of pointer values are today.
Since we have to dereference the interface type to understand
whether the value is a pointer, it is critical that the type field be
initialized.

A future CL by Carl will make the garbage collection pointer
bitmaps context-sensitive. At that point it will be possible to
remove most of the zeroing. The only values that will still need
zeroing are values whose addresses escape the block scoping
of the function but do not escape to the heap.

benchmark                         old ns/op    new ns/op    delta
BenchmarkBinaryTree17            4420289180   4331060459   -2.02%
BenchmarkFannkuch11              3442469663   3277706251   -4.79%
BenchmarkFmtFprintfEmpty                100          142  +42.00%
BenchmarkFmtFprintfString               262          310  +18.32%
BenchmarkFmtFprintfInt                  213          281  +31.92%
BenchmarkFmtFprintfIntInt               355          431  +21.41%
BenchmarkFmtFprintfPrefixedInt          321          383  +19.31%
BenchmarkFmtFprintfFloat                444          533  +20.05%
BenchmarkFmtManyArgs                   1380         1559  +12.97%
BenchmarkGobDecode                 10240054     11794915  +15.18%
BenchmarkGobEncode                 17350274     19970478  +15.10%
BenchmarkGzip                     455179460    460699139   +1.21%
BenchmarkGunzip                   114271814    119291574   +4.39%
BenchmarkHTTPClientServer             89051        89894   +0.95%
BenchmarkJSONEncode                40486799     52691558  +30.15%
BenchmarkJSONDecode                94193361    112428781  +19.36%
BenchmarkMandelbrot200              4747060      4748043   +0.02%
BenchmarkGoParse                    6363798      6675098   +4.89%
BenchmarkRegexpMatchEasy0_32            129          171  +32.56%
BenchmarkRegexpMatchEasy0_1K            365          395   +8.22%
BenchmarkRegexpMatchEasy1_32            106          152  +43.40%
BenchmarkRegexpMatchEasy1_1K            952         1245  +30.78%
BenchmarkRegexpMatchMedium_32           198          283  +42.93%
BenchmarkRegexpMatchMedium_1K         79006       101097  +27.96%
BenchmarkRegexpMatchHard_32            3478         5115  +47.07%
BenchmarkRegexpMatchHard_1K          110245       163582  +48.38%
BenchmarkRevcomp                  777384355    793270857   +2.04%
BenchmarkTemplate                 136713089    157093609  +14.91%
BenchmarkTimeParse                     1511         1761  +16.55%
BenchmarkTimeFormat                     535          850  +58.88%

benchmark                          old MB/s     new MB/s  speedup
BenchmarkGobDecode                    74.95        65.07    0.87x
BenchmarkGobEncode                    44.24        38.43    0.87x
BenchmarkGzip                         42.63        42.12    0.99x
BenchmarkGunzip                      169.81       162.67    0.96x
BenchmarkJSONEncode                   47.93        36.83    0.77x
BenchmarkJSONDecode                   20.60        17.26    0.84x
BenchmarkGoParse                       9.10         8.68    0.95x
BenchmarkRegexpMatchEasy0_32         247.24       186.31    0.75x
BenchmarkRegexpMatchEasy0_1K        2799.20      2591.93    0.93x
BenchmarkRegexpMatchEasy1_32         299.31       210.44    0.70x
BenchmarkRegexpMatchEasy1_1K        1074.71       822.45    0.77x
BenchmarkRegexpMatchMedium_32          5.04         3.53    0.70x
BenchmarkRegexpMatchMedium_1K         12.96        10.13    0.78x
BenchmarkRegexpMatchHard_32            9.20         6.26    0.68x
BenchmarkRegexpMatchHard_1K            9.29         6.26    0.67x
BenchmarkRevcomp                     326.95       320.40    0.98x
BenchmarkTemplate                     14.19        12.35    0.87x

R=cshapiro
CC=golang-dev
https://golang.org/cl/12616045
2013-08-09 23:10:58 -04:00
Rémy Oudompheng
357f733695 cmd/5c, cmd/5g, cmd/5l: turn MOVB, MOVH into plain moves, optimize short arithmetic.
Pseudo-instructions MOVBS and MOVHS are used to clarify
the semantics of short integers vs. registers:
 * 8-bit and 16-bit values in registers are assumed to always
   be zero-extended or sign-extended depending on their type.
 * MOVB is truncation or move of an already extended value
   between registers.
 * MOVBU enforces zero-extension at the destination (register).
 * MOVBS enforces sign-extension at the destination (register).
And similarly for MOVH/MOVS/MOVHU.

The linker is adapted to assemble MOVB and MOVH to an ordinary
mov. Also a peephole pass in 5g that aims at eliminating
redundant zero/sign extensions is improved.

encoding/binary:
benchmark                              old ns/op    new ns/op    delta
BenchmarkReadSlice1000Int32s              220387       217185   -1.45%
BenchmarkReadStruct                        12839        12910   +0.55%
BenchmarkReadInts                           5692         5534   -2.78%
BenchmarkWriteInts                          6137         6016   -1.97%
BenchmarkPutUvarint32                        257          241   -6.23%
BenchmarkPutUvarint64                        812          754   -7.14%
benchmark                               old MB/s     new MB/s  speedup
BenchmarkReadSlice1000Int32s               18.15        18.42    1.01x
BenchmarkReadStruct                         5.45         5.42    0.99x
BenchmarkReadInts                           5.27         5.42    1.03x
BenchmarkWriteInts                          4.89         4.99    1.02x
BenchmarkPutUvarint32                      15.56        16.57    1.06x
BenchmarkPutUvarint64                       9.85        10.60    1.08x

crypto/des:
benchmark                              old ns/op    new ns/op    delta
BenchmarkEncrypt                            7002         5169  -26.18%
BenchmarkDecrypt                            7015         5195  -25.94%
benchmark                               old MB/s     new MB/s  speedup
BenchmarkEncrypt                            1.14         1.55    1.36x
BenchmarkDecrypt                            1.14         1.54    1.35x

strconv:
benchmark                              old ns/op    new ns/op    delta
BenchmarkAtof64Decimal                       457          385  -15.75%
BenchmarkAtof64Float                         574          479  -16.55%
BenchmarkAtof64FloatExp                     1035          906  -12.46%
BenchmarkAtof64Big                          1793         1457  -18.74%
BenchmarkAtof64RandomBits                   2267         2066   -8.87%
BenchmarkAtof64RandomFloats                 1416         1194  -15.68%
BenchmarkAtof32Decimal                       451          379  -15.96%
BenchmarkAtof32Float                         547          435  -20.48%
BenchmarkAtof32FloatExp                     1095          986   -9.95%
BenchmarkAtof32Random                       1154         1006  -12.82%
BenchmarkAtoi                               1415         1380   -2.47%
BenchmarkAtoiNeg                            1414         1401   -0.92%
BenchmarkAtoi64                             1744         1671   -4.19%
BenchmarkAtoi64Neg                          1737         1662   -4.32%

Fixes #1837.

R=rsc, dave, bradfitz
CC=golang-dev
https://golang.org/cl/12424043
2013-08-09 06:43:17 +02:00
Rémy Oudompheng
3670337097 cmd/5c, cmd/5g, cmd/5l: introduce MOVBS and MOVHS instructions.
MOVBS and MOVHS are defined as duplicates of MOVB and MOVH,
and perform sign-extension moving.
No change is made to code generation.

Update #1837

R=rsc, bradfitz
CC=golang-dev
https://golang.org/cl/12682043
2013-08-08 23:28:53 +02:00
Dmitriy Vyukov
8679d5f2b5 cmd/gc: record argument size for all indirect function calls
This is required to properly unwind reflect.methodValueCall/makeFuncStub.
Fixes #5954.
Stats for 'go install std':
61849 total INSTCALL
24655 currently have ArgSize metadata
27278 have ArgSize metadata with this change
godoc size before: 11351888, after: 11364288

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12163043
2013-07-31 20:00:33 +04:00
Russ Cox
48769bf546 runtime: use funcdata to supply garbage collection information
This CL introduces a FUNCDATA number for runtime-specific
garbage collection metadata, changes the C and Go compilers
to emit that metadata, and changes the runtime to expect it.

The old pseudo-instructions that carried this information
are gone, as is the linker code to process them.

R=golang-dev, dvyukov, cshapiro
CC=golang-dev
https://golang.org/cl/11406044
2013-07-19 16:04:09 -04:00
Russ Cox
7b3c8b7ac8 cmd/5g, cmd/6g, cmd/8g: insert arg size annotations on runtime calls
If calling a function in package runtime, emit argument size
information around the call in case the call is to a variadic C function.

R=ken2
CC=golang-dev
https://golang.org/cl/11371043
2013-07-16 16:25:10 -04:00
Russ Cox
7e97d39879 cmd/5g, cmd/6g, cmd/8g: fix line number of caller of deferred func
Deferred functions are not run by a call instruction. They are run by
the runtime editing registers to make the call start with a caller PC
returning to a
        CALL deferreturn
instruction.

That instruction has always had the line number of the function's
closing brace, but that instruction's line number is irrelevant.
Stack traces show the line number of the instruction before the
return PC, because normally that's what started the call. Not so here.
The instruction before the CALL deferreturn could be almost anywhere
in the function; it's unrelated and its line number is incorrect to show.

Fix the line number by inserting a true hardware no-op with the right
line number before the returned-to CALL instruction. That is, the deferred
calls now appear to start with a caller PC returning to the second instruction
in this sequence:
        NOP
        CALL deferreturn

The traceback will show the line number of the NOP, which we've set
to be the line number of the function's closing brace.

The NOP here is not the usual pseudo-instruction, which would be
elided by the linker. Instead it is the real hardware instruction:
XCHG AX, AX on 386 and amd64, and AND.EQ R0, R0, R0 on ARM.

Fixes #5856.

R=ken2, ken
CC=golang-dev
https://golang.org/cl/11223043
2013-07-12 13:47:55 -04:00
Russ Cox
b4e92cee97 cmd/gc: support x[i:j:k]
Design doc at golang.org/s/go12slice.
This is an experimental feature and may not be included in the release.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/10743046
2013-07-01 20:32:36 -04:00
Russ Cox
0713293374 cmd/5g, cmd/6g, cmd/8g: fix comment
Keeping the string "compactframe" because that's what
I always search for to find this code. But point to the real place too.

TBR=iant
CC=golang-dev
https://golang.org/cl/10676047
2013-06-28 12:06:25 -07:00
Russ Cox
6fa3c89b77 runtime: record proper goroutine state during stack split
Until now, the goroutine state has been scattered during the
execution of newstack and oldstack. It's all there, and those routines
know how to get back to a working goroutine, but other pieces of
the system, like stack traces, do not. If something does interrupt
the newstack or oldstack execution, the rest of the system can't
understand the goroutine. For example, if newstack decides there
is an overflow and calls throw, the stack tracer wouldn't dump the
goroutine correctly.

For newstack to save a useful state snapshot, it needs to be able
to rewind the PC in the function that triggered the split back to
the beginning of the function. (The PC is a few instructions in, just
after the call to morestack.) To make that possible, we change the
prologues to insert a jmp back to the beginning of the function
after the call to morestack. That is, the prologue used to be roughly:

        TEXT myfunc
                check for split
                jmpcond nosplit
                call morestack
        nosplit:
                sub $xxx, sp

Now an extra instruction is inserted after the call:

        TEXT myfunc
        start:
                check for split
                jmpcond nosplit
                call morestack
                jmp start
        nosplit:
                sub $xxx, sp

The jmp is not executed directly. It is decoded and simulated by
runtime.rewindmorestack to discover the beginning of the function,
and then the call to morestack returns directly to the start label
instead of to the jump instruction. So logically the jmp is still
executed, just not by the cpu.

The prologue thus repeats in the case of a function that needs a
stack split, but against the cost of the split itself, the extra few
instructions are noise. The repeated prologue has the nice effect of
making a stack split double-check that the new stack is big enough:
if morestack happens to return on a too-small stack, we'll now notice
before corruption happens.

The ability for newstack to rewind to the beginning of the function
should help preemption too. If newstack decides that it was called
for preemption instead of a stack split, it now has the goroutine state
correctly paused if rescheduling is needed, and when the goroutine
can run again, it can return to the start label on its original stack
and re-execute the split check.

Here is an example of a split stack overflow showing the full
trace, without any special cases in the stack printer.
(This one was triggered by making the split check incorrect.)

runtime: newstack framesize=0x0 argsize=0x18 sp=0x6aebd0 stack=[0x6b0000, 0x6b0fa0]
        morebuf={pc:0x69f5b sp:0x6aebd8 lr:0x0}
        sched={pc:0x68880 sp:0x6aebd0 lr:0x0 ctxt:0x34e700}
runtime: split stack overflow: 0x6aebd0 < 0x6b0000
fatal error: runtime: split stack overflow

goroutine 1 [stack split]:
runtime.mallocgc(0x290, 0x100000000, 0x1)
        /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:21 fp=0x6aebd8
runtime.new()
        /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:682 +0x5b fp=0x6aec08
go/build.(*Context).Import(0x5ae340, 0xc210030c71, 0xa, 0xc2100b4380, 0x1b, ...)
        /Users/rsc/g/go/src/pkg/go/build/build.go:424 +0x3a fp=0x6b00a0
main.loadImport(0xc210030c71, 0xa, 0xc2100b4380, 0x1b, 0xc2100b42c0, ...)
        /Users/rsc/g/go/src/cmd/go/pkg.go:249 +0x371 fp=0x6b01a8
main.(*Package).load(0xc21017c800, 0xc2100b42c0, 0xc2101828c0, 0x0, 0x0, ...)
        /Users/rsc/g/go/src/cmd/go/pkg.go:431 +0x2801 fp=0x6b0c98
main.loadPackage(0x369040, 0x7, 0xc2100b42c0, 0x0)
        /Users/rsc/g/go/src/cmd/go/pkg.go:709 +0x857 fp=0x6b0f80
----- stack segment boundary -----
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc2100e6c00, 0xc2100e5750, ...)
        /Users/rsc/g/go/src/cmd/go/build.go:539 +0x437 fp=0x6b14a0
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc21015b400, 0x2, ...)
        /Users/rsc/g/go/src/cmd/go/build.go:528 +0x1d2 fp=0x6b1658
main.(*builder).test(0xc2100902a0, 0xc210092000, 0x0, 0x0, 0xc21008ff60, ...)
        /Users/rsc/g/go/src/cmd/go/test.go:622 +0x1b53 fp=0x6b1f68
----- stack segment boundary -----
main.runTest(0x5a6b20, 0xc21000a020, 0x2, 0x2)
        /Users/rsc/g/go/src/cmd/go/test.go:366 +0xd09 fp=0x6a5cf0
main.main()
        /Users/rsc/g/go/src/cmd/go/main.go:161 +0x4f9 fp=0x6a5f78
runtime.main()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:183 +0x92 fp=0x6a5fa0
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1266 fp=0x6a5fa8

And here is a seg fault during oldstack:

SIGSEGV: segmentation violation
PC=0x1b2a6

runtime.oldstack()
        /Users/rsc/g/go/src/pkg/runtime/stack.c:159 +0x76
runtime.lessstack()
        /Users/rsc/g/go/src/pkg/runtime/asm_amd64.s:270 +0x22

goroutine 1 [stack unsplit]:
fmt.(*pp).printArg(0x2102e64e0, 0xe5c80, 0x2102c9220, 0x73, 0x0, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:818 +0x3d3 fp=0x221031e6f8
fmt.(*pp).doPrintf(0x2102e64e0, 0x12fb20, 0x2, 0x221031eb98, 0x1, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:1183 +0x15cb fp=0x221031eaf0
fmt.Sprintf(0x12fb20, 0x2, 0x221031eb98, 0x1, 0x1, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:234 +0x67 fp=0x221031eb40
flag.(*stringValue).String(0x2102c9210, 0x1, 0x0)
        /Users/rsc/g/go/src/pkg/flag/flag.go:180 +0xb3 fp=0x221031ebb0
flag.(*FlagSet).Var(0x2102f6000, 0x293d38, 0x2102c9210, 0x143490, 0xa, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:633 +0x40 fp=0x221031eca0
flag.(*FlagSet).StringVar(0x2102f6000, 0x2102c9210, 0x143490, 0xa, 0x12fa60, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:550 +0x91 fp=0x221031ece8
flag.(*FlagSet).String(0x2102f6000, 0x143490, 0xa, 0x12fa60, 0x0, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:563 +0x87 fp=0x221031ed38
flag.String(0x143490, 0xa, 0x12fa60, 0x0, 0x161950, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:570 +0x6b fp=0x221031ed80
testing.init()
        /Users/rsc/g/go/src/pkg/testing/testing.go:-531 +0xbb fp=0x221031edc0
strings_test.init()
        /Users/rsc/g/go/src/pkg/strings/strings_test.go:1115 +0x62 fp=0x221031ef70
main.init()
        strings/_test/_testmain.go:90 +0x3d fp=0x221031ef78
runtime.main()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:180 +0x8a fp=0x221031efa0
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1269 fp=0x221031efa8

goroutine 2 [runnable]:
runtime.MHeap_Scavenger()
        /Users/rsc/g/go/src/pkg/runtime/mheap.c:438
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1269
created by runtime.main
        /Users/rsc/g/go/src/pkg/runtime/proc.c:166

rax     0x23ccc0
rbx     0x23ccc0
rcx     0x0
rdx     0x38
rdi     0x2102c0170
rsi     0x221032cfe0
rbp     0x221032cfa0
rsp     0x7fff5fbff5b0
r8      0x2102c0120
r9      0x221032cfa0
r10     0x221032c000
r11     0x104ce8
r12     0xe5c80
r13     0x1be82baac718
r14     0x13091135f7d69200
r15     0x0
rip     0x1b2a6
rflags  0x10246
cs      0x2b
fs      0x0
gs      0x0

Fixes #5723.

R=r, dvyukov, go.peter.90, dave, iant
CC=golang-dev
https://golang.org/cl/10360048
2013-06-27 11:32:01 -04:00
Russ Cox
1f51d27922 cmd/gc: move genembedtramp into portable code
Requires adding new linker instruction
        RET	f(SB)
meaning return but then immediately call f.
This is what you'd use to implement a tail call after
fiddling with the arguments, but the compiler only
uses it in genwrapper.

This CL eliminates the copy-and-paste genembedtramp
functions from 5g/8g/6g and makes the code run on ARM
for the first time. It removes a small special case for function
generation, which should help Carl a bit, but at the same time
it does not bother to implement general tail call optimization,
which we do not want anyway.

Fixes #5627.

R=ken2
CC=golang-dev
https://golang.org/cl/10057044
2013-06-11 09:41:49 -04:00
Shenghou Ma
faef52c214 all: fix typos
R=golang-dev, bradfitz, khr, r
CC=golang-dev
https://golang.org/cl/7461046
2013-06-09 21:50:24 +08:00
Carl Shapiro
acd887ba57 cmd/5g, cmd/6g, cmd/8g: remove prototypes for proglist
Each of the backends has two prototypes for this function but
no corresponding definition.

R=golang-dev, bradfitz, khr
CC=golang-dev
https://golang.org/cl/9930045
2013-06-04 16:22:59 -07:00
Carl Shapiro
31be5deae4 cmd/5g, cmd/6g, cmd/8g: provide embedded trampolines with argument size information
An embedded trampoline is a function that exists to marshal
a receiver of type *S to a receiver of type *T when T is an
embedded field in S.

Embedded trampolines are generated by a special path through
the compiler and are not subject to the general analysis and
annotation done to functions.  Their effects must be provided
explicitly.

R=golang-dev, r, daniel.morsing, minux.ma
CC=golang-dev
https://golang.org/cl/9874043
2013-05-31 13:34:57 -07:00
Carl Shapiro
4e0a51c210 cmd/5l, cmd/6l, cmd/8l, cmd/gc, runtime: generate and use bitmaps of argument pointer locations
With this change the compiler emits a bitmap for each function
covering its stack frame arguments area.  If an argument word
is known to contain a pointer, a bit is set.  The garbage
collector reads this information when scanning the stack by
frames and uses it to ignores locations known to not contain a
pointer.

R=golang-dev, bradfitz, daniel.morsing, dvyukov, khr, khr, iant, cshapiro
CC=golang-dev
https://golang.org/cl/9223046
2013-05-28 17:59:10 -07:00
Rob Pike
4dcb13bb44 cmd/gc: fix some overflows in the compiler
Some 64-bit fields were run through 32-bit words, some counts were
not checked for overflow, and relocations must fit in 32 bits.
Tests to follow.

R=golang-dev, dsymonds
CC=golang-dev
https://golang.org/cl/9033043
2013-04-29 22:44:40 -07:00
Ian Lance Taylor
578dc3a96c cmd/5g, cmd/6g, cmd/8g: more nil ptr to large struct checks
R=r, ken, khr, daniel.morsing
CC=dsymonds, golang-dev, rickyz
https://golang.org/cl/8925043
2013-04-24 08:13:01 -07:00
Russ Cox
d3c758d7d2 cmd/gc: implement method values
R=ken2, ken
CC=golang-dev
https://golang.org/cl/7546052
2013-03-20 17:11:09 -04:00
Shenghou Ma
8cdee79063 libmach, cmd/5a, cmd/5c, cmd/5g, cmd/5l: enable DWARF type info for Linux/ARM
Fixes #3747.

Update #4912
This CL adds gotype into .5 object file.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/7376054
2013-02-26 06:15:29 +08:00
Russ Cox
aa3efb28f0 cmd/gc: can stop tracking gotype in regopt
Now that the type information is in TYPE instructions
that are not rewritten by the optimization passes,
we don't have to try to preserve the type information
(no longer) attached to MOV instructions.

R=ken2
CC=golang-dev
https://golang.org/cl/7402054
2013-02-25 16:11:34 -05:00
Russ Cox
f5dce6c853 cmd/5g: fix arm build
R=ken2
CC=golang-dev
https://golang.org/cl/7365057
2013-02-25 12:21:12 -05:00
Russ Cox
1d5dc4fd48 cmd/gc: emit explicit type information for local variables
The type information is (and for years has been) included
as an extra field in the address chunk of an instruction.
Unfortunately, suppose there is a string at a+24(FP) and
we have an instruction reading its length. It will say:

        MOVQ x+32(FP), AX

and the type of *that* argument is int (not slice), because
it is the length being read. This confuses the picture seen
by debuggers and now, worse, by the garbage collector.

Instead of attaching the type information to all uses,
emit an explicit list of TYPE instructions with the information.
The TYPE instructions are no-ops whose only role is to
provide an address to attach type information to.

For example, this function:

        func f(x, y, z int) (a, b string) {
                return
        }

now compiles into:

        --- prog list "f" ---
        0000 (/Users/rsc/x.go:3) TEXT    f+0(SB),$0-56
        0001 (/Users/rsc/x.go:3) LOCALS  ,
        0002 (/Users/rsc/x.go:3) TYPE    x+0(FP){int},$8
        0003 (/Users/rsc/x.go:3) TYPE    y+8(FP){int},$8
        0004 (/Users/rsc/x.go:3) TYPE    z+16(FP){int},$8
        0005 (/Users/rsc/x.go:3) TYPE    a+24(FP){string},$16
        0006 (/Users/rsc/x.go:3) TYPE    b+40(FP){string},$16
        0007 (/Users/rsc/x.go:3) MOVQ    $0,b+40(FP)
        0008 (/Users/rsc/x.go:3) MOVQ    $0,b+48(FP)
        0009 (/Users/rsc/x.go:3) MOVQ    $0,a+24(FP)
        0010 (/Users/rsc/x.go:3) MOVQ    $0,a+32(FP)
        0011 (/Users/rsc/x.go:4) RET     ,

The { } show the formerly hidden type information.
The { } syntax is used when printing from within the gc compiler.
It is not accepted by the assemblers.

The same type information is now included on global variables:

0055 (/Users/rsc/x.go:15) GLOBL   slice+0(SB){[]string},$24(AL*0)

This more accurate type information fixes a bug in the
garbage collector's precise heap collection.

The linker only cares about globals right now, but having the
local information should make things a little nicer for Carl
in the future.

Fixes #4907.

R=ken2
CC=golang-dev
https://golang.org/cl/7395056
2013-02-25 12:13:47 -05:00
Russ Cox
9f647288ef cmd/gc: avoid runtime code generation for closures
Change ARM context register to R7, to get out of the way
of the register allocator during the compilation of the
prologue statements (it wants to use R0 as a temporary).

Step 2 of http://golang.org/s/go11func.

R=ken2
CC=golang-dev
https://golang.org/cl/7369048
2013-02-22 14:25:50 -05:00
Russ Cox
1903ad7189 cmd/gc, reflect, runtime: switch to indirect func value representation
Step 1 of http://golang.org/s/go11func.

R=golang-dev, r, daniel.morsing, remyoudompheng
CC=golang-dev
https://golang.org/cl/7393045
2013-02-21 17:01:13 -05:00
Carl Shapiro
f466617a62 cmd/5g, cmd/5l, cmd/6l, cmd/8l, cmd/gc, cmd/ld, runtime: accurate args and locals information
Previously, the func structure contained an inaccurate value for
the args member and a 0 value for the locals member.

This change populates the func structure with args and locals
values computed by the compiler.  The number of args was
already available in the ATEXT instruction.  The number of
locals is now passed through in the new ALOCALS instruction.

This change also switches the unit of args and locals to be
bytes, just like the frame member, instead of 32-bit words.

R=golang-dev, bradfitz, cshapiro, dave, rsc
CC=golang-dev
https://golang.org/cl/7399045
2013-02-21 12:52:26 -08:00
Russ Cox
0cdc0b3b78 cmd/5g, cmd/6g: fix node dump formats
lvd changed the old %N to %+hN and these
never got updated.

R=ken2
CC=golang-dev
https://golang.org/cl/7391045
2013-02-21 12:53:25 -05:00
Robert Griesemer
3ee87d02b0 cmd/godoc: use go/build to determine package and example files
Also:
- faster code for example extraction
- simplify handling of command documentation:
  all "main" packages are treated as commands
- various minor cleanups along the way

For commands written in Go, any doc.go file containing
documentation must now be part of package main (rather
then package documentation), otherwise the documentation
won't show up in godoc (it will still build, though).

For commands written in C, documentation may still be
in doc.go files defining package documentation, but the
recommended way is to explicitly ignore those files with
a +build ignore constraint to define package main.

Fixes #4806.

R=adg, rsc, dave, bradfitz
CC=golang-dev
https://golang.org/cl/7333046
2013-02-19 11:19:58 -08:00
Rémy Oudompheng
f42fa807a6 cmd/5g: add missing splitclean.
See issue 887 for the 8g analogue.

R=golang-dev, minux.ma
CC=golang-dev
https://golang.org/cl/7306069
2013-02-08 08:19:47 +01:00
Shenghou Ma
255fb521da cmd/dist: add -Wstrict-prototypes to CFLAGS and fix all the compiler errors
Plan 9 compilers insist this but as we don't have Plan 9
builders, we'd better let gcc check the prototypes.

Inspired by CL 7289050.

R=golang-dev, seed, dave, rsc, lucio.dere
CC=akumar, golang-dev
https://golang.org/cl/7288056
2013-02-05 21:43:04 +08:00