1
0
mirror of https://github.com/golang/go synced 2024-10-05 14:01:21 -06:00
Commit Graph

318 Commits

Author SHA1 Message Date
Josh Bleecher Snyder
43c87aa481 cmd/6g, cmd/8g, liblink: improve handling of float constants
* Enable basic constant propagation for floats.
  The constant propagation is still not as aggressive as it could be.
* Implement MOVSS $(0), Xx and MOVSD $(0), Xx as XORPS Xx, Xx.

Sample code:

func f32() float32 {
	var f float32
	return f
}

func f64() float64 {
	var f float64
	return f
}

Before:

"".f32 t=1 size=32 value=0 args=0x8 locals=0x0
	0x0000 00000 (demo.go:3)	TEXT	"".f32+0(SB),4,$0-8
	0x0000 00000 (demo.go:3)	FUNCDATA	$0,gclocals·a7a3692b8e27e823add69ec4239ba55f+0(SB)
	0x0000 00000 (demo.go:3)	FUNCDATA	$1,gclocals·3280bececceccd33cb74587feedb1f9f+0(SB)
	0x0000 00000 (demo.go:3)	MOVSS	$f32.00000000+0(SB),X0
	0x0008 00008 (demo.go:4)	MOVSS	$f32.00000000+0(SB),X0
	0x0010 00016 (demo.go:5)	MOVSS	X0,"".~r0+8(FP)
	0x0016 00022 (demo.go:5)	RET	,
"".f64 t=1 size=32 value=0 args=0x8 locals=0x0
	0x0000 00000 (demo.go:8)	TEXT	"".f64+0(SB),4,$0-8
	0x0000 00000 (demo.go:8)	FUNCDATA	$0,gclocals·a7a3692b8e27e823add69ec4239ba55f+0(SB)
	0x0000 00000 (demo.go:8)	FUNCDATA	$1,gclocals·3280bececceccd33cb74587feedb1f9f+0(SB)
	0x0000 00000 (demo.go:8)	MOVSD	$f64.0000000000000000+0(SB),X0
	0x0008 00008 (demo.go:9)	MOVSD	$f64.0000000000000000+0(SB),X0
	0x0010 00016 (demo.go:10)	MOVSD	X0,"".~r0+8(FP)
	0x0016 00022 (demo.go:10)	RET	,

After:

"".f32 t=1 size=16 value=0 args=0x8 locals=0x0
	0x0000 00000 (demo.go:3)	TEXT	"".f32+0(SB),4,$0-8
	0x0000 00000 (demo.go:3)	FUNCDATA	$0,gclocals·a7a3692b8e27e823add69ec4239ba55f+0(SB)
	0x0000 00000 (demo.go:3)	FUNCDATA	$1,gclocals·3280bececceccd33cb74587feedb1f9f+0(SB)
	0x0000 00000 (demo.go:3)	XORPS	X0,X0
	0x0003 00003 (demo.go:5)	MOVSS	X0,"".~r0+8(FP)
	0x0009 00009 (demo.go:5)	RET	,
"".f64 t=1 size=16 value=0 args=0x8 locals=0x0
	0x0000 00000 (demo.go:8)	TEXT	"".f64+0(SB),4,$0-8
	0x0000 00000 (demo.go:8)	FUNCDATA	$0,gclocals·a7a3692b8e27e823add69ec4239ba55f+0(SB)
	0x0000 00000 (demo.go:8)	FUNCDATA	$1,gclocals·3280bececceccd33cb74587feedb1f9f+0(SB)
	0x0000 00000 (demo.go:8)	XORPS	X0,X0
	0x0003 00003 (demo.go:10)	MOVSD	X0,"".~r0+8(FP)
	0x0009 00009 (demo.go:10)	RET	,

Change-Id: Ie9eb65e324af4f664153d0a7cd22bb16b0fba16d
Reviewed-on: https://go-review.googlesource.com/2053
Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-07 22:26:55 +00:00
Keith Randall
53c5226f9f runtime: make stack frames fixed size by modifying goproc/deferproc.
Calls to goproc/deferproc used to push & pop two extra arguments,
the argument size and the function to call.  Now, we allocate space
for those arguments in the outargs section so we don't have to
modify the SP.

Defers now use the stack pointer (instead of the argument pointer)
to identify which frame they are associated with.

A followon CL might simplify funcspdelta and some of the stack
walking code.

Fixes issue #8641

Change-Id: I835ec2f42f0392c5dec7cb0fe6bba6f2aed1dad8
Reviewed-on: https://go-review.googlesource.com/1601
Reviewed-by: Russ Cox <rsc@golang.org>
2014-12-23 01:08:29 +00:00
David du Colombier
213a6645ce [dev.cc] cmd/8g: fix warning on Plan 9
warning: /usr/go/src/cmd/8g/reg.c:365 format mismatch d VLONG, arg 5

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/177160043
2014-11-21 20:44:04 +01:00
Russ Cox
754de8d403 [dev.cc] all: merge dev.power64 (f57928630b36) into dev.cc
This will be the last dev.power64 merge; we'll finish on dev.cc.

TBR=austin
CC=golang-codereviews
https://golang.org/cl/175420043
2014-11-20 11:30:43 -05:00
David du Colombier
7aa89ea798 [dev.cc] cmd/8g: work around "out of fixed registers" on Plan 9
This change works around the "out of fixed registers"
issue with the Plan 9 C compiler on 386, introduced by
the Bits change to uint64 in CL 169060043.

The purpose of this CL is to be able to properly
follow the conversion of the Plan 9 runtime to Go
on the Plan 9 builders.

This CL could be reverted once the Go compilers will
be converted to Go.

Thanks to Nick Owens for investigating this issue.

LGTM=rsc
R=rsc
CC=austin, golang-codereviews, mischief
https://golang.org/cl/177860043
2014-11-16 22:55:07 +01:00
Austin Clements
5b38501a4f [dev.power64] 5g,6g,8g,9g: debug prints for regopt pass 6 and paint2
Theses were very helpful in understanding the regions and
register selection when porting regopt to 9g.  Add them to the
other compilers (and improve 9g's successor debug print).

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/174130043
2014-11-14 11:56:31 -05:00
Austin Clements
9e7bed88cd [dev.power64] 5g,6g,8g: synchronize documentation for regopt structures
I added several comments to the regopt-related structures when
porting it to 9g.  Synchronize those comments back in to the
other compilers.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/175720043
2014-11-14 11:07:33 -05:00
Austin Clements
c3dadb3d19 [dev.power64] 6g,8g: remove unnecessary and incorrect reg use scanning
Previously, the 6g and 8g registerizers scanned for used
registers beyond the end of a region being considered for
registerization.  This ancient artifact was copied from the C
compilers, where it was probably necessary to track implicitly
used registers.  In the Go compilers it's harmless (because it
can only over-restrict the set of available registers), but no
longer necessary because the Go compilers correctly track
register use/set information.  The consequences of this extra
scan were (at least) that 1) we would not consider allocating
the AX register if there was a deferproc call in the future
because deferproc uses AX as a return register, so we see the
use of AX, but don't track that AX is set by the CALL, and 2)
we could not consider allocating the DX register if there was
a MUL in the future because MUL implicitly sets DX and (thanks
to an abuse of copyu in this code) we would also consider DX
used.

This commit fixes these problems by nuking this code.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/174110043
2014-11-13 13:34:20 -05:00
Austin Clements
f45fd5753c [dev.power64] gc: fix etype of strings
The etype of references to strings was being incorrectly set
to TINT32 on all platforms.  Change it to TSTRING.  It seems
this doesn't matter for compilation, since x86 uses LEA
instructions to load string addresses and arm and power64
disassemble the string into its constituent pieces (with the
correct types), but it helps when debugging.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/170100043
2014-11-06 14:37:39 -05:00
Austin Clements
fa32e922d5 [dev.power64] gc: convert Bits to a uint64 array
So far all of our architectures have had at most 32 registers,
so we've been able to use entry 0 in the Bits uint32 array
directly as a register mask.  Power64 has 64 registers, so
this converts Bits to a uint64 array so we can continue to use
entry 0 directly as a register mask on Power64.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/169060043
2014-11-04 16:34:56 -05:00
Russ Cox
3c40ee0fe0 cmd/gc: simplify compiled code for explicit zeroing
Among other things, *x = T{} does not need a write barrier.
The changes here avoid an unnecessary copy even when
no pointers are involved, so it may have larger effects.

In 6g and 8g, avoid manually repeated STOSQ in favor of
writing explicit MOVs, under the theory that the MOVs
should have fewer dependencies and pipeline better.

Benchmarks compare best of 5 on a 2012 MacBook Pro Core i5
with TurboBoost disabled. Most improvements can be explained
by the changes in this CL.

The effect in Revcomp is real but harder to explain: none of
the instructions in the inner loop changed. I suspect loop
alignment but really have no idea.

benchmark                       old         new         delta
BenchmarkBinaryTree17           3809027371  3819907076  +0.29%
BenchmarkFannkuch11             3607547556  3686983012  +2.20%
BenchmarkFmtFprintfEmpty        118         103         -12.71%
BenchmarkFmtFprintfString       289         277         -4.15%
BenchmarkFmtFprintfInt          304         290         -4.61%
BenchmarkFmtFprintfIntInt       507         458         -9.66%
BenchmarkFmtFprintfPrefixedInt  425         408         -4.00%
BenchmarkFmtFprintfFloat        555         555         +0.00%
BenchmarkFmtManyArgs            1835        1733        -5.56%
BenchmarkGobDecode              14738209    14639331    -0.67%
BenchmarkGobEncode              14239039    13703571    -3.76%
BenchmarkGzip                   538211054   538701315   +0.09%
BenchmarkGunzip                 135430877   134818459   -0.45%
BenchmarkHTTPClientServer       116488      116618      +0.11%
BenchmarkJSONEncode             28923406    29294334    +1.28%
BenchmarkJSONDecode             105779820   104289543   -1.41%
BenchmarkMandelbrot200          5791758     5771964     -0.34%
BenchmarkGoParse                5376642     5310943     -1.22%
BenchmarkRegexpMatchEasy0_32    195         190         -2.56%
BenchmarkRegexpMatchEasy0_1K    477         455         -4.61%
BenchmarkRegexpMatchEasy1_32    170         165         -2.94%
BenchmarkRegexpMatchEasy1_1K    1410        1394        -1.13%
BenchmarkRegexpMatchMedium_32   336         329         -2.08%
BenchmarkRegexpMatchMedium_1K   108979      106328      -2.43%
BenchmarkRegexpMatchHard_32     5854        5821        -0.56%
BenchmarkRegexpMatchHard_1K     185089      182838      -1.22%
BenchmarkRevcomp                834920364   780202624   -6.55%
BenchmarkTemplate               137046937   129728756   -5.34%
BenchmarkTimeParse              600         594         -1.00%
BenchmarkTimeFormat             559         539         -3.58%

LGTM=r
R=r
CC=golang-codereviews, iant, khr, rlh
https://golang.org/cl/157910047
2014-10-15 19:33:15 -04:00
Russ Cox
c1c5d479bd cmd/5g, cmd/8g: make 'out of registers' a fatal error
There's no point in continuing. We will only get confused.
6g already makes this fatal.

LGTM=dave, minux, iant
R=iant, dave, minux
CC=golang-codereviews
https://golang.org/cl/140660043
2014-09-16 13:16:43 -04:00
Robert Griesemer
e3d2e5550e cmd/8g: remove unused variable (fix build)
http://build.golang.org/log/0434a945e3351eedaf56aa824d2bfe9c0d5e6735

LGTM=dave
R=bradfitz, dave
CC=golang-codereviews
https://golang.org/cl/144800043
2014-09-12 16:35:40 -07:00
Russ Cox
68c1c6afa0 cmd/cc, cmd/gc: stop generating 'argsize' PCDATA
The argsize PCDATA was specifying the number of
bytes passed to a function call, so that if the function
did not specify its argument count, the garbage collector
could use the call site information to scan those bytes
conservatively. We don't do that anymore, so stop
generating the information.

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/139530043
2014-09-12 07:51:00 -04:00
Russ Cox
220a6de47e build: adjustments for move from src/pkg to src
This CL adjusts code referring to src/pkg to refer to src.

Immediately after submitting this CL, I will submit
a change doing 'hg mv src/pkg/* src'.
That change will be too large to review with Rietveld
but will contain only the 'hg mv'.

This CL will break the build.
The followup 'hg mv' will fix it.

For more about the move, see golang.org/s/go14nopkg.

LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/134570043
2014-09-08 00:06:45 -04:00
Russ Cox
613383c765 cmd/gc, runtime: treat slices and strings like pointers in garbage collection
Before, a slice with cap=0 or a string with len=0 might have its
base pointer pointing beyond the actual slice/string data into
the next block. The collector had to ignore slices and strings with
cap=0 in order to avoid misinterpreting the base pointer.

Now, a slice with cap=0 or a string with len=0 still has a base
pointer pointing into the actual slice/string data, no matter what.
The collector can now always scan the pointer, which means
strings and slices are no longer special.

Fixes #8404.

LGTM=khr, josharian
R=josharian, khr, dvyukov
CC=golang-codereviews
https://golang.org/cl/112570044
2014-08-25 14:38:19 -04:00
Josh Bleecher Snyder
be2ad1d7f5 cmd/5g, cmd/6g, cmd/8g: clear Addr node when registerizing
Update #8525

Some temporary variables that were fully registerized nevertheless had stack space allocated for them because the Addrs were still marked as having associated nodes.

Distribution of stack space reserved for temporary variables while running make.bash (6g):

BEFORE

40.89%  7026 allocauto: 0 to 0
 7.83%  1346 allocauto: 0 to 24
 7.22%  1241 allocauto: 0 to 8
 6.30%  1082 allocauto: 0 to 16
 4.96%   853 allocauto: 0 to 56
 4.59%   789 allocauto: 0 to 32
 2.97%   510 allocauto: 0 to 40
 2.32%   399 allocauto: 0 to 48
 2.10%   360 allocauto: 0 to 64
 1.91%   328 allocauto: 0 to 72

AFTER

48.49%  8332 allocauto: 0 to 0
 9.52%  1635 allocauto: 0 to 16
 5.28%   908 allocauto: 0 to 48
 4.80%   824 allocauto: 0 to 32
 4.73%   812 allocauto: 0 to 8
 3.38%   581 allocauto: 0 to 24
 2.35%   404 allocauto: 0 to 40
 2.32%   399 allocauto: 0 to 64
 1.65%   284 allocauto: 0 to 56
 1.34%   230 allocauto: 0 to 72

LGTM=rsc
R=rsc
CC=dave, dvyukov, golang-codereviews, minux
https://golang.org/cl/126160043
2014-08-24 19:07:27 -07:00
Russ Cox
527ae57e65 cmd/5g, cmd/8g: registerize small structs and arrays
cmd/6g has been doing this for a long time.

Arrays are still problematic on 5g because the addressing
for t[0] where local var t has type [3]uintptr takes the address of t.
That's issue 8125.

Fixes #8123.

LGTM=josharian
R=josharian, dave
CC=golang-codereviews
https://golang.org/cl/102890046
2014-08-24 21:16:24 -04:00
Shenghou Ma
b5674a2b72 cmd/8g: fix build
Fixes #8510.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/129720043
2014-08-11 17:11:31 -04:00
Russ Cox
5b63ce4e19 cmd/6g, cmd/8g: fix, test byte-sized magic multiply
Credit to Rémy for finding and writing test case.

Fixes #8325.

LGTM=r
R=golang-codereviews, r
CC=dave, golang-codereviews, iant, remyoudompheng
https://golang.org/cl/124950043
2014-08-11 15:24:36 -04:00
Dmitriy Vyukov
8b20e7bb7e cmd/gc: mark auxiliary symbols as containing no pointers
They do not, but pretend that they do.
The immediate need is that it breaks the new GC because
these are weird symbols as if with pointers but not necessary
pointer aligned.

LGTM=rsc
R=golang-codereviews, dave, josharian, khr, rsc
CC=golang-codereviews, iant, khr, rlh
https://golang.org/cl/116060043
2014-07-23 17:36:10 +04:00
Russ Cox
ebce79446d build: annotations and modifications for c2go
The main changes fall into a few patterns:

1. Replace #define with enum.

2. Add /*c2go */ comment giving effect of #define.
This is necessary for function-like #defines and
non-enum-able #defined constants.
(Not all compilers handle negative or large enums.)

3. Add extra braces in struct initializer.
(c2go does not implement the full rules.)

This is enough to let c2go typecheck the source tree.
There may be more changes once it is doing
other semantic analyses.

LGTM=minux, iant
R=minux, dave, iant
CC=golang-codereviews
https://golang.org/cl/106860045
2014-07-02 15:41:29 -04:00
Rémy Oudompheng
1ec56062ef cmd/8g: don't allocate a register early for cap(CHAN).
There is no reason to generate different code for cap and len.

Fixes #8025.
Fixes #8026.

LGTM=rsc
R=rsc, iant, khr
CC=golang-codereviews
https://golang.org/cl/93570044
2014-07-01 09:20:51 +02:00
Russ Cox
1afbceb599 cmd/6g: treat vardef-initialized fat variables as live at calls
This CL forces the optimizer to preserve some memory stores
that would be redundant except that a stack scan due to garbage
collection or stack copying might look at them during a function call.
As such, it forces additional memory writes and therefore slows
down the execution of some programs, especially garbage-heavy
programs that are already limited by memory bandwidth.

The slowdown can be as much as 7% for end-to-end benchmarks.

These numbers are from running go1.test -test.benchtime=5s three times,
taking the best (lowest) ns/op for each benchmark. I am excluding
benchmarks with time/op < 10us to focus on macro effects.
All benchmarks are on amd64.

Comparing tip (a27f34c771cb) against this CL on an Intel Core i5 MacBook Pro:

benchmark                          old ns/op      new ns/op      delta
BenchmarkBinaryTree17              3876500413     3856337341     -0.52%
BenchmarkFannkuch11                2965104777     2991182127     +0.88%
BenchmarkGobDecode                 8563026        8788340        +2.63%
BenchmarkGobEncode                 5050608        5267394        +4.29%
BenchmarkGzip                      431191816      434168065      +0.69%
BenchmarkGunzip                    107873523      110563792      +2.49%
BenchmarkHTTPClientServer          85036          86131          +1.29%
BenchmarkJSONEncode                22143764       22501647       +1.62%
BenchmarkJSONDecode                79646916       85658808       +7.55%
BenchmarkMandelbrot200             4720421        4700108        -0.43%
BenchmarkGoParse                   4651575        4712247        +1.30%
BenchmarkRegexpMatchMedium_1K      71986          73490          +2.09%
BenchmarkRegexpMatchHard_1K        111018         117495         +5.83%
BenchmarkRevcomp                   648798723      659352759      +1.63%
BenchmarkTemplate                  112673009      112819078      +0.13%

Comparing tip (a27f34c771cb) against this CL on an Intel Xeon E5520:

BenchmarkBinaryTree17              5461110720     5393104469     -1.25%
BenchmarkFannkuch11                4314677151     4327177615     +0.29%
BenchmarkGobDecode                 11065853       11235272       +1.53%
BenchmarkGobEncode                 6500065        6959837        +7.07%
BenchmarkGzip                      647478596      671769097      +3.75%
BenchmarkGunzip                    139348579      141096376      +1.25%
BenchmarkHTTPClientServer          69376          73610          +6.10%
BenchmarkJSONEncode                30172320       31796106       +5.38%
BenchmarkJSONDecode                113704905      114239137      +0.47%
BenchmarkMandelbrot200             6032730        6003077        -0.49%
BenchmarkGoParse                   6775251        6405995        -5.45%
BenchmarkRegexpMatchMedium_1K      111832         113895         +1.84%
BenchmarkRegexpMatchHard_1K        161112         168420         +4.54%
BenchmarkRevcomp                   876363406      892319935      +1.82%
BenchmarkTemplate                  146273096      148998339      +1.86%

Just to get a sense of where we are compared to the previous release,
here are the same benchmarks comparing Go 1.2 to this CL.

Comparing Go 1.2 against this CL on an Intel Core i5 MacBook Pro:

BenchmarkBinaryTree17              4370077662     3856337341     -11.76%
BenchmarkFannkuch11                3347052657     2991182127     -10.63%
BenchmarkGobDecode                 8791384        8788340        -0.03%
BenchmarkGobEncode                 4968759        5267394        +6.01%
BenchmarkGzip                      437815669      434168065      -0.83%
BenchmarkGunzip                    94604099       110563792      +16.87%
BenchmarkHTTPClientServer          87798          86131          -1.90%
BenchmarkJSONEncode                22818243       22501647       -1.39%
BenchmarkJSONDecode                97182444       85658808       -11.86%
BenchmarkMandelbrot200             4733516        4700108        -0.71%
BenchmarkGoParse                   5054384        4712247        -6.77%
BenchmarkRegexpMatchMedium_1K      67612          73490          +8.69%
BenchmarkRegexpMatchHard_1K        107321         117495         +9.48%
BenchmarkRevcomp                   733270055      659352759      -10.08%
BenchmarkTemplate                  109304977      112819078      +3.21%

Comparing Go 1.2 against this CL on an Intel Xeon E5520:

BenchmarkBinaryTree17              5986953594     5393104469     -9.92%
BenchmarkFannkuch11                4861139174     4327177615     -10.98%
BenchmarkGobDecode                 11830997       11235272       -5.04%
BenchmarkGobEncode                 6608722        6959837        +5.31%
BenchmarkGzip                      661875826      671769097      +1.49%
BenchmarkGunzip                    138630019      141096376      +1.78%
BenchmarkHTTPClientServer          71534          73610          +2.90%
BenchmarkJSONEncode                30393609       31796106       +4.61%
BenchmarkJSONDecode                139645860      114239137      -18.19%
BenchmarkMandelbrot200             5988660        6003077        +0.24%
BenchmarkGoParse                   6974092        6405995        -8.15%
BenchmarkRegexpMatchMedium_1K      111331         113895         +2.30%
BenchmarkRegexpMatchHard_1K        165961         168420         +1.48%
BenchmarkRevcomp                   995049292      892319935      -10.32%
BenchmarkTemplate                  145623363      148998339      +2.32%

Fixes #8036.

LGTM=khr
R=golang-codereviews, josharian, khr
CC=golang-codereviews, iant, r
https://golang.org/cl/99660044
2014-05-30 16:41:58 -04:00
Russ Cox
89d46fed2c cmd/gc: fix x=x crash
[Same as CL 102820043 except applied changes to 6g/gsubr.c
also to 5g/gsubr.c and 8g/gsubr.c. The problem I had last night
trying to do that was that 8g's copy of nodarg has different
(but equivalent) control flow and I was pasting the new code
into the wrong place.]

Description from CL 102820043:

The 'nodarg' function is used to obtain a Node*
representing a function argument or result.
It returned a brand new Node*, but that violates
the guarantee in most places in the compiler that
two Node*s refer to the same variable if and only if
they are the same Node* pointer. Reestablish that
invariant by making nodarg return a preexisting
named variable if present.

Having fixed that, avoid any copy during x=x in
componentgen, because the VARDEF we emit
before the copy marks the lhs x as dead incorrectly.

The change in walk.c avoids modifying the result
of nodarg. This was the only place in the compiler
that did so.

Fixes #8097.

LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, iant, khr, r
https://golang.org/cl/103750043
2014-05-29 13:47:31 -04:00
Russ Cox
9dd062b82e undo CL 102820043 / b0ce6dbafc18
Breaks 386 and arm builds.
The obvious reason is that this CL only edited 6g/gsubr.c
and failed to edit 5g/gsubr.c and 8g/gsubr.c.
However, the obvious CL applying the same edit to those
files (CL 101900043) causes mysterious build failures
in various of the standard package tests, usually involving
reflect. Something deep and subtle is broken but only on
the 32-bit systems.

Undo this CL for now.

««« original CL description
cmd/gc: fix x=x crash

The 'nodarg' function is used to obtain a Node*
representing a function argument or result.
It returned a brand new Node*, but that violates
the guarantee in most places in the compiler that
two Node*s refer to the same variable if and only if
they are the same Node* pointer. Reestablish that
invariant by making nodarg return a preexisting
named variable if present.

Having fixed that, avoid any copy during x=x in
componentgen, because the VARDEF we emit
before the copy marks the lhs x as dead incorrectly.

The change in walk.c avoids modifying the result
of nodarg. This was the only place in the compiler
that did so.

Fixes #8097.

LGTM=r, khr
R=golang-codereviews, r, khr
CC=golang-codereviews, iant
https://golang.org/cl/102820043
»»»

TBR=r
CC=golang-codereviews, khr
https://golang.org/cl/95660043
2014-05-28 21:46:20 -04:00
Russ Cox
948b2c722b cmd/gc: fix x=x crash
The 'nodarg' function is used to obtain a Node*
representing a function argument or result.
It returned a brand new Node*, but that violates
the guarantee in most places in the compiler that
two Node*s refer to the same variable if and only if
they are the same Node* pointer. Reestablish that
invariant by making nodarg return a preexisting
named variable if present.

Having fixed that, avoid any copy during x=x in
componentgen, because the VARDEF we emit
before the copy marks the lhs x as dead incorrectly.

The change in walk.c avoids modifying the result
of nodarg. This was the only place in the compiler
that did so.

Fixes #8097.

LGTM=r, khr
R=golang-codereviews, r, khr
CC=golang-codereviews, iant
https://golang.org/cl/102820043
2014-05-28 19:50:19 -04:00
Russ Cox
f5184d3437 cmd/gc: correct handling of globals, func args, results
Globals, function arguments, and results are special cases in
registerization.

Globals must be flushed aggressively, because nearly any
operation can cause a panic, and the recovery code must see
the latest values. Globals also must be loaded aggressively,
because nearly any store through a pointer might be updating a
global: the compiler cannot see all the "address of"
operations on globals, especially exported globals. To
accomplish this, mark all globals as having their address
taken, which effectively disables registerization.

If a function contains a defer statement, the function results
must be flushed aggressively, because nearly any operation can
cause a panic, and the deferred code may call recover, causing
the original function to return the current values of its
function results. To accomplish this, mark all function
results as having their address taken if the function contains
any defer statements. This causes not just aggressive flushing
but also aggressive loading. The aggressive loading is
overkill but the best we can do in the current code.

Function arguments must be considered live at all safe points
in a function, because garbage collection always preserves
them: they must be up-to-date in order to be preserved
correctly. Accomplish this by marking them live at all call
sites. An earlier attempt at this marked function arguments as
having their address taken, which disabled registerization
completely, making programs slower. This CL's solution allows
registerization while preserving safety. The benchmark speedup
is caused by being able to registerize again (the earlier CL
lost the same amount).

benchmark                old ns/op     new ns/op     delta
BenchmarkEqualPort32     61.4          56.0          -8.79%

benchmark                old MB/s     new MB/s     speedup
BenchmarkEqualPort32     521.56       570.97       1.09x

Fixes #1304. (again)
Fixes #7944. (again)
Fixes #7984.
Fixes #7995.

LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, iant, r
https://golang.org/cl/97500044
2014-05-15 15:34:53 -04:00
Russ Cox
26ad5d4ff0 cmd/gc: fix liveness vs regopt mismatch for input variables
The inputs to a function are marked live at all times in the
liveness bitmaps, so that the garbage collector will not free
the things they point at and reuse the pointers, so that the
pointers shown in stack traces are guaranteed not to have
been recycled.

Unfortunately, no one told the register optimizer that the
inputs need to be preserved at all call sites. If a function
is done with a particular input value, the optimizer will stop
preserving it across calls. For single-word values this just
means that the value recorded might be stale. For multi-word
values like slices, the value recorded could be only partially stale:
it can happen that, say, the cap was updated but not the len,
or that the len was updated but not the base pointer.
Either of these possibilities (and others) would make the
garbage collector misinterpret memory, leading to memory
corruption.

This came up in a real program, in which the garbage collector's
'slice len ≤ slice cap' check caught the inconsistency.

Fixes #7944.

LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, khr
https://golang.org/cl/100370045
2014-05-12 17:19:02 -04:00
Josh Bleecher Snyder
03c0f3fea9 cmd/gc: alias more variables during register allocation
This is joint work with Daniel Morsing.

In order for the register allocator to alias two variables, they must have the same width, stack offset, and etype. Code generation was altering a variable's etype in a few places. This prevented the variable from being moved to a register, which in turn prevented peephole optimization. This failure to alias was very common, with almost 23,000 instances just running make.bash.

This phenomenon was not visible in the register allocation debug output because the variables that failed to alias had the same name. The debugging-only change to bits.c fixes this by printing the variable number with its name.

This CL fixes the source of all etype mismatches for 6g, all but one case for 8g, and depressingly few cases for 5g. (I believe that extending CL 6819083 to 5g is a prerequisite.) Fixing the remaining cases in 8g and 5g is work for the future.

The etype mismatch fixes are:

* [gc] Slicing changed the type of the base pointer into a uintptr in order to perform arithmetic on it. Instead, support addition directly on pointers.

* [*g] OSPTR was giving type uintptr to slice base pointers; undo that. This arose, for example, while compiling copy(dst, src).

* [8g] 64 bit float conversion was assigning int64 type during codegen, overwriting the existing uint64 type.

Note that some etype mismatches are appropriate, such as a struct with a single field or an array with a single element.

With these fixes, the number of registerizations that occur while running make.bash for 6g increases ~10%. Hello world binary size shrinks ~1.5%. Running all benchmarks in the standard library show performance improvements ranging from nominal to substantive (>10%); a full comparison using 6g on my laptop is available at https://gist.github.com/josharian/8f9b5beb46667c272064. The microbenchmarks must be taken with a grain of salt; see issue 7920. The few benchmarks that show real regressions are likely due to issue 7920. I manually examined the generated code for the top few regressions and none had any assembly output changes. The few benchmarks that show extraordinary improvements are likely also due to issue 7920.

Performance results from 8g appear similar to 6g.

5g shows no performance improvements. This is not surprising, given the discussion above.

Update #7316

LGTM=rsc
R=rsc, daniel.morsing, bradfitz
CC=dave, golang-codereviews
https://golang.org/cl/91850043
2014-05-12 17:10:36 -04:00
Josh Bleecher Snyder
1848d71445 cmd/gc: don't give credit for NOPs during register allocation
The register allocator decides which variables should be placed into registers by charging for each load/store and crediting for each use, and then selecting an allocation with minimal cost. NOPs will be eliminated, however, so using a variable in a NOP should not generate credit.

Issue 7867 arises from attempted registerization of multi-word variables because they are used in NOPs. By not crediting for that use, they will no longer be considered for registerization.

This fix could theoretically lead to better register allocation, but NOPs are rare relative to other instructions.

Fixes #7867.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/94810044
2014-05-09 09:55:17 -07:00
Russ Cox
0a8a719ded cmd/5g, cmd/6g, cmd/8g: preserve wide values in large functions
In large functions with many variables, the register optimizer
may give up and choose not to track certain variables at all.
In this case, the "nextinnode" information linking together
all the words from a given variable will be incomplete, and
the result may be that only some of a multiword value is
preserved across a call. That confuses the garbage collector,
so don't do that. Instead, mark those variables as having
their address taken, so that they will be preserved at all
calls. It's overkill, but correct.

Tested by hand using the 6g -S output to see that it does fix
the buggy generated code leading to the issue 7726 failure.

There is no automated test because I managed to break the
compiler while writing a test (see issue 7727). I will check
in a test along with the fix to issue 7727.

Fixes #7726.

LGTM=khr
R=khr, bradfitz, dave
CC=golang-codereviews
https://golang.org/cl/85200043
2014-04-16 13:59:42 -04:00
Russ Cox
844ec6bbe3 cmd/8g: fix liveness for 387 build (including plan9)
TBR=khr
CC=golang-codereviews
https://golang.org/cl/84570045
2014-04-06 10:30:02 -04:00
Rémy Oudompheng
c6a41a3559 cmd/6g, cmd/8g: disable Duff's device on NaCl.
Native Client forbids jumps/calls to arbitrary locations and
enforces a particular alignement, which makes the Duff's device
ineffective.

LGTM=khr
R=rsc, dave, khr
CC=golang-codereviews
https://golang.org/cl/84400043
2014-04-04 08:42:35 +02:00
David du Colombier
9f9c9abb7e cmd/8g, cmd/gc: fix warnings on Plan 9
warning: src/cmd/8g/ggen.c:35 non-interruptable temporary
warning: src/cmd/gc/walk.c:656 set and not used: l
warning: src/cmd/gc/walk.c:658 set and not used: l

LGTM=minux.ma
R=golang-codereviews, minux.ma
CC=golang-codereviews
https://golang.org/cl/83660043
2014-04-02 21:33:50 +02:00
Keith Randall
47acf16709 cmd/gc: Don't zero more than we need.
Don't merge with the zero range, we may
end up zeroing more than we need.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/83430044
2014-04-02 09:17:42 -07:00
Keith Randall
383963b506 runtime: zero at start of frame more efficiently.
Use Duff's device for zeroing.  Combine adjacent regions.

Update #7680
Update #7624

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/83200045
2014-04-01 19:44:07 -07:00
Russ Cox
9c8f11ff96 cmd/5g, cmd/8g: fix build
Botched during CL 83090046.

TBR=khr
CC=golang-codereviews
https://golang.org/cl/83070046
2014-04-01 20:24:53 -04:00
Russ Cox
daca06f2e3 cmd/gc: shorten more temporary lifetimes
1. In functions with heap-allocated result variables or with
defer statements, the return sequence requires more than
just a single RET instruction. There is an optimization that
arranges for all returns to jump to a single copy of the return
epilogue in this case. Unfortunately, that optimization is
fundamentally incompatible with PC-based liveness information:
it takes PCs at many different points in the function and makes
them all land at one PC, making the combined liveness information
at that target PC a mess. Disable this optimization, so that each
return site gets its own copy of the 'call deferreturn' and the
copying of result variables back from the heap.
This removes quite a few spurious 'ambiguously live' variables.

2. Let orderexpr allocate temporaries that are passed by address
to a function call and then die on return, so that we can arrange
an appropriate VARKILL.

2a. Do this for ... slices.

2b. Do this for closure structs.

2c. Do this for runtime.concatstring, which is the implementation
of large string additions. Change representation of OADDSTR to
an explicit list in typecheck to avoid reconstructing list in both
walk and order.

3. Let orderexpr allocate the temporary variable copies used for
range loops, so that they can be killed when the loop is over.
Similarly, let it allocate the temporary holding the map iterator.

CL 81940043 reduced the number of ambiguously live temps
in the godoc binary from 860 to 711.

This CL reduces the number to 121. Still more to do, but another
good checkpoint.

Update #7345

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83090046
2014-04-01 20:02:54 -04:00
Keith Randall
6c7cbf086c runtime: get rid of most uses of REP for copying/zeroing.
REP MOVSQ and REP STOSQ have a really high startup overhead.
Use a Duff's device to do the repetition instead.

benchmark                 old ns/op     new ns/op     delta
BenchmarkClearFat32       7.20          1.60          -77.78%
BenchmarkCopyFat32        6.88          2.38          -65.41%
BenchmarkClearFat64       7.15          3.20          -55.24%
BenchmarkCopyFat64        6.88          3.44          -50.00%
BenchmarkClearFat128      9.53          5.34          -43.97%
BenchmarkCopyFat128       9.27          5.56          -40.02%
BenchmarkClearFat256      13.8          9.53          -30.94%
BenchmarkCopyFat256       13.5          10.3          -23.70%
BenchmarkClearFat512      22.3          18.0          -19.28%
BenchmarkCopyFat512       22.0          19.7          -10.45%
BenchmarkCopyFat1024      36.5          38.4          +5.21%
BenchmarkClearFat1024     35.1          35.0          -0.28%

TODO: use for stack frame zeroing
TODO: REP prefixes are still used for "reverse" copying when src/dst
regions overlap.  Might be worth fixing.

LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews, r
https://golang.org/cl/81370046
2014-04-01 12:51:02 -07:00
Russ Cox
b700cb4974 cmd/gc: shorten temporary lifetimes when possible
The new channel and map runtime routines take pointers
to values, typically temporaries. Without help, the compiler
cannot tell when those temporaries stop being needed,
because it isn't sure what happened to the pointer.
Arrange to insert explicit VARKILL instructions for these
temporaries so that the liveness analysis can avoid seeing
them as "ambiguously live".

The change is made in order.c, which was already in charge of
introducing temporaries to preserve the order-of-evaluation
guarantees. Now its job has expanded to include introducing
temporaries as needed by runtime routines, and then also
inserting the VARKILL annotations for all these temporaries,
so that their lifetimes can be shortened.

In order to do its job for the map runtime routines, order.c arranges
that all map lookups or map assignments have the form:

        x = m[k]
        x, y = m[k]
        m[k] = x

where x, y, and k are simple variables (often temporaries).
Likewise, receiving from a channel is now always:

        x = <-c

In order to provide the map guarantee, order.c is responsible for
rewriting x op= y into x = x op y, so that m[k] += z becomes

        t = m[k]
        t2 = t + z
        m[k] = t2

While here, fix a few bugs in order.c's traversal: it was failing to
walk into select and switch case bodies, so order of evaluation
guarantees were not preserved in those situations.
Added tests to test/reorder2.go.

Fixes #7671.

In gc/popt's temporary-merging optimization, allow merging
of temporaries with their address taken as long as the liveness
ranges do not intersect. (There is a good chance of that now
that we have VARKILL annotations to limit the liveness range.)

Explicitly killing temporaries cuts the number of ambiguously
live temporaries that must be zeroed in the godoc binary from
860 to 711, or -17%. There is more work to be done, but this
is a good checkpoint.

Update #7345

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/81940043
2014-04-01 13:31:38 -04:00
Russ Cox
6722d45631 cmd/gc: liveness-related bug fixes
1. On entry to a function, only zero the ambiguously live stack variables.
Before, we were zeroing all stack variables containing pointers.
The zeroing is pretty inefficient right now (issue 7624), but there are also
too many stack variables detected as ambiguously live (issue 7345),
and that must be addressed before deciding how to improve the zeroing code.
(Changes in 5g/ggen.c, 6g/ggen.c, 8g/ggen.c, gc/pgen.c)

Fixes #7647.

2. Make the regopt word-based liveness analysis preserve the
whole-variable liveness property expected by the garbage collection
bitmap liveness analysis. That is, if the regopt liveness decides that
one word in a struct needs to be preserved, make sure it preserves
the entire struct. This is particularly important for multiword values
such as strings, slices, and interfaces, in which all the words need
to be present in order to understand the meaning.
(Changes in 5g/reg.c, 6g/reg.c, 8g/reg.c.)

Fixes #7591.

3. Make the regopt word-based liveness analysis treat a variable
as having its address taken - which makes it preserved across
all future calls - whenever n->addrtaken is set, for consistency
with the gc bitmap liveness analysis, even if there is no machine
instruction actually taking the address. In this case n->addrtaken
is incorrect (a nicer way to put it is overconservative), and ideally
there would be no such cases, but they can happen and the two
analyses need to agree.
(Changes in 5g/reg.c, 6g/reg.c, 8g/reg.c; test in bug484.go.)

Fixes crashes found by turning off "zero everything" in step 1.

4. Remove spurious VARDEF annotations. As the comment in
gc/pgen.c explains, the VARDEF must immediately precede
the initialization. It cannot be too early, and it cannot be too late.
In particular, if a function call sits between the VARDEF and the
actual machine instructions doing the initialization, the variable
will be treated as live during that function call even though it is
uninitialized, leading to problems.
(Changes in gc/gen.c; test in live.go.)

Fixes crashes found by turning off "zero everything" in step 1.

5. Do not treat loading the address of a wide value as a signal
that the value must be initialized. Instead depend on the existence
of a VARDEF or the first actual read/write of a word in the value.
If the load is in order to pass the address to a function that does
the actual initialization, treating the load as an implicit VARDEF
causes the same problems as described in step 4.
The alternative is to arrange to zero every such value before
passing it to the real initialization function, but this is a much
easier and more efficient change.
(Changes in gc/plive.c.)

Fixes crashes found by turning off "zero everything" in step 1.

6. Treat wide input parameters with their address taken as
initialized on entry to the function. Otherwise they look
"ambiguously live" and we will try to emit code to zero them.
(Changes in gc/plive.c.)

Fixes crashes found by turning off "zero everything" in step 1.

7. An array of length 0 has no pointers, even if the element type does.
Without this change, the zeroing code complains when asked to
clear a 0-length array.
(Changes in gc/reflect.c.)

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/80160044
2014-03-27 14:05:57 -04:00
Rémy Oudompheng
0285d2b96b cmd/6g, cmd/8g: skip CONVNOP nodes in bgen.
Revision 3ae4607a43ff introduced CONVNOP layers
to fix type checking issues arising from comparisons.
The added complexity made 8g run out of registers
when compiling an equality function in go.net/ipv6.

A similar issue occurred in test/sizeof.go on
amd64p32 with 6g.

Fixes #7405.

LGTM=khr
R=rsc, dave, iant, khr
CC=golang-codereviews
https://golang.org/cl/78100044
2014-03-20 22:22:37 +01:00
Keith Randall
14664050b9 cmd/6g: do small zeroings with straightline code.
Removes most uses of the REP prefix, which has a high startup cost.

LGTM=iant
R=golang-codereviews, iant, khr
CC=golang-codereviews
https://golang.org/cl/77920043
2014-03-19 15:41:34 -07:00
Dave Cheney
e31a1ce109 cmd/gc, cmd/5g, cmd/6g, cmd/8g: introduce linkarchinit and add amd64p32 support
Replaces CL 70000043.

Introduce linkarchinit() from cmd/ld.

For cmd/6g, switch to the amd64p32 linker model if we are building under nacl/amd64p32.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/71330045
2014-03-07 15:33:44 +11:00
Russ Cox
c2dd33a46f cmd/ld: clear unused ctxt before morestack
For non-closure functions, the context register is uninitialized
on entry and will not be used, but morestack saves it and then the
garbage collector treats it as live. This can be a source of memory
leaks if the context register points at otherwise dead memory.
Avoid this by introducing a parallel set of morestack functions
that clear the context register, and use those for the non-closure functions.

I hope this will help with some of the finalizer flakiness, but it probably won't.

Fixes #7244.

LGTM=dvyukov
R=khr, dvyukov
CC=golang-codereviews
https://golang.org/cl/71030044
2014-03-04 13:53:08 -05:00
Ian Lance Taylor
fa6375ea47 cmd/6g, cmd/8g: simplify calls to gvardef
The gvardef function does nothing if n->class == PEXTERN, so
we don't need to test for that before calling it.  This makes
the 6g/8g code more like the 5g code and clarifies that the
cases that do not test for n->class != PEXTERN are not buggy.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/68900044
2014-02-26 07:37:10 -08:00
Josh Bleecher Snyder
d97d37188a 5g, 8g: remove dead code
maxstksize is superfluous and appears to be vestigial. 6g does not use it.

c >= 4 cannot occur; c = w % 4.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/68750043
2014-02-25 14:43:53 -08:00
Dave Cheney
7c8280c9ef all: merge NaCl branch (part 1)
See golang.org/s/go13nacl for design overview.

This CL is the mostly mechanical changes from rsc's Go 1.2 based NaCl branch, specifically 39cb35750369 to 500771b477cf from https://code.google.com/r/rsc-go13nacl. This CL does not include working NaCl support, there are probably two or three more large merges to come.

CL 15750044 is not included as it involves more invasive changes to the linker which will need to be merged separately.

The exact change lists included are

15050047: syscall: support for Native Client
15360044: syscall: unzip implementation for Native Client
15370044: syscall: Native Client SRPC implementation
15400047: cmd/dist, cmd/go, go/build, test: support for Native Client
15410048: runtime: support for Native Client
15410049: syscall: file descriptor table for Native Client
15410050: syscall: in-memory file system for Native Client
15440048: all: update +build lines for Native Client port
15540045: cmd/6g, cmd/8g, cmd/gc: support for Native Client
15570045: os: support for Native Client
15680044: crypto/..., hash/crc32, reflect, sync/atomic: support for amd64p32
15690044: net: support for Native Client
15690048: runtime: support for fake time like on Go Playground
15690051: build: disable various tests on Native Client

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/68150047
2014-02-25 09:47:42 -05:00
Russ Cox
1ca1cbea65 cmd/5g, cmd/8g: zero ambiguously live values on entry
The code here is being restored after its deletion in CL 14430048.

I restored the copy in cmd/6g in CL 56430043 but neglected the
other two.

This is the reason that enabling precisestack only worked on amd64.

LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/66170043
2014-02-19 17:08:55 -05:00