1
0
mirror of https://github.com/golang/go synced 2024-11-14 08:30:25 -07:00
Commit Graph

2980 Commits

Author SHA1 Message Date
Josh Bleecher Snyder
c5f0104daf cmd/compile: use intrinsic for LeadingZeros8 on amd64
The previous change sped up the pure computation form of LeadingZeros8.
This places it somewhat close to the table lookup form.
Depending on something that varies from toolchain to toolchain
(alignment, perhaps?), the slowdown from ditching the table lookup
is either 20% or 5%.

This benchmark is the best case scenario for the table lookup:
It is in the L1 cache already.

I think we're close enough that we can switch to the computational version,
and trust that the memory effects and binary size savings will be worth it.

Code:

func f8(x uint8)   { z = bits.LeadingZeros8(x) }

Before:

"".f8 STEXT nosplit size=34 args=0x8 locals=0x0
	0x0000 00000 (x.go:7)	TEXT	"".f8(SB), NOSPLIT, $0-8
	0x0000 00000 (x.go:7)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
	0x0000 00000 (x.go:7)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0000 00000 (x.go:7)	MOVBLZX	"".x+8(SP), AX
	0x0005 00005 (x.go:7)	MOVBLZX	AL, AX
	0x0008 00008 (x.go:7)	LEAQ	math/bits.len8tab(SB), CX
	0x000f 00015 (x.go:7)	MOVBLZX	(CX)(AX*1), AX
	0x0013 00019 (x.go:7)	ADDQ	$-8, AX
	0x0017 00023 (x.go:7)	NEGQ	AX
	0x001a 00026 (x.go:7)	MOVQ	AX, "".z(SB)
	0x0021 00033 (x.go:7)	RET

After:

"".f8 STEXT nosplit size=30 args=0x8 locals=0x0
	0x0000 00000 (x.go:7)	TEXT	"".f8(SB), NOSPLIT, $0-8
	0x0000 00000 (x.go:7)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
	0x0000 00000 (x.go:7)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0000 00000 (x.go:7)	MOVBLZX	"".x+8(SP), AX
	0x0005 00005 (x.go:7)	MOVBLZX	AL, AX
	0x0008 00008 (x.go:7)	LEAL	1(AX)(AX*1), AX
	0x000c 00012 (x.go:7)	BSRL	AX, AX
	0x000f 00015 (x.go:7)	ADDQ	$-8, AX
	0x0013 00019 (x.go:7)	NEGQ	AX
	0x0016 00022 (x.go:7)	MOVQ	AX, "".z(SB)
	0x001d 00029 (x.go:7)	RET

Change-Id: Icc7db50a7820fb9a3da8a816d6b6940d7f8e193e
Reviewed-on: https://go-review.googlesource.com/108942
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-25 21:34:15 +00:00
Josh Bleecher Snyder
1d321ada73 cmd/compile: optimize LeadingZeros(16|32) on amd64
Introduce Len8 and Len16 ops and provide optimized lowerings for them.
amd64 only for this CL, although it wouldn't surprise me
if other architectures also admit of optimized lowerings.

Also use and optimize the Len32 lowering, along the same lines.

Leave Len8 unused for the moment; a subsequent CL will enable it.

For 16 and 32 bits, this leads to a speed-up.

name              old time/op  new time/op  delta
LeadingZeros16-8  1.42ns ± 5%  1.23ns ± 5%  -13.42%  (p=0.000 n=20+20)
LeadingZeros32-8  1.25ns ± 5%  1.03ns ± 5%  -17.63%  (p=0.000 n=20+16)

Code:

func f16(x uint16) { z = bits.LeadingZeros16(x) }
func f32(x uint32) { z = bits.LeadingZeros32(x) }

Before:

"".f16 STEXT nosplit size=38 args=0x8 locals=0x0
	0x0000 00000 (x.go:8)	TEXT	"".f16(SB), NOSPLIT, $0-8
	0x0000 00000 (x.go:8)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
	0x0000 00000 (x.go:8)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0000 00000 (x.go:8)	MOVWLZX	"".x+8(SP), AX
	0x0005 00005 (x.go:8)	MOVWLZX	AX, AX
	0x0008 00008 (x.go:8)	BSRQ	AX, AX
	0x000c 00012 (x.go:8)	MOVQ	$-1, CX
	0x0013 00019 (x.go:8)	CMOVQEQ	CX, AX
	0x0017 00023 (x.go:8)	ADDQ	$-15, AX
	0x001b 00027 (x.go:8)	NEGQ	AX
	0x001e 00030 (x.go:8)	MOVQ	AX, "".z(SB)
	0x0025 00037 (x.go:8)	RET

"".f32 STEXT nosplit size=34 args=0x8 locals=0x0
	0x0000 00000 (x.go:9)	TEXT	"".f32(SB), NOSPLIT, $0-8
	0x0000 00000 (x.go:9)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
	0x0000 00000 (x.go:9)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0000 00000 (x.go:9)	MOVL	"".x+8(SP), AX
	0x0004 00004 (x.go:9)	BSRQ	AX, AX
	0x0008 00008 (x.go:9)	MOVQ	$-1, CX
	0x000f 00015 (x.go:9)	CMOVQEQ	CX, AX
	0x0013 00019 (x.go:9)	ADDQ	$-31, AX
	0x0017 00023 (x.go:9)	NEGQ	AX
	0x001a 00026 (x.go:9)	MOVQ	AX, "".z(SB)
	0x0021 00033 (x.go:9)	RET

After:

"".f16 STEXT nosplit size=30 args=0x8 locals=0x0
	0x0000 00000 (x.go:8)	TEXT	"".f16(SB), NOSPLIT, $0-8
	0x0000 00000 (x.go:8)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
	0x0000 00000 (x.go:8)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0000 00000 (x.go:8)	MOVWLZX	"".x+8(SP), AX
	0x0005 00005 (x.go:8)	MOVWLZX	AX, AX
	0x0008 00008 (x.go:8)	LEAL	1(AX)(AX*1), AX
	0x000c 00012 (x.go:8)	BSRL	AX, AX
	0x000f 00015 (x.go:8)	ADDQ	$-16, AX
	0x0013 00019 (x.go:8)	NEGQ	AX
	0x0016 00022 (x.go:8)	MOVQ	AX, "".z(SB)
	0x001d 00029 (x.go:8)	RET

"".f32 STEXT nosplit size=28 args=0x8 locals=0x0
	0x0000 00000 (x.go:9)	TEXT	"".f32(SB), NOSPLIT, $0-8
	0x0000 00000 (x.go:9)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
	0x0000 00000 (x.go:9)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0000 00000 (x.go:9)	MOVL	"".x+8(SP), AX
	0x0004 00004 (x.go:9)	LEAQ	1(AX)(AX*1), AX
	0x0009 00009 (x.go:9)	BSRQ	AX, AX
	0x000d 00013 (x.go:9)	ADDQ	$-32, AX
	0x0011 00017 (x.go:9)	NEGQ	AX
	0x0014 00020 (x.go:9)	MOVQ	AX, "".z(SB)
	0x001b 00027 (x.go:9)	RET

Change-Id: I6c93c173752a7bfdeab8be30777ae05a736e1f4b
Reviewed-on: https://go-review.googlesource.com/108941
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-25 21:34:04 +00:00
Josh Bleecher Snyder
54dbab5221 cmd/compile: optimize TrailingZeros(8|16) on amd64
Introduce Ctz8 and Ctz16 ops and provide optimized lowerings for them.
amd64 only for this CL, although it wouldn't surprise me
if other architectures also admit of optimized lowerings.

name               old time/op  new time/op  delta
TrailingZeros8-8   1.33ns ± 6%  0.84ns ± 3%  -36.90%  (p=0.000 n=20+20)
TrailingZeros16-8  1.26ns ± 5%  0.84ns ± 5%  -33.50%  (p=0.000 n=20+18)

Code:

func f8(x uint8)   { z = bits.TrailingZeros8(x) }
func f16(x uint16) { z = bits.TrailingZeros16(x) }

Before:

"".f8 STEXT nosplit size=34 args=0x8 locals=0x0
	0x0000 00000 (x.go:7)	TEXT	"".f8(SB), NOSPLIT, $0-8
	0x0000 00000 (x.go:7)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
	0x0000 00000 (x.go:7)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0000 00000 (x.go:7)	MOVBLZX	"".x+8(SP), AX
	0x0005 00005 (x.go:7)	MOVBLZX	AL, AX
	0x0008 00008 (x.go:7)	BTSQ	$8, AX
	0x000d 00013 (x.go:7)	BSFQ	AX, AX
	0x0011 00017 (x.go:7)	MOVL	$64, CX
	0x0016 00022 (x.go:7)	CMOVQEQ	CX, AX
	0x001a 00026 (x.go:7)	MOVQ	AX, "".z(SB)
	0x0021 00033 (x.go:7)	RET

"".f16 STEXT nosplit size=34 args=0x8 locals=0x0
	0x0000 00000 (x.go:8)	TEXT	"".f16(SB), NOSPLIT, $0-8
	0x0000 00000 (x.go:8)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
	0x0000 00000 (x.go:8)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0000 00000 (x.go:8)	MOVWLZX	"".x+8(SP), AX
	0x0005 00005 (x.go:8)	MOVWLZX	AX, AX
	0x0008 00008 (x.go:8)	BTSQ	$16, AX
	0x000d 00013 (x.go:8)	BSFQ	AX, AX
	0x0011 00017 (x.go:8)	MOVL	$64, CX
	0x0016 00022 (x.go:8)	CMOVQEQ	CX, AX
	0x001a 00026 (x.go:8)	MOVQ	AX, "".z(SB)
	0x0021 00033 (x.go:8)	RET

After:

"".f8 STEXT nosplit size=20 args=0x8 locals=0x0
	0x0000 00000 (x.go:7)	TEXT	"".f8(SB), NOSPLIT, $0-8
	0x0000 00000 (x.go:7)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
	0x0000 00000 (x.go:7)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0000 00000 (x.go:7)	MOVBLZX	"".x+8(SP), AX
	0x0005 00005 (x.go:7)	BTSL	$8, AX
	0x0009 00009 (x.go:7)	BSFL	AX, AX
	0x000c 00012 (x.go:7)	MOVQ	AX, "".z(SB)
	0x0013 00019 (x.go:7)	RET

"".f16 STEXT nosplit size=20 args=0x8 locals=0x0
	0x0000 00000 (x.go:8)	TEXT	"".f16(SB), NOSPLIT, $0-8
	0x0000 00000 (x.go:8)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
	0x0000 00000 (x.go:8)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0000 00000 (x.go:8)	MOVWLZX	"".x+8(SP), AX
	0x0005 00005 (x.go:8)	BTSL	$16, AX
	0x0009 00009 (x.go:8)	BSFL	AX, AX
	0x000c 00012 (x.go:8)	MOVQ	AX, "".z(SB)
	0x0013 00019 (x.go:8)	RET

Change-Id: I0551e357348de2b724737d569afd6ac9f5c3aa11
Reviewed-on: https://go-review.googlesource.com/108940
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-25 21:33:52 +00:00
Ilya Tocar
fb017c60bc cmd/compile/internal/ssa: fix endless compile loop on AMD64
We currently rewrite
(TESTQ (MOVQconst [c] x)) into (TESTQconst [c] x)
and (TESTQconst [-1] x) into (TESTQ x x)
if x is a (MOVQconst [-1]) we will be stuck in the endless rewrite loop.
Don't perform the rewrite in such cases.

Fixes #25006

Change-Id: I77f561ba2605fc104f1e5d5c57f32e9d67a2c000
Reviewed-on: https://go-review.googlesource.com/108879
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-24 16:20:41 +00:00
Josh Bleecher Snyder
d292f77e95 cmd/compile: rewrite 2*x+c into LEAx1 on amd64
Rewrite x<<1+c into x+x+c, which can be expressed as a single LEAQ/LEAL.

Bit of a special case, but the single-instruction
LEA is both shorter and faster than SHL then ADD.

Triggers 293 times during make.bash.

Change-Id: I3f09c8e9a8f3859d1eeed336f095fc3ada79c2c1
Reviewed-on: https://go-review.googlesource.com/108938
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-23 22:40:10 +00:00
Josh Bleecher Snyder
566e3e074c cmd/compile: avoid runtime call during switch string(byteslice)
This triggers three times while building std,
once in image/png and twice in go/internal/gccgoimporter.

There are no instances in std in which a more aggressive
optimization would have triggered.

This doesn't necessarily avoid an allocation,
because escape analysis is already able in many cases
to use a temporary backing for the string,
but it does at a minimum avoid the runtime call and copy.

Fixes #24937

Change-Id: I7019e85638ba8cd7e2f03890e672558b858579bc
Reviewed-on: https://go-review.googlesource.com/108035
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-04-21 00:50:50 +00:00
Lynn Boger
30311e8860 cmd/compile: generate load without DS relocation for go.string on ppc64le
Due to some recent optimizations related to the compare
instruction, DS-form load instructions started to be used
to load 8-byte go.strings. This can cause link time errors
if the go.string is not aligned to 4 bytes.

For DS-form instructions, the value in the offset field must
be a multiple of 4. If the offset is known at the time the
rules are processed, a DS-form load will not be chosen. But for
go.strings, the offset is not known at that time, but a
relocation is generated indicating that the linker should fill
in the DS relocation. When the linker tries to fill in the
relocation, if the offset is not aligned properly, a link error
will occur.

To fix this, when loading a go.string using MOVDload, the full
address of the go.string is generated and loaded into the base
register. Then the go.string is loaded with a 0 offset field.

Added a testcase that reproduces this problem.

Fixes #24799

Change-Id: I6a154e8e1cba64eae290be0fbcb608b75884ecdd
Reviewed-on: https://go-review.googlesource.com/107855
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-04-20 16:16:47 +00:00
Ben Shi
34f5f8a580 cmd/compile: optimize ARM64 with register indexed load/store
ARM64 supports load/store instructions with a memory operand that
the address is calculated by base register + index register.

In this CL,
1. Some rules are added to the compile's ARM64 backend to emit
such efficient instructions.
2. A wrong rule of load combination is fixed.

The go1 benchmark does show improvement.

name                     old time/op    new time/op    delta
BinaryTree17-4              44.5s ± 2%     44.1s ± 1%   -0.81%  (p=0.000 n=28+29)
Fannkuch11-4                32.7s ± 3%     30.5s ± 0%   -6.79%  (p=0.000 n=30+26)
FmtFprintfEmpty-4           499ns ± 0%     506ns ± 5%   +1.39%  (p=0.003 n=25+30)
FmtFprintfString-4         1.07µs ± 0%    1.04µs ± 4%   -3.17%  (p=0.000 n=23+30)
FmtFprintfInt-4            1.15µs ± 4%    1.13µs ± 0%   -1.55%  (p=0.000 n=30+23)
FmtFprintfIntInt-4         1.77µs ± 4%    1.74µs ± 0%   -1.71%  (p=0.000 n=30+24)
FmtFprintfPrefixedInt-4    2.37µs ± 5%    2.12µs ± 0%  -10.56%  (p=0.000 n=30+23)
FmtFprintfFloat-4          3.03µs ± 1%    3.03µs ± 4%   -0.13%  (p=0.003 n=25+30)
FmtManyArgs-4              7.38µs ± 1%    7.43µs ± 4%   +0.59%  (p=0.003 n=25+30)
GobDecode-4                 101ms ± 6%      95ms ± 5%   -5.55%  (p=0.000 n=30+30)
GobEncode-4                78.0ms ± 4%    78.8ms ± 6%   +1.05%  (p=0.000 n=30+30)
Gzip-4                      4.25s ± 0%     4.27s ± 4%   +0.45%  (p=0.003 n=24+30)
Gunzip-4                    428ms ± 1%     420ms ± 0%   -1.88%  (p=0.000 n=23+23)
HTTPClientServer-4          549µs ± 1%     541µs ± 1%   -1.56%  (p=0.000 n=29+29)
JSONEncode-4                194ms ± 0%     188ms ± 4%     ~     (p=0.417 n=23+30)
JSONDecode-4                890ms ± 5%     831ms ± 0%   -6.55%  (p=0.000 n=30+23)
Mandelbrot200-4            47.3ms ± 2%    46.5ms ± 0%     ~     (p=0.980 n=30+26)
GoParse-4                  43.1ms ± 6%    43.8ms ± 6%   +1.65%  (p=0.000 n=30+30)
RegexpMatchEasy0_32-4      1.06µs ± 0%    1.07µs ± 3%     ~     (p=0.092 n=23+30)
RegexpMatchEasy0_1K-4      5.53µs ± 0%    5.51µs ± 0%   -0.24%  (p=0.000 n=25+25)
RegexpMatchEasy1_32-4      1.02µs ± 3%    1.01µs ± 0%   -1.27%  (p=0.000 n=30+24)
RegexpMatchEasy1_1K-4      7.26µs ± 0%    7.33µs ± 0%   +0.95%  (p=0.000 n=23+26)
RegexpMatchMedium_32-4     1.84µs ± 7%    1.79µs ± 1%     ~     (p=0.333 n=30+23)
RegexpMatchMedium_1K-4      553µs ± 0%     547µs ± 0%   -1.14%  (p=0.000 n=24+22)
RegexpMatchHard_32-4       30.8µs ± 1%    30.3µs ± 0%   -1.40%  (p=0.000 n=24+24)
RegexpMatchHard_1K-4        928µs ± 0%     929µs ± 5%   +0.12%  (p=0.013 n=23+30)
Revcomp-4                   8.13s ± 4%     6.32s ± 1%  -22.23%  (p=0.000 n=30+23)
Template-4                  899ms ± 6%     854ms ± 1%   -5.01%  (p=0.000 n=30+24)
TimeParse-4                4.66µs ± 4%    4.59µs ± 1%   -1.57%  (p=0.000 n=30+23)
TimeFormat-4               4.58µs ± 0%    4.61µs ± 0%   +0.57%  (p=0.000 n=26+24)
[Geo mean]                  717µs          698µs        -2.55%

name                     old speed      new speed      delta
GobDecode-4              7.63MB/s ± 6%  8.08MB/s ± 5%   +5.88%  (p=0.000 n=30+30)
GobEncode-4              9.85MB/s ± 4%  9.75MB/s ± 6%   -1.04%  (p=0.000 n=30+30)
Gzip-4                   4.56MB/s ± 0%  4.55MB/s ± 4%   -0.36%  (p=0.003 n=24+30)
Gunzip-4                 45.3MB/s ± 1%  46.2MB/s ± 0%   +1.92%  (p=0.000 n=23+23)
JSONEncode-4             10.0MB/s ± 0%  10.4MB/s ± 4%     ~     (p=0.403 n=23+30)
JSONDecode-4             2.18MB/s ± 5%  2.33MB/s ± 0%   +6.91%  (p=0.000 n=30+23)
GoParse-4                1.34MB/s ± 5%  1.32MB/s ± 5%   -1.66%  (p=0.000 n=30+30)
RegexpMatchEasy0_32-4    30.2MB/s ± 0%  29.8MB/s ± 3%     ~     (p=0.099 n=23+30)
RegexpMatchEasy0_1K-4     185MB/s ± 0%   186MB/s ± 0%   +0.24%  (p=0.000 n=25+25)
RegexpMatchEasy1_32-4    31.4MB/s ± 3%  31.8MB/s ± 0%   +1.24%  (p=0.000 n=30+24)
RegexpMatchEasy1_1K-4     141MB/s ± 0%   140MB/s ± 0%   -0.94%  (p=0.000 n=23+26)
RegexpMatchMedium_32-4    541kB/s ± 6%   560kB/s ± 0%   +3.45%  (p=0.000 n=30+23)
RegexpMatchMedium_1K-4   1.85MB/s ± 0%  1.87MB/s ± 0%   +1.08%  (p=0.000 n=24+23)
RegexpMatchHard_32-4     1.04MB/s ± 1%  1.06MB/s ± 1%   +1.48%  (p=0.000 n=24+24)
RegexpMatchHard_1K-4     1.10MB/s ± 0%  1.10MB/s ± 5%   +0.15%  (p=0.004 n=23+30)
Revcomp-4                31.3MB/s ± 4%  40.2MB/s ± 1%  +28.52%  (p=0.000 n=30+23)
Template-4               2.16MB/s ± 6%  2.27MB/s ± 1%   +5.18%  (p=0.000 n=30+24)
[Geo mean]               7.57MB/s       7.79MB/s        +2.98%

fixes #24907

Change-Id: I94afd0e3f53d62a1cf5e452f3dd6daf61be21785
Reviewed-on: https://go-review.googlesource.com/107376
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-19 15:08:10 +00:00
Cherry Zhang
3042463d61 cmd/compile: in escape analysis, use element type for OIND of slice
The escape analysis models the flow of "content" of X with a
level of "indirection" (OIND node) of X. This content can be
pointer dereference, or slice/string element. For the latter
case, the type of the OIND node should be the element type of
the slice/string. This CL fixes this. In particular, this
matters when the element type is pointerless, where the data
flow should not cause any escape.

Fixes #15730.

Change-Id: Iba9f92898681625e7e3ddef76ae65d7cd61c41e0
Reviewed-on: https://go-review.googlesource.com/107597
Reviewed-by: David Chase <drchase@google.com>
2018-04-18 02:59:37 +00:00
Michael Munday
58cdecb9c8 cmd/compile: generate constants for NeqPtr, EqPtr and IsNonNil ops
If both inputs are constant offsets from the same pointer then we
can evaluate NeqPtr and EqPtr at compile time. Triggers a few times
during all.bash. Removes a conditional branch in the following
code:

copy(x[1:], x[:])

This branch was recently added as an optimization in CL 94596. We
now skip the memmove if the pointers are equal. However, in the
above code we know at compile time that they are never equal.

Also, when the offset is variable, check if the offset is zero
rather than if the pointers are equal. For example:

copy(x[a:], x[:])

This would now skip the copy if a == 0, rather than if x + a == x.

Finally I've also added a rule to make IsNonNil true for pointers
to values on the stack. The nil check elimination pass will catch
these anyway, but eliminating them here might eliminate branches
earlier.

Change-Id: If72f436fef0a96ad0f4e296d3a1f8b6c3e712085
Reviewed-on: https://go-review.googlesource.com/106635
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-16 20:43:57 +00:00
Ben Shi
cd65bbc01b cmd/compile/internal/ssa: optimize 386's subtraction
The SUBL instruction can take a memory operand, and this CL
implements this optimization.

The go1 benchmark shows a little improvement.

name                     old time/op    new time/op    delta
BinaryTree17-4              3.27s ± 2%     3.29s ± 3%    ~     (p=0.322 n=37+40)
Fannkuch11-4                3.49s ± 0%     3.53s ± 1%  +1.21%  (p=0.000 n=31+40)
FmtFprintfEmpty-4          46.2ns ± 3%    46.3ns ± 2%    ~     (p=0.351 n=40+28)
FmtFprintfString-4         82.0ns ± 3%    81.5ns ± 2%  -0.69%  (p=0.002 n=40+30)
FmtFprintfInt-4            94.6ns ± 3%    94.6ns ± 6%    ~     (p=0.913 n=39+37)
FmtFprintfIntInt-4          147ns ± 3%     150ns ± 2%  +1.72%  (p=0.000 n=40+25)
FmtFprintfPrefixedInt-4     186ns ± 3%     186ns ± 0%  -0.33%  (p=0.006 n=40+25)
FmtFprintfFloat-4           388ns ± 4%     388ns ± 4%    ~     (p=0.162 n=40+40)
FmtManyArgs-4               612ns ± 3%     616ns ± 4%    ~     (p=0.223 n=40+40)
GobDecode-4                7.35ms ± 5%    7.42ms ± 5%    ~     (p=0.095 n=40+40)
GobEncode-4                7.21ms ± 8%    7.23ms ± 4%    ~     (p=0.294 n=40+40)
Gzip-4                      360ms ± 4%     359ms ± 4%    ~     (p=0.097 n=40+40)
Gunzip-4                   46.1ms ± 3%    45.6ms ± 3%  -1.20%  (p=0.000 n=40+40)
HTTPClientServer-4         64.0µs ± 2%    64.1µs ± 2%    ~     (p=0.648 n=39+40)
JSONEncode-4               21.9ms ± 4%    22.1ms ± 5%    ~     (p=0.086 n=40+40)
JSONDecode-4               67.9ms ± 4%    66.7ms ± 4%  -1.63%  (p=0.000 n=40+40)
Mandelbrot200-4            5.19ms ± 3%    5.17ms ± 3%    ~     (p=0.881 n=40+40)
GoParse-4                  3.34ms ± 3%    3.28ms ± 2%  -1.78%  (p=0.000 n=40+40)
RegexpMatchEasy0_32-4       101ns ± 5%      99ns ± 3%  -2.40%  (p=0.000 n=40+40)
RegexpMatchEasy0_1K-4       851ns ± 1%     848ns ± 3%  -0.36%  (p=0.004 n=33+40)
RegexpMatchEasy1_32-4       109ns ± 5%     105ns ± 3%  -3.53%  (p=0.000 n=39+40)
RegexpMatchEasy1_1K-4      1.03µs ± 4%    1.03µs ± 3%    ~     (p=0.638 n=40+38)
RegexpMatchMedium_32-4      131ns ± 5%     127ns ± 4%  -3.36%  (p=0.000 n=38+40)
RegexpMatchMedium_1K-4     43.4µs ± 4%    43.2µs ± 3%  -0.46%  (p=0.008 n=40+40)
RegexpMatchHard_32-4       2.21µs ± 4%    2.23µs ± 1%  +0.77%  (p=0.014 n=40+28)
RegexpMatchHard_1K-4       67.6µs ± 4%    67.7µs ± 3%  +0.11%  (p=0.016 n=40+40)
Revcomp-4                   1.86s ± 3%     1.77s ± 2%  -4.81%  (p=0.000 n=40+40)
Template-4                 71.7ms ± 3%    71.6ms ± 4%    ~     (p=0.200 n=40+40)
TimeParse-4                 436ns ± 4%     433ns ± 3%    ~     (p=0.358 n=40+40)
TimeFormat-4                413ns ± 4%     412ns ± 3%    ~     (p=0.415 n=40+40)
[Geo mean]                 63.9µs         63.6µs       -0.49%

name                     old speed      new speed      delta
GobDecode-4               105MB/s ± 5%   104MB/s ± 5%    ~     (p=0.096 n=40+40)
GobEncode-4               106MB/s ± 7%   106MB/s ± 3%    ~     (p=0.385 n=39+40)
Gzip-4                   54.0MB/s ± 4%  54.0MB/s ± 4%    ~     (p=0.100 n=40+40)
Gunzip-4                  421MB/s ± 3%   426MB/s ± 3%  +1.21%  (p=0.000 n=40+40)
JSONEncode-4             88.5MB/s ± 5%  88.0MB/s ± 5%    ~     (p=0.083 n=40+40)
JSONDecode-4             28.6MB/s ± 4%  29.1MB/s ± 4%  +1.65%  (p=0.000 n=40+40)
GoParse-4                17.3MB/s ± 3%  17.7MB/s ± 2%  +1.82%  (p=0.000 n=40+40)
RegexpMatchEasy0_32-4     316MB/s ± 5%   323MB/s ± 4%  +2.44%  (p=0.000 n=40+40)
RegexpMatchEasy0_1K-4    1.20GB/s ± 1%  1.21GB/s ± 3%  +0.40%  (p=0.004 n=33+40)
RegexpMatchEasy1_32-4     291MB/s ± 7%   302MB/s ± 4%  +3.82%  (p=0.000 n=40+40)
RegexpMatchEasy1_1K-4     993MB/s ± 4%   990MB/s ± 3%    ~     (p=0.623 n=40+38)
RegexpMatchMedium_32-4   7.61MB/s ± 5%  7.87MB/s ± 4%  +3.36%  (p=0.000 n=38+40)
RegexpMatchMedium_1K-4   23.6MB/s ± 4%  23.7MB/s ± 4%  +0.46%  (p=0.007 n=40+40)
RegexpMatchHard_32-4     14.5MB/s ± 4%  14.3MB/s ± 1%  -0.79%  (p=0.017 n=40+28)
RegexpMatchHard_1K-4     15.1MB/s ± 4%  15.1MB/s ± 3%  -0.11%  (p=0.015 n=40+40)
Revcomp-4                 137MB/s ± 3%   144MB/s ± 3%  +5.06%  (p=0.000 n=40+40)
Template-4               27.1MB/s ± 3%  27.1MB/s ± 4%    ~     (p=0.211 n=40+40)
[Geo mean]               78.9MB/s       79.7MB/s       +1.01%

Change-Id: I638fa4fef85833e8605919d693f9570cc3cf7334
Reviewed-on: https://go-review.googlesource.com/107275
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-16 04:41:20 +00:00
Giovanni Bajo
e7b1d0a9cf test: add missing copyright header
Change-Id: Ia64535492515f725fe3c4b59ea300363a0c4ce10
Reviewed-on: https://go-review.googlesource.com/107136
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-15 21:17:54 +00:00
Giovanni Bajo
2954ef20bb test: small cleanup of code and comments in run.go
While writing CL 107315, I went back and forth for the syntax used for
constraints of build environments in which the architecture did not
support varitants ("plan9/amd64" vs "plan9/amd64/"). I eventually
settled for the latter because the code required less heuristics
(think parsing "plan9/386" vs "386/sse2") but there were a few
leftovers in code and comments.

Change-Id: I9d9a008f3814f9a1642609650eb571e7f1a675cf
Reviewed-on: https://go-review.googlesource.com/107338
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-15 21:17:43 +00:00
Giovanni Bajo
284ba47b49 test: run codegen tests on all supported architecture variants
This CL makes the codegen testsuite automatically test all
architecture variants for architecture specified in tests. For
instance, if a test file specifies a "arm" test, it will be
automatically run on all GOARM variants (5,6,7), to increase
the coverage.

The CL also introduces a syntax to specify only a specific
variant (eg: "arm/7") in case the test makes sense only there.
The same syntax also allows to specify the operating system
in case it matters (eg: "plan9/386/sse2").

Fixes #24658

Change-Id: I2eba8b918f51bb6a77a8431a309f8b71af07ea22
Reviewed-on: https://go-review.googlesource.com/107315
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-15 20:02:43 +00:00
Giovanni Bajo
01aa1d7dbe test: migrate plan9 tests to codegen
And remove it from asmtest. Next CL will remove the whole
asmtest infrastructure.

Change-Id: I5851bf7c617456d62a3c6cffacf70252df7b056b
Reviewed-on: https://go-review.googlesource.com/107335
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-15 20:02:30 +00:00
Cherry Zhang
b08a9b7ecc all: use new softfloat on GOARM=5
Use the new softfloat support in the compiler, originally added
for softfloat on MIPS. This support is portable, so we can just
use it for softfloat on ARM.

In the old softfloat support on ARM, the compiler generates
floating point instructions, then the assembler inserts calls
to _sfloat before FP instructions. _sfloat decodes the following
FP instructions and simulates them.

In the new scheme, the compiler generates runtime calls to do FP
operations at a higher level. It doesn't generate FP instructions,
and therefore the assembler won't insert _sfloat calls, i.e. the
old mechanism is automatically suppressed.

The old method may be still be triggered with assembly code
using FP instructions. In the standard library, the only
occurance is math/sqrt_arm.s, which is rewritten to call to the
Go implementation instead.

Some significant speedups for code using floating points:

name                     old time/op    new time/op     delta
BinaryTree17-4              37.1s ± 2%      37.3s ± 1%     ~     (p=0.105 n=10+10)
Fannkuch11-4                13.0s ± 0%      13.1s ± 0%   +0.46%  (p=0.000 n=10+10)
FmtFprintfEmpty-4           700ns ± 4%      734ns ± 6%   +4.84%  (p=0.009 n=10+10)
FmtFprintfString-4         1.22µs ± 3%     1.22µs ± 4%     ~     (p=0.897 n=10+10)
FmtFprintfInt-4            1.27µs ± 2%     1.30µs ± 1%   +1.91%  (p=0.001 n=10+9)
FmtFprintfIntInt-4         1.83µs ± 2%     1.81µs ± 3%     ~     (p=0.149 n=10+10)
FmtFprintfPrefixedInt-4    1.80µs ± 3%     1.81µs ± 2%     ~     (p=0.421 n=10+8)
FmtFprintfFloat-4          6.89µs ± 3%     3.59µs ± 2%  -47.93%  (p=0.000 n=10+10)
FmtManyArgs-4              6.39µs ± 1%     6.09µs ± 1%   -4.61%  (p=0.000 n=10+9)
GobDecode-4                 109ms ± 2%       81ms ± 2%  -25.99%  (p=0.000 n=9+10)
GobEncode-4                 109ms ± 2%       76ms ± 2%  -29.88%  (p=0.000 n=10+9)
Gzip-4                      3.61s ± 1%      3.59s ± 1%     ~     (p=0.247 n=10+10)
Gunzip-4                    449ms ± 4%      450ms ± 1%     ~     (p=0.230 n=10+7)
HTTPClientServer-4         1.55ms ± 3%     1.53ms ± 2%     ~     (p=0.400 n=9+10)
JSONEncode-4                356ms ± 1%      183ms ± 1%  -48.73%  (p=0.000 n=10+10)
JSONDecode-4                1.12s ± 2%      0.87s ± 1%  -21.88%  (p=0.000 n=10+10)
Mandelbrot200-4             5.49s ± 1%      2.55s ± 1%  -53.45%  (p=0.000 n=9+10)
GoParse-4                  49.6ms ± 2%     47.5ms ± 1%   -4.08%  (p=0.000 n=10+9)
RegexpMatchEasy0_32-4      1.13µs ± 4%     1.20µs ± 4%   +6.42%  (p=0.000 n=10+10)
RegexpMatchEasy0_1K-4      4.41µs ± 2%     4.44µs ± 2%     ~     (p=0.128 n=10+10)
RegexpMatchEasy1_32-4      1.15µs ± 5%     1.20µs ± 5%   +4.85%  (p=0.002 n=10+10)
RegexpMatchEasy1_1K-4      6.21µs ± 2%     6.37µs ± 4%   +2.62%  (p=0.001 n=9+10)
RegexpMatchMedium_32-4     1.58µs ± 5%     1.65µs ± 3%   +4.85%  (p=0.000 n=10+10)
RegexpMatchMedium_1K-4      341µs ± 3%      351µs ± 7%     ~     (p=0.573 n=8+10)
RegexpMatchHard_32-4       21.4µs ± 3%     21.5µs ± 5%     ~     (p=0.931 n=9+9)
RegexpMatchHard_1K-4        626µs ± 2%      626µs ± 1%     ~     (p=0.645 n=8+8)
Revcomp-4                  46.4ms ± 2%     47.4ms ± 2%   +2.07%  (p=0.000 n=10+10)
Template-4                  1.31s ± 3%      1.23s ± 4%   -6.13%  (p=0.000 n=10+10)
TimeParse-4                4.49µs ± 1%     4.41µs ± 2%   -1.81%  (p=0.000 n=10+9)
TimeFormat-4               9.31µs ± 1%     9.32µs ± 2%     ~     (p=0.561 n=9+9)

Change-Id: Iaeeff6c9a09c1b2c064d06e09dd88101dc02bfa4
Reviewed-on: https://go-review.googlesource.com/106735
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-13 16:39:39 +00:00
Cherry Zhang
5a91c83ce8 cmd/compile: in escape analysis, propagate loop depth to field
The escape analysis models "loop depth". If the address of an
expression is assigned to something defined at a lower (outer)
loop depth, the escape analysis decides it escapes. However, it
uses the loop depth of the address operator instead of where
the RHS is defined. This causes an unnecessary escape if there is
an assignment inside a loop but the RHS is defined outside the
loop. This CL propagates the loop depth.

Fixes #24730.

Change-Id: I5ff1530688bdfd90561a7b39c8be9bfc009a9dae
Reviewed-on: https://go-review.googlesource.com/105257
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-04-13 14:48:23 +00:00
Josh Bleecher Snyder
c1ed1f3c80 cmd/compile: fix evaluation of "" < s
Fixes #24817

Change-Id: Ifa79ab3dfe69297eeef85f7193cd5f85e5982bc5
Reviewed-on: https://go-review.googlesource.com/106655
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-04-12 19:38:37 +00:00
Josh Bleecher Snyder
2dfb423e6e cmd/compile: loop to ensure all autogenerated functions are compiled
I was wrong. There was a need to loop here.

Fixes #24761

Change-Id: If13b3ab72febde930bdaebdddd1c05e0d0446020
Reviewed-on: https://go-review.googlesource.com/105615
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-04-11 23:46:30 +00:00
Alberto Donizetti
467eca6076 test/codegen: port last stack and memcombining tests
And delete them from asm_test.

Also delete an arm64 cmov test has been already ported to the new test
harness.

Change-Id: I4458721e1f512bc9ecbbe1c22a2c9c7109ad68fe
Reviewed-on: https://go-review.googlesource.com/106335
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-04-11 16:08:04 +00:00
Robert Griesemer
3d501df441 cmd/compile: better error message when referring to ambiguous method/field
Fixes #14321.

Change-Id: I9c92c767b01cf7938c4808a8fef9f2936fc667bc
Reviewed-on: https://go-review.googlesource.com/106119
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-04-10 23:39:13 +00:00
Matthew Dempsky
535ad8efb8 cmd/compile: fix check that ensures main.main is a function
The check was previously disallowing package main from even importing
a non-function symbol named "main".

Fixes #24801.

Change-Id: I849b9713890429f0a16860ef16b5dc7e970d04a4
Reviewed-on: https://go-review.googlesource.com/106120
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-04-10 23:34:12 +00:00
Alberto Donizetti
188e2bf897 test/codegen: port arm64 BIC/EON/ORN and masking tests
And delete them from asm_test.

Change-Id: I24f421b87e8cb4770c887a6dfd58eacd0088947d
Reviewed-on: https://go-review.googlesource.com/106056
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-10 10:57:50 +00:00
Alberto Donizetti
d5ff631e6b test/codegen: port last remaining misc bit/arithmetic tests
And delete them from asm_test.

Change-Id: I9a75efe9858ef9d7ac86065f860c2ae3f25b0941
Reviewed-on: https://go-review.googlesource.com/105597
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2018-04-10 07:58:35 +00:00
Matthew Dempsky
49ed4cbe85 cmd/compile: sort method sets using package height
Also, when statically building itabs, compare *types.Sym instead of
name alone so that method sets with duplicate non-exported methods are
handled correctly.

Fixes #24693.

Change-Id: I2db8a3d6e80991a71fef5586a15134b6de116269
Reviewed-on: https://go-review.googlesource.com/105039
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-04-10 00:06:06 +00:00
Matthew Dempsky
fe77a5413e cmd/compile: fix constant pointer comparison failure
Previously, constant pointer-typed expressions could use either Mpint
or NilVal as their Val depending on their construction, but const.go
expects each type to have a single corresponding Val kind.

This CL changes pointer-typed expressions to exclusively use Mpint.

Fixes #21221.

Change-Id: I6ba36c9b11eb19a68306f0b296acb11a8c254c41
Reviewed-on: https://go-review.googlesource.com/105315
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-04-09 23:19:45 +00:00
Michael Munday
b65122f99a cmd/compile: optimize comparisons using load merging where available
Multi-byte comparison operations were used on amd64, arm64, i386
and s390x for comparisons with constant arrays, but only amd64 and
i386 for comparisons with string constants. This CL combines the
check for platform capability, since they have the same requirements,
and also enables both on ppc64le which also supports load merging.

Note that these optimizations currently use little endian byte order
which results in byte reversal instructions on s390x. This should
be fixed at some point.

Change-Id: Ie612d13359b50c77f4d7c6e73fea4a59fa11f322
Reviewed-on: https://go-review.googlesource.com/102558
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-09 21:16:47 +00:00
Keith Randall
0de0ed369f test: check that unaligned load-add opcodes work.
A test for CL 102036.

Change-Id: Ief6dcb4f478670813fbe22ea75a06815a4b201a3
Reviewed-on: https://go-review.googlesource.com/105875
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-09 18:57:37 +00:00
Alberto Donizetti
54c3f56ee0 test/codegen: port various mem-combining tests
And delete them from asm_test.

Change-Id: I0e33d58274951ab5acb67b0117b60ef617ea887a
Reviewed-on: https://go-review.googlesource.com/105735
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2018-04-09 12:00:06 +00:00
Alberto Donizetti
3e31eb6b84 test/codegen: port arm64 slice zeroing tests
Finish porting arm64 slice zeroing codegen tests; delete them from
asm_test.

Change-Id: Id2532df8ba1c340fa662a6b5238daa3de30548be
Reviewed-on: https://go-review.googlesource.com/105136
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-04-07 09:55:51 +00:00
Matthew Dempsky
950a56899a cmd/compile: fix method expressions with anonymous receivers
Method expressions with anonymous receiver types like "struct { T }.m"
require wrapper functions, which we weren't always creating. This in
turn resulted in linker errors.

This CL ensures that we generate wrapper functions for any anonymous
receiver types used in a method expression.

Fixes #22444.

Change-Id: Ia8ac27f238c2898965e57b82a91d959792d2ddd4
Reviewed-on: https://go-review.googlesource.com/105044
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-06 15:39:11 +00:00
Matthew Dempsky
638f112d69 cmd/compile: cleanup method symbol creation
There were multiple ad hoc ways to create method symbols, with subtle
and confusing differences between them. This CL unifies them into a
single well-documented encoding and implementation.

This introduces some inconsequential changes to symbol format for the
sake of simplicity and consistency. Two notable changes:

1) Symbol construction is now insensitive to the package currently
being compiled. Previously, non-exported methods on anonymous types
received different method symbols depending on whether the method was
local or imported.

2) Symbols for method values parenthesized non-pointer receiver types
and non-exported method names, and also always package-qualified
non-exported method names. Now they use the same rules as normal
method symbols.

The methodSym function is also now stricter about rejecting
non-sensical method/receiver combinations. Notably, this means that
typecheckfunc needs to call addmethod to validate the method before
calling declare, which also means we no longer emit errors about
redeclaring bogus methods.

Change-Id: I9501c7a53dd70ef60e5c74603974e5ecc06e2003
Reviewed-on: https://go-review.googlesource.com/104876
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-04-05 22:01:17 +00:00
Daniel Martí
9767727353 test: skip locklinear's lockmany test for now
Since it's been reliably failing on one of the linux-arm builders
(arm5spacemonkey) for a long time.

Updates #24221.

Change-Id: I8fccc7e16631de497ccc2c285e510a110a93ad95
Reviewed-on: https://go-review.googlesource.com/104535
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-05 10:53:40 +00:00
Alberto Donizetti
f2abca90a2 test/codegen: port arm64 byte slice zeroing tests
And delete them from asm_test.

Change-Id: Id533130470da9176a401cb94972f626f43a62148
Reviewed-on: https://go-review.googlesource.com/103656
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-04-04 13:18:15 +00:00
Robert Griesemer
4637699e92 cmd/compile/internal/syntax: better error message for incorrect if/switch header
Fixes #23664.

Change-Id: Ic0637e9f896b2fc6502dfbab2d1c4de3c62c0bd2
Reviewed-on: https://go-review.googlesource.com/104616
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Robert Griesemer <gri@golang.org>
2018-04-03 21:57:37 +00:00
Giovanni Bajo
ac43de3ae5 cmd/compile: in prove, complete support for OpIsInBounds/OpIsSliceInBounds
The logic in addBranchRestrictions didn't allow to correctly
model OpIs(Slice)Bound for signed domain, and it was also partly
implemented within addRestrictions.

Thanks to the previous changes, it is now possible to handle
the negative conditions correctly, so that we can learn
both signed/LT + unsigned/LT on the positive side, and
signed/GE + unsigned/GE on the negative side (but only if
the index can be proved to be non-negative).

This is able to prove ~50 more slice accesses in std+cmd.

Change-Id: I9858080dc03b16f85993a55983dbc4b00f8491b0
Reviewed-on: https://go-review.googlesource.com/104037
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-04-03 20:25:34 +00:00
isharipo
dcaf3fb134 cmd/compile: make DCE remove nodes after terminating if
This change makes compiler frontend dead code elimination of const expr if
statements introduced in https://golang.org/cl/38773 treat both
	if constCondTrue { ...; returnStmt } toBeRemoved...
	if constCondFalse { ...; } else { returnStmt } toBeRemoved...
identically to:
	if constCondTrue { ...; returnStmt } else { toBeRemoved... }

Where "constCondTrue" is a an expression that can be evaluated
to "true" during compile time.

The additional checks are only triggered for const expr
if conditions that evaluate to true.

name       old time/op       new time/op       delta
Template         431ms ± 2%        429ms ± 1%    ~     (p=0.491 n=8+6)
Unicode          198ms ± 4%        201ms ± 2%    ~     (p=0.234 n=7+6)
GoTypes          1.40s ± 1%        1.41s ± 2%    ~     (p=0.053 n=7+7)
Compiler         6.72s ± 2%        6.81s ± 1%  +1.35%  (p=0.011 n=7+7)
SSA              17.3s ± 1%        17.3s ± 2%    ~     (p=0.731 n=6+7)
Flate            275ms ± 2%        275ms ± 2%    ~     (p=0.902 n=7+7)
GoParser         340ms ± 2%        339ms ± 2%    ~     (p=0.902 n=7+7)
Reflect          910ms ± 2%        905ms ± 1%    ~     (p=0.310 n=6+6)
Tar              403ms ± 1%        403ms ± 2%    ~     (p=0.366 n=7+6)
XML              486ms ± 1%        490ms ± 1%    ~     (p=0.065 n=6+6)
StdCmd           56.2s ± 1%        56.6s ± 2%    ~     (p=0.620 n=7+7)

name       old user-time/op  new user-time/op  delta
Template         559ms ± 8%        557ms ± 7%    ~     (p=0.713 n=8+7)
Unicode          266ms ±13%        277ms ± 9%    ~     (p=0.157 n=8+7)
GoTypes          1.83s ± 2%        1.84s ± 1%    ~     (p=0.522 n=8+7)
Compiler         8.67s ± 4%        8.89s ± 4%    ~     (p=0.077 n=7+7)
SSA              23.9s ± 1%        24.2s ± 1%  +1.31%  (p=0.005 n=7+7)
Flate            351ms ± 4%        342ms ± 5%    ~     (p=0.105 n=7+7)
GoParser         437ms ± 2%        423ms ± 5%  -3.14%  (p=0.016 n=7+7)
Reflect          1.16s ± 3%        1.15s ± 2%    ~     (p=0.362 n=7+7)
Tar              517ms ± 4%        511ms ± 3%    ~     (p=0.538 n=7+7)
XML              619ms ± 3%        617ms ± 4%    ~     (p=0.483 n=7+7)

Fixes #23521

Change-Id: I165a7827d869aeb93ce6047d026ff873d039a4f3
Reviewed-on: https://go-review.googlesource.com/91056
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-04-03 15:19:41 +00:00
Robert Griesemer
c65a2781be cmd/compile: better handling of incorrect type switches
Don't report errors if we don't have a correct type switch
guard; instead ignore it and leave it to the type-checker
to report the error. This leads to better error messages
concentrating on the type switch guard rather than errors
around (confusing) syntactic details.

Also clean up some code setting up AssertExpr (they never
have a nil Type field) and remove some incorrect TODOs.

Fixes #24470.

Change-Id: I69512f36e0417e3b5ea9c8856768e04b19d654a8
Reviewed-on: https://go-review.googlesource.com/103615
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-04-03 05:34:20 +00:00
isharipo
b44e73eacb test/fixedbugs: fix bug248 and bug345
When test/run script was removed, these two tests
were changed to be executed by test/run.go.
Because errchk does not exit with non-zero status on
errors, they were silently failing for a while.

This change makes 2 things:

1. Compile tested packages in GOROOT/test to match older runner script
   behavior (strictly required only in bug345, optional in bug248)

2. Check command output with "(?m)^BUG" regexp.
   It approximates older `grep -q '^BUG' that was used before.

See referenced issue for detailed explanation.

Fixes #24629

Change-Id: Ie888dcdb4e25cdbb19d434bbc5cb03eb633e9ee8
Reviewed-on: https://go-review.googlesource.com/104095
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-02 20:08:27 +00:00
Bryan Chan
625f2dccd4 cmd/compile/internal/ssa: handle symbol address comparisons consistently
CL 38338 introduced SSA rules to optimize two types of pointer equality
tests: a pointer compared with itself, and comparison of addresses taken
of two symbols which may have the same base. This patch adds rules to
apply the same optimization to pointer inequality tests, which also ensures
that two pointers to zero-width types cannot be both equal and unequal
at the same time.

Fixes #24503.

Change-Id: Ic828aeb86ae2e680caf66c35f4c247674768a9ba
Reviewed-on: https://go-review.googlesource.com/102275
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-31 21:37:13 +00:00
Alberto Donizetti
3b0b8bcd68 test/codegen: port stack-related tests to codegen
And delete them from asm_test.

Change-Id: Idfe1249052d82d15b9c30b292c78656a0bf7b48d
Reviewed-on: https://go-review.googlesource.com/103315
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-30 08:08:06 +00:00
Alberto Donizetti
56eaf574a1 test/codegen: match 387 ops too for GOARCH=386
Change-Id: I99407e27e340689009af798989b33cef7cb92070
Reviewed-on: https://go-review.googlesource.com/103376
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-29 20:05:40 +00:00
Alberto Donizetti
aacf7a1846 test: avoid touching GOOS/GOARCH in codegen driver
This change modifies the codegen test harness driver so that it no
longer modifies the environment GOOS/GOARCH, since that seems to cause
flakiness in other concurrently-running tests.

The change also enables the codegen tests in run.go.

Fixes #24538

Change-Id: I997ac1eb38eb92054efff67fe5c4d3cccc86412b
Reviewed-on: https://go-review.googlesource.com/103455
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-29 19:00:10 +00:00
Alberto Donizetti
04e993f953 test: update list of escape reasons
The escape_because.go test file (which tests the "because" escape
explainations printed by `-m -m`) cointains a machine-generated list
of all the escape reasons seen in the escape tests.

The list appears to be outdated; moreove a new escape reason was added
in CL 102895. This change re-generates the list.

Change-Id: Idc721c6bbfe9516895b5cf1e6d09b77deda5a3dd
Reviewed-on: https://go-review.googlesource.com/103375
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-29 14:17:40 +00:00
Alberto Donizetti
360c19157a cmd/compile: print accurate escape reason for non-const-length slices
This change makes `-m -m` print a better explanation for the case
where a slice is marked as escaping and heap-allocated because it
has a non-constant len/cap.

Fixes #24578

Change-Id: I0ebafb77c758a99857d72b365817bdba7b446cc0
Reviewed-on: https://go-review.googlesource.com/102895
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-03-28 16:56:03 +00:00
Matthew Dempsky
7b177b1a03 cmd/compile: fix method set computation for shadowed methods
In expandmeth, we call expand1/expand0 to build a list of all
candidate methods to promote, and then we use dotpath to prune down
which names actually resolve to a promoted method and how.

However, previously we still computed "followsptr" based on the
expand1/expand0 traversal (which is depth-first), rather than
dotpath (which is breadth-first). The result is that we could
sometimes end up miscomputing whether a particular promoted method
involves a pointer traversal, which could result in bad code
generation for method trampolines.

Fixes #24547.

Change-Id: I57dc014466d81c165b05d78b98610dc3765b7a90
Reviewed-on: https://go-review.googlesource.com/102618
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-03-27 18:56:36 +00:00
Alberto Donizetti
a27cd4fd31 test/codegen: port tbz/tbnz arm64 tests
And delete them from asm_test.

Change-Id: I34fcf85ae8ce09cd146fe4ce6a0ae7616bd97e2d
Reviewed-on: https://go-review.googlesource.com/102296
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-03-24 09:35:53 +00:00
Giovanni Bajo
79112707bb cmd/compile: add patterns for bit set/clear/complement on amd64
This patch completes implementation of BT(Q|L), and adds support
for BT(S|R|C)(Q|L).

Example of code changes from time.(*Time).addSec:

        if t.wall&hasMonotonic != 0 {
  0x1073465               488b08                  MOVQ 0(AX), CX
  0x1073468               4889ca                  MOVQ CX, DX
  0x107346b               48c1e93f                SHRQ $0x3f, CX
  0x107346f               48c1e13f                SHLQ $0x3f, CX
  0x1073473               48f7c1ffffffff          TESTQ $-0x1, CX
  0x107347a               746b                    JE 0x10734e7

        if t.wall&hasMonotonic != 0 {
  0x1073435               488b08                  MOVQ 0(AX), CX
  0x1073438               480fbae13f              BTQ $0x3f, CX
  0x107343d               7363                    JAE 0x10734a2

Another example:

                        t.wall = t.wall&nsecMask | uint64(dsec)<<nsecShift | hasMonotonic
  0x10734c8               4881e1ffffff3f          ANDQ $0x3fffffff, CX
  0x10734cf               48c1e61e                SHLQ $0x1e, SI
  0x10734d3               4809ce                  ORQ CX, SI
  0x10734d6               48b90000000000000080    MOVQ $0x8000000000000000, CX
  0x10734e0               4809f1                  ORQ SI, CX
  0x10734e3               488908                  MOVQ CX, 0(AX)

                        t.wall = t.wall&nsecMask | uint64(dsec)<<nsecShift | hasMonotonic
  0x107348b		4881e2ffffff3f		ANDQ $0x3fffffff, DX
  0x1073492		48c1e61e		SHLQ $0x1e, SI
  0x1073496		4809f2			ORQ SI, DX
  0x1073499		480fbaea3f		BTSQ $0x3f, DX
  0x107349e		488910			MOVQ DX, 0(AX)

Go1 benchmarks seem unaffected, and I would be surprised
otherwise:

name                     old time/op    new time/op     delta
BinaryTree17-4              2.64s ± 4%      2.56s ± 9%  -2.92%  (p=0.008 n=9+9)
Fannkuch11-4                2.90s ± 1%      2.95s ± 3%  +1.76%  (p=0.010 n=10+9)
FmtFprintfEmpty-4          35.3ns ± 1%     34.5ns ± 2%  -2.34%  (p=0.004 n=9+8)
FmtFprintfString-4         57.0ns ± 1%     58.4ns ± 5%  +2.52%  (p=0.029 n=9+10)
FmtFprintfInt-4            59.8ns ± 3%     59.8ns ± 6%    ~     (p=0.565 n=10+10)
FmtFprintfIntInt-4         93.9ns ± 3%     91.2ns ± 5%  -2.94%  (p=0.014 n=10+9)
FmtFprintfPrefixedInt-4     107ns ± 6%      104ns ± 6%    ~     (p=0.099 n=10+10)
FmtFprintfFloat-4           187ns ± 3%      188ns ± 3%    ~     (p=0.505 n=10+9)
FmtManyArgs-4               410ns ± 1%      415ns ± 6%    ~     (p=0.649 n=8+10)
GobDecode-4                5.30ms ± 3%     5.27ms ± 3%    ~     (p=0.436 n=10+10)
GobEncode-4                4.62ms ± 5%     4.47ms ± 2%  -3.24%  (p=0.001 n=9+10)
Gzip-4                      197ms ± 4%      193ms ± 3%    ~     (p=0.123 n=10+10)
Gunzip-4                   30.4ms ± 3%     30.1ms ± 3%    ~     (p=0.481 n=10+10)
HTTPClientServer-4         76.3µs ± 1%     76.0µs ± 1%    ~     (p=0.236 n=8+9)
JSONEncode-4               10.5ms ± 9%     10.3ms ± 3%    ~     (p=0.280 n=10+10)
JSONDecode-4               42.3ms ±10%     41.3ms ± 2%    ~     (p=0.053 n=9+10)
Mandelbrot200-4            3.80ms ± 2%     3.72ms ± 2%  -2.15%  (p=0.001 n=9+10)
GoParse-4                  2.88ms ±10%     2.81ms ± 2%    ~     (p=0.247 n=10+10)
RegexpMatchEasy0_32-4      69.5ns ± 4%     68.6ns ± 2%    ~     (p=0.171 n=10+10)
RegexpMatchEasy0_1K-4       165ns ± 3%      162ns ± 3%    ~     (p=0.137 n=10+10)
RegexpMatchEasy1_32-4      65.7ns ± 6%     64.4ns ± 2%  -2.02%  (p=0.037 n=10+10)
RegexpMatchEasy1_1K-4       278ns ± 2%      279ns ± 3%    ~     (p=0.991 n=8+9)
RegexpMatchMedium_32-4     99.3ns ± 3%     98.5ns ± 4%    ~     (p=0.457 n=10+9)
RegexpMatchMedium_1K-4     30.1µs ± 1%     30.4µs ± 2%    ~     (p=0.173 n=8+10)
RegexpMatchHard_32-4       1.40µs ± 2%     1.41µs ± 4%    ~     (p=0.565 n=10+10)
RegexpMatchHard_1K-4       42.5µs ± 1%     41.5µs ± 3%  -2.13%  (p=0.002 n=8+9)
Revcomp-4                   332ms ± 4%      328ms ± 5%    ~     (p=0.720 n=9+10)
Template-4                 48.3ms ± 2%     49.6ms ± 3%  +2.56%  (p=0.002 n=8+10)
TimeParse-4                 252ns ± 2%      249ns ± 3%    ~     (p=0.116 n=9+10)
TimeFormat-4                262ns ± 4%      252ns ± 3%  -4.01%  (p=0.000 n=9+10)

name                     old speed      new speed       delta
GobDecode-4               145MB/s ± 3%    146MB/s ± 3%    ~     (p=0.436 n=10+10)
GobEncode-4               166MB/s ± 5%    172MB/s ± 2%  +3.28%  (p=0.001 n=9+10)
Gzip-4                   98.6MB/s ± 4%  100.4MB/s ± 3%    ~     (p=0.123 n=10+10)
Gunzip-4                  639MB/s ± 3%    645MB/s ± 3%    ~     (p=0.481 n=10+10)
JSONEncode-4              185MB/s ± 8%    189MB/s ± 3%    ~     (p=0.280 n=10+10)
JSONDecode-4             46.0MB/s ± 9%   47.0MB/s ± 2%  +2.21%  (p=0.046 n=9+10)
GoParse-4                20.1MB/s ± 9%   20.6MB/s ± 2%    ~     (p=0.239 n=10+10)
RegexpMatchEasy0_32-4     460MB/s ± 4%    467MB/s ± 2%    ~     (p=0.165 n=10+10)
RegexpMatchEasy0_1K-4    6.19GB/s ± 3%   6.28GB/s ± 3%    ~     (p=0.165 n=10+10)
RegexpMatchEasy1_32-4     487MB/s ± 5%    497MB/s ± 2%  +2.00%  (p=0.043 n=10+10)
RegexpMatchEasy1_1K-4    3.67GB/s ± 2%   3.67GB/s ± 3%    ~     (p=0.963 n=8+9)
RegexpMatchMedium_32-4   10.1MB/s ± 3%   10.1MB/s ± 4%    ~     (p=0.435 n=10+9)
RegexpMatchMedium_1K-4   34.0MB/s ± 1%   33.7MB/s ± 2%    ~     (p=0.173 n=8+10)
RegexpMatchHard_32-4     22.9MB/s ± 2%   22.7MB/s ± 4%    ~     (p=0.565 n=10+10)
RegexpMatchHard_1K-4     24.0MB/s ± 3%   24.7MB/s ± 3%  +2.64%  (p=0.001 n=9+9)
Revcomp-4                 766MB/s ± 4%    775MB/s ± 5%    ~     (p=0.720 n=9+10)
Template-4               40.2MB/s ± 2%   39.2MB/s ± 3%  -2.47%  (p=0.002 n=8+10)

The rules match ~1800 times during all.bash.

Fixes #18943

Change-Id: I64be1ada34e89c486dfd935bf429b35652117ed4
Reviewed-on: https://go-review.googlesource.com/94766
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-24 02:38:50 +00:00
Alberto Donizetti
fc6280d4b0 test/codegen: port direct comparisons with memory tests
And remove them from asm_test.

Change-Id: I1ca29b40546d6de06f20bfd550ed8ff87f495454
Reviewed-on: https://go-review.googlesource.com/102115
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-22 17:20:09 +00:00
Alberto Donizetti
be371edd67 test/codegen: port comparisons tests to codegen
And delete them from asm_test.

Change-Id: I64c512bfef3b3da6db5c5d29277675dade28b8ab
Reviewed-on: https://go-review.googlesource.com/101595
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-03-20 19:38:06 +00:00