Unroll s == "ab" to
len(s) == 2 && s[0] == 'a' && s[1] == 'b'
This generates faster and shorter code
by avoiding a runtime call.
Do something similar for !=.
The cutoff length is 6. This was chosen empirically
by examining binary sizes on arm, arm64, 386, and amd64
using the SSA backend.
For all architectures examined, 4, 5, and 6 were
the ideal cutoff, with identical binary sizes.
The distribution of constant string equality sizes
during 'go build -a std' is:
40.81% 622 len 0
14.11% 215 len 4
9.45% 144 len 1
7.81% 119 len 3
7.48% 114 len 5
5.12% 78 len 7
4.13% 63 len 2
3.54% 54 len 8
2.69% 41 len 6
1.18% 18 len 10
0.85% 13 len 9
0.66% 10 len 14
0.59% 9 len 17
0.46% 7 len 11
0.26% 4 len 12
0.20% 3 len 19
0.13% 2 len 13
0.13% 2 len 15
0.13% 2 len 16
0.07% 1 len 20
0.07% 1 len 23
0.07% 1 len 33
0.07% 1 len 36
A cutoff of length 6 covers most of the cases.
Benchmarks on amd64 comparing a string to a constant of length 3:
Cmp/1same-8 4.78ns ± 6% 0.94ns ± 9% -80.26% (p=0.000 n=20+20)
Cmp/1diffbytes-8 6.43ns ± 6% 0.96ns ±11% -85.13% (p=0.000 n=20+20)
Cmp/3same-8 4.71ns ± 5% 1.28ns ± 5% -72.90% (p=0.000 n=20+20)
Cmp/3difffirstbyte-8 6.33ns ± 7% 1.27ns ± 7% -79.90% (p=0.000 n=20+20)
Cmp/3difflastbyte-8 6.34ns ± 8% 1.26ns ± 9% -80.13% (p=0.000 n=20+20)
The change to the prove test preserves the
existing intent of the test. When the string was
short, there was a new "proved in bounds" report
that referred to individual byte comparisons.
Change-Id: I593ac303b0d11f275672090c5c786ea0c6b8da13
Reviewed-on: https://go-review.googlesource.com/26758
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
staticassign unwraps all CONVNOPs.
However, in the included test, we need the
CONVNOP for everything to typecheck.
Stop unwrapping unnecessarily.
The code we generate for this example is
suboptimal, but that's not new; see #17113.
Fixes#17111.
Change-Id: I29532787a074a6fe19a5cc53271eb9c84bf1b576
Reviewed-on: https://go-review.googlesource.com/29213
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Get rid of BlockCheck. Josh goaded me into it, and I went
down a rabbithole making it happen.
NilCheck now panics if the pointer is nil and returns void, as before.
BlockCheck is gone, and NilCheck is no longer a Control value for
any block. It just exists (and deadcode knows not to throw it away).
I rewrote the nilcheckelim pass to handle this case. In particular,
there can now be multiple NilCheck ops per block.
I moved all of the arch-dependent nil check elimination done as
part of ssaGenValue into its own proper pass, so we don't have to
duplicate that code for every architecture.
Making the arch-dependent nil check its own pass means I needed
to add a bunch of flags to the opcode table so I could write
the code without arch-dependent ops everywhere.
Change-Id: I419f891ac9b0de313033ff09115c374163416a9f
Reviewed-on: https://go-review.googlesource.com/29120
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
The change corrects the values of the largest float32 value (f1) and the
value of the halfway point between f1 and the next, overflow value (f2).
Fixes#17012
Change-Id: Idaf9997b69d61fafbffdb980d751c9857732e14d
Reviewed-on: https://go-review.googlesource.com/29171
Reviewed-by: Robert Griesemer <gri@golang.org>
CL 28978 (6ec993a) accidentally disabled the test (it would only
run if amd64 AND s390x, whereas it should be amd64 OR s390x).
Change-Id: I23c1ad71724ff55f5808d5896b19b62c8ec5af76
Reviewed-on: https://go-review.googlesource.com/28981
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The new SSA backend modifies the ABI slightly: R0 is now a usable
general purpose register.
Fixes#16677.
Change-Id: I367435ce921e0c7e79e021c80cf8ef5d1d1466cf
Reviewed-on: https://go-review.googlesource.com/28978
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Atomic ops on ARM are implemented with kernel calls, so they are
not intrinsified.
Change-Id: I0e7cc2e5526ae1a3d24b4b89be1bd13db071f8ef
Reviewed-on: https://go-review.googlesource.com/28977
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
When possible, emit static data rather than
init functions for interface values.
This:
* cuts 32k off cmd/go
* removes several error values from runtime init
* cuts the size of the image/color/palette compiled package from 103k to 34k
* reduces the time to build the package in #15520 from 8s to 1.5s
Fixes#6289Fixes#15528
Change-Id: I317112da17aadb180c958ea328ab380f83e640b4
Reviewed-on: https://go-review.googlesource.com/26668
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Redo of CL 28575 with fixed test.
We're in a pre-KeepAlive world for a bit yet, the old tests
were in a client which was in a post-KeepAlive world.
Change-Id: I114fd630339d761ab3306d1d99718d3cb973678d
Reviewed-on: https://go-review.googlesource.com/28582
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reason for revert: broke the build due to cherrypick;
relies on an unsubmitted parent CL.
Original issue's description:
> cmd/compile: ignore contentEscapes for marking nodes as escaping
>
> We can still stack allocate and VarKill nodes which don't
> escape but their content does.
>
> Fixes#16996
>
> Change-Id: If8aa0fcf2c327b4cb880a3d5af8d213289e6f6bf
> Reviewed-on: https://go-review.googlesource.com/28575
> Run-TryBot: Keith Randall <khr@golang.org>
> TryBot-Result: Gobot Gobot <gobot@golang.org>
> Reviewed-by: David Chase <drchase@google.com>
>
Change-Id: Ie1a325209de14d70af6acb2d78269b7a0450da7a
Reviewed-on: https://go-review.googlesource.com/28578
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
We can still stack allocate and VarKill nodes which don't
escape but their content does.
Fixes#16996
Change-Id: If8aa0fcf2c327b4cb880a3d5af8d213289e6f6bf
Reviewed-on: https://go-review.googlesource.com/28575
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Includes test case shown to fail with unpatched compiler.
Fixes#17005.
Change-Id: I49b7b1a3f02736d85846a2588018b73f68d50320
Reviewed-on: https://go-review.googlesource.com/28573
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Enabled checks (except for DUFF-ops which aren't implemented yet).
Added ppc64le to relevant test.
Also updated register list to reflect no-longer-reserved-
for-constants status (file was missed in that change).
Updates #16010.
Change-Id: I31b1aac19e14994f760f2ecd02edbeb1f78362e7
Reviewed-on: https://go-review.googlesource.com/28548
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
It should alias to Xchg instead of Swap. Found when testing #16985.
Change-Id: If9fd734a1f89b8b2656f421eb31b9d1b0d95a49f
Reviewed-on: https://go-review.googlesource.com/28512
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
It was fixed earlier in the Go 1.8 cycle.
Add a test.
Fixes#15895
Change-Id: I5834831235d99b9fcf21b435932cdd7ac6dc2c6e
Reviewed-on: https://go-review.googlesource.com/28476
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
make(T, n, m) returns a slice of type T with length n and capacity m
where "The size arguments n and m must be of integer type or untyped."
https://tip.golang.org/ref/spec#Making_slices_maps_and_channels
The failure to reject typed non-integer size arguments in make
during compile time was uncovered after https://golang.org/cl/27851
changed the generation of makeslice calls.
Fixes #16940
Updates #16949
Change-Id: Ib1e3576f0e6ad199c9b16b7a50c2db81290c63b4
Reviewed-on: https://go-review.googlesource.com/28301
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Intrinsified atomic op produces <value,memory>. Make sure this
memory is considered in the store chain calculation.
Fixes#16948.
Change-Id: I029f164b123a7e830214297f8373f06ea0bf1e26
Reviewed-on: https://go-review.googlesource.com/28350
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Generate a for loop for ranging over strings that only needs to call
the runtime function charntorune for non ASCII characters.
This provides faster iteration over ASCII characters and slightly
faster iteration for other characters.
The runtime function charntorune is changed to take an index from where
to start decoding and returns the index after the last byte belonging
to the decoded rune.
All call sites of charntorune in the runtime are replaced by a for loop
that will be transformed by the compiler instead of calling the charntorune
function directly.
go binary size decreases by 80 bytes.
godoc binary size increases by around 4 kilobytes.
runtime:
name old time/op new time/op delta
RuneIterate/range/ASCII-4 43.7ns ± 3% 10.3ns ± 4% -76.33% (p=0.000 n=44+45)
RuneIterate/range/Japanese-4 72.5ns ± 2% 62.8ns ± 2% -13.41% (p=0.000 n=49+50)
RuneIterate/range1/ASCII-4 43.5ns ± 2% 10.4ns ± 3% -76.18% (p=0.000 n=50+50)
RuneIterate/range1/Japanese-4 72.5ns ± 2% 62.9ns ± 2% -13.26% (p=0.000 n=50+49)
RuneIterate/range2/ASCII-4 43.5ns ± 3% 10.3ns ± 2% -76.22% (p=0.000 n=48+47)
RuneIterate/range2/Japanese-4 72.4ns ± 2% 62.7ns ± 2% -13.47% (p=0.000 n=50+50)
strings:
name old time/op new time/op delta
IndexRune-4 64.7ns ± 5% 22.4ns ± 3% -65.43% (p=0.000 n=25+21)
MapNoChanges-4 269ns ± 2% 157ns ± 2% -41.46% (p=0.000 n=23+24)
Fields-4 23.0ms ± 2% 19.7ms ± 2% -14.35% (p=0.000 n=25+25)
FieldsFunc-4 23.1ms ± 2% 19.6ms ± 2% -14.94% (p=0.000 n=25+24)
name old speed new speed delta
Fields-4 45.6MB/s ± 2% 53.2MB/s ± 2% +16.87% (p=0.000 n=24+25)
FieldsFunc-4 45.5MB/s ± 2% 53.5MB/s ± 2% +17.57% (p=0.000 n=25+24)
Updates #13162
Change-Id: I79ffaf828d82bf9887592f08e5cad883e9f39701
Reviewed-on: https://go-review.googlesource.com/27853
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Martin Möhrmann <martisch@uos.de>
Where possible generate calls to runtime makeslice with int arguments
during compile time instead of makeslice with int64 arguments.
This eliminates converting arguments for calls to makeslice with
int64 arguments for platforms where int64 values do not fit into
arguments of type int.
godoc 386 binary shrinks by approximately 12 kilobyte.
amd64:
name old time/op new time/op delta
MakeSlice-2 29.8ns ± 1% 29.8ns ± 1% ~ (p=1.000 n=24+24)
386:
name old time/op new time/op delta
MakeSlice-2 52.3ns ± 0% 45.9ns ± 0% -12.17% (p=0.000 n=25+22)
Fixes #15357
Change-Id: Icb8701bb63c5a83877d26c8a4b78e782ba76de7c
Reviewed-on: https://go-review.googlesource.com/27851
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Add the following optimizations:
- fold constants
- fold address into load/store
- simplify extensions and conditional branches
- remove nil checks
Turn on SSA on MIPS64 by default, and toggle the tests.
Fixes#16359.
Change-Id: I7f1e38c2509e22e42cd024e712990ebbe47176bd
Reviewed-on: https://go-review.googlesource.com/27870
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The compiler was canonicalizing unnamed types of the form
struct { i int }
across packages, even though an unexported field i should not be
accessible from other packages.
The fix requires both qualifying the field name in the string used by
the compiler to distinguish the type, and ensuring the struct's pkgpath
is set in the rtype version of the data when the type being written is
not part of the localpkg.
Fixes#16616
Change-Id: Ibab160b8b5936dfa47b17dbfd48964a65586785b
Reviewed-on: https://go-review.googlesource.com/27791
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
This CL reworks walkcompare for clarity and concision.
It also makes one significant functional change.
(The functional change is hard to separate cleanly
from the cleanup, so I just did them together.)
When inlining and unrolling an equality comparison
for a small struct or array, compare the elements like:
a[0] == b[0] && a[1] == b[1]
rather than
pa := &a
pb := &b
pa[0] == pb[0] && pa[1] == pb[1]
The result is the same, but taking the address
and working through the indirect
forces the backends to generate less efficient code.
This is only an improvement with the SSA backend.
However, every port but s390x now has a working
SSA backend, and switching to the SSA backend
by default everywhere is a priority for Go 1.8.
It thus seems reasonable to start to prioritize
SSA performance over the old backend.
Updates #15303
Sample code:
type T struct {
a, b int8
}
func g(a T) bool {
return a == T{1, 2}
}
SSA before:
"".g t=1 size=80 args=0x10 locals=0x8
0x0000 00000 (badeq.go:7) TEXT "".g(SB), $8-16
0x0000 00000 (badeq.go:7) SUBQ $8, SP
0x0004 00004 (badeq.go:7) FUNCDATA $0, gclocals·23e8278e2b69a3a75fa59b23c49ed6ad(SB)
0x0004 00004 (badeq.go:7) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0004 00004 (badeq.go:8) MOVBLZX "".a+16(FP), AX
0x0009 00009 (badeq.go:8) MOVB AL, "".autotmp_0+6(SP)
0x000d 00013 (badeq.go:8) MOVBLZX "".a+17(FP), AX
0x0012 00018 (badeq.go:8) MOVB AL, "".autotmp_0+7(SP)
0x0016 00022 (badeq.go:8) MOVB $0, "".autotmp_1+4(SP)
0x001b 00027 (badeq.go:8) MOVB $1, "".autotmp_1+4(SP)
0x0020 00032 (badeq.go:8) MOVB $2, "".autotmp_1+5(SP)
0x0025 00037 (badeq.go:8) MOVBLZX "".autotmp_0+6(SP), AX
0x002a 00042 (badeq.go:8) MOVBLZX "".autotmp_1+4(SP), CX
0x002f 00047 (badeq.go:8) CMPB AL, CL
0x0031 00049 (badeq.go:8) JNE 70
0x0033 00051 (badeq.go:8) MOVBLZX "".autotmp_0+7(SP), AX
0x0038 00056 (badeq.go:8) CMPB AL, $2
0x003a 00058 (badeq.go:8) SETEQ AL
0x003d 00061 (badeq.go:8) MOVB AL, "".~r1+24(FP)
0x0041 00065 (badeq.go:8) ADDQ $8, SP
0x0045 00069 (badeq.go:8) RET
0x0046 00070 (badeq.go:8) MOVB $0, AL
0x0048 00072 (badeq.go:8) JMP 61
SSA after:
"".g t=1 size=32 args=0x10 locals=0x0
0x0000 00000 (badeq.go:7) TEXT "".g(SB), $0-16
0x0000 00000 (badeq.go:7) NOP
0x0000 00000 (badeq.go:7) NOP
0x0000 00000 (badeq.go:7) FUNCDATA $0, gclocals·23e8278e2b69a3a75fa59b23c49ed6ad(SB)
0x0000 00000 (badeq.go:7) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (badeq.go:8) MOVBLZX "".a+8(FP), AX
0x0005 00005 (badeq.go:8) CMPB AL, $1
0x0007 00007 (badeq.go:8) JNE 25
0x0009 00009 (badeq.go:8) MOVBLZX "".a+9(FP), CX
0x000e 00014 (badeq.go:8) CMPB CL, $2
0x0011 00017 (badeq.go:8) SETEQ AL
0x0014 00020 (badeq.go:8) MOVB AL, "".~r1+16(FP)
0x0018 00024 (badeq.go:8) RET
0x0019 00025 (badeq.go:8) MOVB $0, AL
0x001b 00027 (badeq.go:8) JMP 20
Change-Id: I120185d58012b7bbcdb1ec01225b5b08d0855d86
Reviewed-on: https://go-review.googlesource.com/22277
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Now that we have ops that can return 2 results, have BSF return a result
and flags. We can then get rid of the redundant comparison and use CMOV
instead of CMOVconst ops.
Get rid of a bunch of the ops we don't use. Ctz{8,16}, plus all the Clzs,
and CMOVNEs. I don't think we'll ever use them, and they would be easy
to add back if needed.
Change-Id: I8858a1d017903474ea7e4002fc76a6a86e7bd487
Reviewed-on: https://go-review.googlesource.com/27630
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Blank struct fields are regular unexported fields. Two
blank fields are different if they are from different
packages. In order to correctly differentiate them, the
compiler needs the package information. Add it to the
export data.
Fixes#15514.
Change-Id: I421aaca22b542fcd0d66b2d2db777249cad78df6
Reviewed-on: https://go-review.googlesource.com/27639
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Change-Id: I4faf9a55414e217f0c48528efb13ab8fdcd9bb16
Reviewed-on: https://go-review.googlesource.com/24845
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Add a type conversion to uintptr for untyped constants
before the conversion to unsafe.Pointer.
Fixes#16317
Change-Id: Ib85feccad1019e687e7eb6135890b64b82fb87fb
Reviewed-on: https://go-review.googlesource.com/27441
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This is simpler than the sorting technique.
It also allows us to simplify or eliminate
some of the sorting decisions.
Most important, sorting will not work when case clauses
represent ranges of integers: There is no correct
sort order that allows overlap detection by comparing
neighbors. Using a map allows of a cheap, simple
approach to ranges, namely to insert every int
in the map. The equivalent approach for sorting
means juggling temporary Nodes for every int,
which is a lot more expensive.
Change-Id: I84df3cb805992a1b04d14e0e4b2334f943e0ce05
Reviewed-on: https://go-review.googlesource.com/26766
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This is a bit simpler than playing sorting games,
and it is clearer that it generates errors
in the correct (source) order.
It also allows us to simplify sorting.
It also prevents quadratic error messages for
(pathological) inputs with many duplicate type cases.
While we’re here, refactoring deduping into separate functions.
Negligible compilebench impact.
Fixes#15912.
Change-Id: I6cc19edd38875389a70ccbdbdf0d9b7d5ac5946f
Reviewed-on: https://go-review.googlesource.com/26762
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This makes a bunch of changes to package syntax to tweak line numbers
for AST nodes. For example, short variable declaration statements are
now associated with the location of the ":=" token, and function calls
are associated with the location of the final ")" token. These help
satisfy many unit tests that assume the old parser's behavior.
Because many of these changes are questionable, they're guarded behind
a new "gcCompat" const to make them easy to identify and revisit in
the future.
A handful of remaining tests are too difficult to make behave
identically. These have been updated to execute with -newparser=0 and
comments explaining why they need to be fixed.
all.bash now passes with both the old and new parsers.
Change-Id: Iab834b71ca8698d39269f261eb5c92a0d55a3bf4
Reviewed-on: https://go-review.googlesource.com/27199
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
This time with the cherry-pick from the proper patch of
the old CL.
Stack size increased.
Corrected NaN-comparison glitches.
Marked g register as clobbered by calls.
Fixed shared libraries.
live_ssa.go still disabled because of differences.
Presumably turning on more optimization will fix
both the stack size and the live_ssa.go glitches.
Enhanced debugging output for shared libs test.
Rebased onto master.
Updates #16010.
Change-Id: I40864faf1ef32c118fb141b7ef8e854498e6b2c4
Reviewed-on: https://go-review.googlesource.com/27159
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
In CSE if a tuple generator is CSE'd to a different block, its
selectors are copied to the same block. In this case, also CES
the copied selectors.
Test copied from Keith's CL 27202.
Fixes#16741.
Change-Id: I2fc8b9513d430f10d6104275cfff5fb75d3ef3d9
Reviewed-on: https://go-review.googlesource.com/27236
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
When T is a scalar, there are no runtime calls
required, which makes this a clear win.
encoding/binary:
WriteInts-8 958ns ± 3% 864ns ± 2% -9.80% (p=0.000 n=15+15)
This also considerably shrinks a core fmt
routine:
Before: "".(*pp).printArg t=1 size=3952 args=0x20 locals=0xf0
After: "".(*pp).printArg t=1 size=2624 args=0x20 locals=0x98
Unfortunately, I find it very hard to get stable
numbers out of the fmt benchmarks due to thermal scaling.
Change-Id: I1278006b030253bf8e48dc7631d18985cdaa143d
Reviewed-on: https://go-review.googlesource.com/26659
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Don't fold constant factors into a multiply
beyond the capacity of a MULQ instruction (32 bits).
Fixes#16733
Change-Id: Idc213c6cb06f7c94008a8cf9e60a9e77d085fd89
Reviewed-on: https://go-review.googlesource.com/27160
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Add more ARM64 optimizations:
- use hardware zero register when it is possible.
- use shifted ops.
The assembler supports shifted ops but not documented, nor knows
how to print it. This CL adds them.
- enable fast division.
This was disabled because it makes the old backend generate slower
code. But with SSA it generates faster code.
Turn on SSA by default, also adjust tests.
Change-Id: I7794479954c83bb65008dcb457bc1e21d7496da6
Reviewed-on: https://go-review.googlesource.com/26950
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Last part of the 386 SSA port.
Modify the x86 backend to simulate SSE registers and
instructions with 387 registers and instructions.
The simulation isn't terribly performant, but it works,
and the old implementation wasn't very performant either.
Leaving to people who care about 387 to optimize if they want.
Turn on SSA backend for 386 by default.
Fixes#16358
Change-Id: I678fb59132620b2c47e993c1c10c4c21135f70c0
Reviewed-on: https://go-review.googlesource.com/25271
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
It's not a new backend, just a PtrSize==4 modification
of the existing AMD64 backend.
Change-Id: Icc63521a5cf4ebb379f7430ef3f070894c09afda
Reviewed-on: https://go-review.googlesource.com/25586
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>