These instructions are actually 5 argument opcodes as specified
by the ISA. Prior to this patch, the MB and ME arguments were
merged into a single bitmask operand to workaround the limitations
of the ppc64 assembler backend.
This limitation no longer exists. Thus, we can pass operands for
these opcodes without having to merge the MB and ME arguments in
the assembler frontend or compiler backend.
Likewise, support for 4 operand variants is unchanged.
Change-Id: Ib086774f3581edeaadfd2190d652aaaa8a90daeb
Reviewed-on: https://go-review.googlesource.com/c/go/+/298750
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org>
Trust: Carlos Eduardo Seo <carlos.seo@linaro.org>
Optimize combinations of left and right shifts by a constant value
into a 'rotate then insert selected bits [into zero]' instruction.
Use the same instruction for contiguous masks since it has some
benefits over 'and immediate' (not restricted to 32-bits, does not
overwrite source register).
To keep the complexity of this change under control I've only
implemented 64 bit operations for now.
There are a lot more optimizations that can be done with this
instruction family. However, since their function overlaps with other
instructions we need to be somewhat careful not to break existing
optimization rules by creating optimization dead ends. This is
particularly true of the load/store merging rules which contain lots
of zero extensions and shifts.
This CL does interfere with the store merging rules when an operand
is shifted left before it is stored:
binary.BigEndian.PutUint64(b, x << 1)
This is unfortunate but it's not critical and somewhat complex so
I plan to fix that in a follow up CL.
file before after Δ %
addr2line 4117446 4117282 -164 -0.004%
api 4945184 4942752 -2432 -0.049%
asm 4998079 4991891 -6188 -0.124%
buildid 2685158 2684074 -1084 -0.040%
cgo 4553732 4553394 -338 -0.007%
compile 19294446 19245070 -49376 -0.256%
cover 4897105 4891319 -5786 -0.118%
dist 3544389 3542785 -1604 -0.045%
doc 3926795 3927617 +822 +0.021%
fix 3302958 3293868 -9090 -0.275%
link 6546274 6543456 -2818 -0.043%
nm 4102021 4100825 -1196 -0.029%
objdump 4542431 4548483 +6052 +0.133%
pack 2482465 2416389 -66076 -2.662%
pprof 13366541 13363915 -2626 -0.020%
test2json 2829007 2761515 -67492 -2.386%
trace 10216164 10219684 +3520 +0.034%
vet 6773956 6773572 -384 -0.006%
total 107124151 106917891 -206260 -0.193%
Change-Id: I7591cce41e06867ba10a745daae9333513062746
Reviewed-on: https://go-review.googlesource.com/c/go/+/233317
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Michael Munday <mike.munday@ibm.com>
Combine (AND m (SRWconst x)) or (SRWconst (AND m x)) when mask m is
and the shift value produce constant which can be encoded into an
RLWINM instruction.
Combine (CLRLSLDI (SRWconst x)) if the combining of the underling rotate
masks produces a constant which can be encoded into RLWINM.
Likewise for (SLDconst (SRWconst x)) and (CLRLSDI (RLWINM x)).
Combine rotate word + and operations which can be encoded as a single
RLWINM/RLWNM instruction.
The most notable performance improvements arise from the crypto
benchmarks below (GOARCH=power8 on a ppc64le/linux):
pkg:golang.org/x/crypto/blowfish goos:linux goarch:ppc64le
ExpandKeyWithSalt 52.2µs ± 0% 47.5µs ± 0% -8.88%
ExpandKey 44.4µs ± 0% 40.3µs ± 0% -9.15%
pkg:golang.org/x/crypto/ssh/internal/bcrypt_pbkdf goos:linux goarch:ppc64le
Key 57.6ms ± 0% 52.3ms ± 0% -9.13%
pkg:golang.org/x/crypto/bcrypt goos:linux goarch:ppc64le
Equal 90.9ms ± 0% 82.6ms ± 0% -9.13%
DefaultCost 91.0ms ± 0% 82.7ms ± 0% -9.12%
Change-Id: I59a0ca29face38f4ab46e37124c32906f216c4ce
Reviewed-on: https://go-review.googlesource.com/c/go/+/260798
Run-TryBot: Carlos Eduardo Seo <carlos.seo@linaro.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.com>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
Initialization of t.UInt is missing from SetTypPtrs in config.go,
preventing rules that use it from matching when they should.
This adds the initialization to allow those rules to work.
Updated test/codegen/rotate.go to test for this case, which
appears in math/bits RotateLeft32 and RotateLeft64. There had been
a testcase for this in go 1.10 but that went away when asm_test.go
was removed.
Change-Id: I82fc825ad8364df6fc36a69a1e448214d2e24ed5
Reviewed-on: https://go-review.googlesource.com/112518
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Remove old tests from asm_test.
Change-Id: Ib408ec7faa60068bddecf709b93ce308e0ef665a
Reviewed-on: https://go-review.googlesource.com/100075
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
asmcheck comments now support a compact form of specifying
multiple checks for each platform, using the following syntax:
amd64:"SHL\t[$]4","SHR\t[$]4"
Negative checks are also parsed using the following syntax:
amd64:-"ROR"
though they are still not working.
Moreover, out-of-line comments have been implemented. This
allows to specify asmchecks on comment-only lines, that will
be matched on the first subsequent non-comment non-empty line.
// amd64:"XOR"
// arm:"EOR"
x ^= 1
Change-Id: I110c7462fc6a5c70fd4af0d42f516016ae7f2760
Reviewed-on: https://go-review.googlesource.com/97816
Reviewed-by: Keith Randall <khr@golang.org>
The top-level test harness is modified to support a new kind
of test: "asmcheck". This is meant to replace asm_test.go
as an easier and more readable way to test code generation.
I've added a couple of codegen tests to get initial feedback
on the syntax. I've created them under a common "codegen"
subdirectory, so that it's easier to run them all with
"go run run.go -v codegen".
The asmcheck syntax allows to insert line comments that
can specify a regular expression to match in the assembly code,
for multiple architectures (the testsuite will automatically
build each testfile multiple times, one per mentioned architecture).
Negative matches are unsupported for now, so this cannot fully
replace asm_test yet.
Change-Id: Ifdbba389f01d55e63e73c99e5f5449e642101d55
Reviewed-on: https://go-review.googlesource.com/97355
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>