This change fixes the implementation of Data Cache instructions for
ppc64x, allowing non-zero hint field values.
Change-Id: I454aac9293d069a4817ee574d5809fa1799b3216
Reviewed-on: https://go-review.googlesource.com/68670
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Also add -V=full to print a unique identifier of the specific tool being invoked.
This will be used for content-based staleness.
Also sort and clean up a few of the flag doc comments.
Change-Id: I786fe50be0b8e5f77af809d8d2dab721185c2abd
Reviewed-on: https://go-review.googlesource.com/68590
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Adds last missing SSE4 instruction.
Also introduces additional ytab set 'yextractps'.
See https://golang.org/cl/57470 that adds other SSE4 instructions
but skips this one due to 'yextractps'.
To make EXTRACTPS less "sloppy", Yu2 oclass added to forbid
usage of invalid offset values in immediate operand.
Part of the mission to add missing amd64 SSE4 instructions to Go asm.
Change-Id: I0e67e3497054f53257dd8eb4c6268da5118b4853
Reviewed-on: https://go-review.googlesource.com/57490
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
Reviewed-by: Russ Cox <rsc@golang.org>
The following instructions were introduced in ARMv6, and the compiler
can do more optimization with them.
1. "MOVBS Rm@>i, Rd": rotates Rm 0/8/16/24 bits, does signed
byte extension to word, and writes the result to Rd.
2. "MOVHS Rm@>i, Rd": rotates Rm 0/8/16/24 bits, does signed
halfword extension to word, and writes the result to Rd.
3. "MOVBU Rm@>i, Rd": rotates Rm 0/8/16/24 bits, does unsigned
byte extension to word, and writes the result to Rd.
4. "MOVHU Rm@>i, Rd": rotates Rm 0/8/16/24 bits, does unsigned
half-word extension to word, and writes the result to Rd.
5. "XTAB Rm@>i, Rn, Rd": rotates Rm 0/8/16/24 bits, does signed
byte extension to word, adds Rn, and writes the result to Rd.
6. "XTAH Rm@>i, Rn, Rd": rotates Rm 0/8/16/24 bits, does signed
half-word extension to word, adds Rn, and writes the result to Rd.
7. "XTABU Rm@>i, Rn, Rd": rotates Rm 0/8/16/24 bits, does unsigned
byte extension to word, adds Rn, and writes the result to Rd.
8. "XTAHU Rm@>i, Rn, Rd": rotates Rm 0/8/16/24 bits, does unsigned
half-word extension to word, adds Rn, and writes the result to Rd.
Change-Id: I4306d7ebac93015d7e2e40d307f2c4271c03f466
Reviewed-on: https://go-review.googlesource.com/65790
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
These are the last instructions missing to complete SSE3 support.
For reference what was missing was found by a tool [1]:
$ x86db-gogen list --extension SSE3 --not-known
ADDSUBPD xmmreg,xmmrm [rm: 66 0f d0 /r] PRESCOTT,SSE3,SO
ADDSUBPS xmmreg,xmmrm [rm: f2 0f d0 /r] PRESCOTT,SSE3,SO
[1] https://github.com/dlespiau/x86dbFixes#20293
Change-Id: Ib5a91bf64dcc5282cdb60eae740ae52b4db16ebd
Reviewed-on: https://go-review.googlesource.com/42990
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
This change makes it easier to express instructions
with arbitrary number of operands.
Rationale: previous approach with operand "hiding" does
not scale well, AVX and especially AVX512 have many
instructions with 3+ operands.
x86 asm backend is updated to handle up to 6 explicit operands.
It also fixes issue with 4-th immediate operand type checks.
All `ytab` tables are updated accordingly.
Changes to non-x86 backends only include these patterns:
`p.From3 = X` => `p.SetFrom3(X)`
`p.From3.X = Y` => `p.GetFrom3().X = Y`
Over time, other backends can adapt Prog.RestArgs
and reduce the amount of workarounds.
-- Performance --
x/benchmark/build:
$ benchstat upstream.bench patched.bench
name old time/op new time/op delta
Build-48 21.7s ± 2% 21.8s ± 2% ~ (p=0.218 n=10+10)
name old binary-size new binary-size delta
Build-48 10.3M ± 0% 10.3M ± 0% ~ (all equal)
name old build-time/op new build-time/op delta
Build-48 21.7s ± 2% 21.8s ± 2% ~ (p=0.218 n=10+10)
name old build-peak-RSS-bytes new build-peak-RSS-bytes delta
Build-48 145MB ± 5% 148MB ± 5% ~ (p=0.218 n=10+10)
name old build-user+sys-time/op new build-user+sys-time/op delta
Build-48 21.0s ± 2% 21.2s ± 2% ~ (p=0.075 n=10+10)
Microbenchmark shows a slight slowdown.
name old time/op new time/op delta
AMD64asm-4 49.5ms ± 1% 49.9ms ± 1% +0.67% (p=0.001 n=23+15)
func BenchmarkAMD64asm(b *testing.B) {
for i := 0; i < b.N; i++ {
TestAMD64EndToEnd(nil)
TestAMD64Encoder(nil)
}
}
Change-Id: I4f1d37b5c2c966da3f2127705ccac9bff0038183
Reviewed-on: https://go-review.googlesource.com/63490
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
The functions Float64bits and Float64frombits perform
poorly on ppc64x because the int<->float conversions
often result in load and store sequences to handle the
type change. This patch adds more rules to recognize
those sequences and use register to register moves
and avoid unnecessary loads and stores where possible.
There were some existing rules to improve these conversions,
but this provides additional improvements. Included here:
- New instruction FCFIDS to improve on conversion to 32 bit
- Rename Xf2i64 and Xi2f64 as MTVSRD, MFVSRD, to match the asm
- Add rules to lower some of the load/store sequences for
- Added new go asm to ppc64.s testcase.
conversions
Improvements:
BenchmarkAbs-16 2.16 0.93 -56.94%
BenchmarkCopysign-16 2.66 1.18 -55.64%
BenchmarkRound-16 4.82 2.69 -44.19%
BenchmarkSignbit-16 1.71 1.14 -33.33%
BenchmarkFrexp-16 11.4 7.94 -30.35%
BenchmarkLogb-16 10.4 7.34 -29.42%
BenchmarkLdexp-16 15.7 11.2 -28.66%
BenchmarkIlogb-16 10.2 7.32 -28.24%
BenchmarkPowInt-16 69.6 55.9 -19.68%
BenchmarkModf-16 10.1 8.19 -18.91%
BenchmarkLog2-16 17.4 14.3 -17.82%
BenchmarkCbrt-16 45.0 37.3 -17.11%
BenchmarkAtanh-16 57.6 48.3 -16.15%
BenchmarkRemainder-16 76.6 65.4 -14.62%
BenchmarkGamma-16 26.0 22.5 -13.46%
BenchmarkPowFrac-16 197 174 -11.68%
BenchmarkMod-16 112 99.8 -10.89%
BenchmarkAsinh-16 59.9 53.7 -10.35%
BenchmarkAcosh-16 44.8 40.3 -10.04%
Updates #21390
Change-Id: I56cc991fc2e55249d69518d4e1ba76cc23904e35
Reviewed-on: https://go-review.googlesource.com/63290
Reviewed-by: Michael Munday <mike.munday@ibm.com>
This adds the VFMADD[213|231]SD, VFNMADD[213|231]SD,
VADDSD, VSUBSD instructions
This will allow us to write a fast path for exp_amd64.s where
these optimizations can be applied in a lot of places.
Change-Id: Ide292107ab887bd1e225a1ad60880235b5ed7c61
Reviewed-on: https://go-review.googlesource.com/61810
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
3rd change out of 3 to cover AMD64 SSSE3 instruction set in Go asm.
This commit adds instruction that do require new ytab variable.
Change-Id: I0bc7d9401c9176eb3760c3d59494ef082e97af84
Reviewed-on: https://go-review.googlesource.com/56870
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Russ Cox <rsc@golang.org>
This is the last instruction I found missing in SSE2 set.
It does not reuse 'yprefetch' ytabs due to differences in
operands SRC/DST roles:
- PREFETCHx: ModRM:r/m(r) -> FROM
- CLFLUSH: ModRM:r/m(w) -> TO
unaryDst map is extended accordingly.
Change-Id: I89e34ebb81cc0ee5f9ebbb1301bad417f7ee437f
Reviewed-on: https://go-review.googlesource.com/56833
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Russ Cox <rsc@golang.org>
instructions
1st change out of 3 to cover AMD64 SSSE3 instruction set in Go asm.
This commit adds instructions that do not require new named ytab sets.
Change-Id: I0c3dfd8d39c3daa8b7683ab163c63145626d042e
Reviewed-on: https://go-review.googlesource.com/56834
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Russ Cox <rsc@golang.org>
This change adds new ppc64 instructions from the POWER9 ISA. This includes
compares, loads, maths, register moves and the new random number generator and
copy/paste facilities.
Change-Id: Ife3720b90f5af184ff115bbcdcbce5c1302d39b6
Reviewed-on: https://go-review.googlesource.com/53930
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Most of these are return values that were part of a receiving parameter,
so they're still accessible.
A few others are not, but those have never had a use.
Found with github.com/mvdan/unparam, after Kevin Burke's suggestion that
the tool should also warn about unused result parameters.
Change-Id: Id8b5ed89912a99db22027703a88bd94d0b292b8b
Reviewed-on: https://go-review.googlesource.com/55910
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Went mainly for the ones that make no sense, such as the ones
mid-sentence or after commas.
Change-Id: Ie245d2c19cc7428a06295635cf6a9482ade25ff0
Reviewed-on: https://go-review.googlesource.com/57293
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The current code treats the type of SIMD&FP register as C_REG incorrectly.
The fix code converts C_REG type into C_FREG type.
Uncomment fcsels/fcseld test cases.
Fixes#21582
Change-Id: I754c51f72a0418bd352cbc0f7740f14cc599c72d
Reviewed-on: https://go-review.googlesource.com/58350
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The current code treats floating-point constant as integer
and does not treat fcmp/fcmpe as the comparison instrucitons
that requires special handling.
The fix corrects the type of immediate arguments and adds fcmp/fcmpe
in the special handing.
Uncomment the fcmp/fcmpe cases.
Fixes#21567
Change-Id: I6782520e2770f6ce70270b667dd5e68f71e2d5ad
Reviewed-on: https://go-review.googlesource.com/57852
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
The current code gets shift arguments value from prog.From3.Offset.
But prog.From3.Offset is not assigned the shift arguments value in
instructions assemble process.
The fix calls movcon() function to get the correct value.
Uncomment the movk/movkw cases.
Fixes#21398
Change-Id: I78d40c33c24bd4e3688a04622e4af7ddb5333fa6
Reviewed-on: https://go-review.googlesource.com/54990
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
BFX extracts given bits from the source register, sign extends them
to 32-bit, and writes to destination register. BFXU does the similar
operation with zero extention.
They were introduced in ARMv6T2.
Change-Id: I6822ebf663497a87a662d3645eddd7c611de2b1e
Reviewed-on: https://go-review.googlesource.com/56071
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Instructions added in https://golang.org/cl/18853
2nd change out of 3 to cover AMD64 SSSE3 instruction set in Go asm.
This commit does not actually add any new instructions, only
enables some test cases.
Change-Id: I9596435b31ee4c19460a51dd6cea4530aac9d198
Reviewed-on: https://go-review.googlesource.com/56835
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Instructions are implemented in the following revisions:
PCMPESTRI - https://golang.org/cl/22337
PHMINPOSUW - https://golang.org/cl/18853
It is unknown when x86test will be updated/re-run, but tests are useful
to check which x86 instructions are not yet supported.
As an example of tool that uses this information, there is Damien
Lespiau x86db.
Part of the mission to add missing amd64 SSE4 instructions to Go asm.
Change-Id: I512ff26040f47a0976b3e37000fb1f37eac5b762
Reviewed-on: https://go-review.googlesource.com/55830
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
The stxr/stxrw/stxrb/stxrh instructions belong to STLXR-like instructions
set and they require special handling. The current code has no special
handling for those instructions.
The fix adds the special handling for those instructions.
Uncomment stxr/stxrw/stxrb/stxrh test cases.
Fixes#21397
Change-Id: I31cee29dd6b30b1c25badd5c7574dda7a01bf016
Reviewed-on: https://go-review.googlesource.com/54951
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Wrong instructions "MOVW 8(F0), R1" and "MOVW R0<<0(F1), R1"
are silently accepted, and all Fx are treated as Rx.
The patch checks all those illegal base registers.
fixes#20724
Change-Id: I05d41bb43fe774b023205163b7daf4a846e9dc88
Reviewed-on: https://go-review.googlesource.com/46132
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
The current code calculates register number incorrectly.
The fix corrects the register number calculation.
Add cases created by decoder to test assembler.
Fixes#20697Fixes#20723
Change-Id: I73ac153df9ea9f51c43a5104828d7a5389551c92
Reviewed-on: https://go-review.googlesource.com/45850
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
"MULBB R1, R2, R3" is encoded to 0xe163f182, which should be
0xe1630182.
This patch fix it.
fix#20764
Change-Id: I9d3c3ffa40ecde86638e5e083eacc67578caebf4
Reviewed-on: https://go-review.googlesource.com/46491
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
"MOVBS.U R0<<0(R1), R2" is assembled to 0xe19120d0 (ldrsb r2, [r1, r0]),
but it is expected to be 0xe11120d0 (ldrsb r2, [r1, -r0]).
This patch fixes it and adds more encoding tests.
fixes#20701
Change-Id: Ic1fb46438d71a978dbef06d97494a70c95fcbf3a
Reviewed-on: https://go-review.googlesource.com/45996
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
"MOVW FPSR, g" should be assembled to 0xeef1aa10, but actually
0xee30a110 (RFS). "MOVW g, FPSR" should be 0xeee1aa10, but actually
0xee20a110 (WFS). They should be updated to VFP forms, since the ARM
back end doesn't support non-VFP floating points.
The patch fixes them and adds more assembly encoding tests.
fixes#20643
Change-Id: I3b29490337c6e8d891b400fcedc8b0a87b82b527
Reviewed-on: https://go-review.googlesource.com/45276
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
"MOVW R1, CPSR" is assembled to 0xe129f001, which should be 0xe12cf001.
"MOVW $255, CPSR" is assembled to 0xe329f0ff, which should be 0xe32cf0ff.
This patch fixes them and adds more assembly encoding tests.
fix#20626
Change-Id: Iefc945879ea774edf40438ce39f52c144e1501a1
Reviewed-on: https://go-review.googlesource.com/45170
Reviewed-by: Cherry Zhang <cherryyz@google.com>
There are two issues in constant decomposition.
1. A typo in "func immrot2s" blocks "case 107" of []optab be triggered.
2. Though "ADD $0xffff, R0, R0" is decomposed to "ADD $0xff00, R0, R0" and
"ADD $0x00ff, R0, R0" as expected, "ADD $0xffff, R0" still uses the
constant pool, which should be the same as "ADD $0xffff, R0, R0".
This patch fixes them and adds more instruction encoding tests.
fix#20516
Change-Id: Icd7bdfa1946b29db15580dcb429111266f1384c6
Reviewed-on: https://go-review.googlesource.com/44335
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
It is expected to test assembly code for ARMv5, ARMv6 and ARMv7
in cmd/asm/internal/asm/endtoend_test.go. But actually the loop
in "func TestARMEndToEnd(t *testing.T)" runs three times all
for ARMv5.
This patch fixes that bug and adds a new armv6.s which is only tested
with GOARM=6.
fixes#20465
Change-Id: I5dbf00809a47ace2c195335e2c9bdd768479aada
Reviewed-on: https://go-review.googlesource.com/43930
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
"ADDF F0, R1, F2" is silently accepted by the arm assembler and
assembled to the same binary code of "ADDF F0, F1, F2". So does
"CMPF F0, R1".
"ABSF F0, F1, F2" is also silently accepted and assembled to a
different instruction.
This patch reports those illegal forms and adds test cases.
fix#20464
Change-Id: I88b80dc29de24c6266ac7bf7bce1578c5adbc68c
Reviewed-on: https://go-review.googlesource.com/43931
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Many instructions can not have a .S suffix, such as MULS, SWI, CLZ,
CMP, STREX and others. And so do .P and .W suffixes. Even wrong
assembly code is generated for some instructions with invalid
suffixes.
This patch tries to simplify .S/.W/.P checks. And a wrong assembly
test for arm is added.
fixes#20377
Change-Id: Iba1c99d9e6b7b16a749b4d93ca2102e17c5822fe
Reviewed-on: https://go-review.googlesource.com/43561
Reviewed-by: Cherry Zhang <cherryyz@google.com>
SWI only support "SWI $imm", but currently "SWI (Reg)" is also
accepted. This patch fixes it.
And more instruction tests are added to cmd/asm/internal/asm/testdata/arm.s
fixes#20375
Change-Id: Id437d853924a403e41da9b6cbddd20d994b624ff
Reviewed-on: https://go-review.googlesource.com/43552
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
LLV and SCV are 64-bit load-linked and store-conditional. They
were used in runtime as #define WORD. Change them to normal
instruction form.
NOOP is hardware no-op. It was written as WORD $0. Make a name
for it for better disassembly output.
Fixes#12561.
Fixes#18238.
Change-Id: I82c667ce756fa83ef37b034b641e8c4366335e83
Reviewed-on: https://go-review.googlesource.com/40297
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
ARM64 assembler backend only accepts loads and stores with small
or aligned offset. The compiler therefore can only fold small or
aligned offsets into loads and stores. For locals and args, their
offsets to SP are not known until very late, and the compiler
makes conservative decision not folding some of them. However,
in most cases, the offset is indeed small or aligned, and can
be folded into load and store (but actually not).
This CL adds support of loads and stores with large and unaligned
offsets. When the offset doesn't fit into the instruction, it
uses two instructions and (for very large offset) the constant
pool. This way, the compiler doesn't need to be conservative,
and can simply fold the offset.
To make it work, the assembler's optab matching rules need to be
changed. Before, MOVD accepts C_UAUTO32K which matches multiple
of 8 between 0 and 32K, and also C_UAUTO16K, which may not be
multiple of 8 and does not fit into MOVD instruction. The
assembler errors in the latter case. This change makes it only
matches multiple of 8 (or offsets within ±256, which also fits
in instruction), and uses the large-or-unaligned-offset rule
for things doesn't fit (without error). Other sized move rules
are changed similarly.
Class C_UAUTO64K and C_UOREG64K are removed, as they are never
used.
In shared library, load/store of global is rewritten to using
GOT and temp register, which conflicts with the use of temp
register for assembling large offset. So the folding is disabled
for globals in shared library mode.
Reduce cmd/go binary size by 2%.
name old time/op new time/op delta
BinaryTree17-8 8.67s ± 0% 8.61s ± 0% -0.60% (p=0.000 n=9+10)
Fannkuch11-8 6.24s ± 0% 6.19s ± 0% -0.83% (p=0.000 n=10+9)
FmtFprintfEmpty-8 116ns ± 0% 116ns ± 0% ~ (all equal)
FmtFprintfString-8 196ns ± 0% 192ns ± 0% -1.89% (p=0.000 n=10+10)
FmtFprintfInt-8 199ns ± 0% 198ns ± 0% -0.35% (p=0.001 n=9+10)
FmtFprintfIntInt-8 294ns ± 0% 293ns ± 0% -0.34% (p=0.000 n=8+8)
FmtFprintfPrefixedInt-8 318ns ± 1% 318ns ± 1% ~ (p=1.000 n=10+10)
FmtFprintfFloat-8 537ns ± 0% 531ns ± 0% -1.17% (p=0.000 n=9+10)
FmtManyArgs-8 1.19µs ± 1% 1.18µs ± 1% -1.41% (p=0.001 n=10+10)
GobDecode-8 17.2ms ± 1% 17.3ms ± 2% ~ (p=0.165 n=10+10)
GobEncode-8 14.7ms ± 1% 14.7ms ± 2% ~ (p=0.631 n=10+10)
Gzip-8 837ms ± 0% 836ms ± 0% -0.14% (p=0.006 n=9+10)
Gunzip-8 141ms ± 0% 139ms ± 0% -1.24% (p=0.000 n=9+10)
HTTPClientServer-8 256µs ± 1% 253µs ± 1% -1.35% (p=0.000 n=10+10)
JSONEncode-8 40.1ms ± 1% 41.3ms ± 1% +3.06% (p=0.000 n=10+9)
JSONDecode-8 157ms ± 1% 156ms ± 1% -0.83% (p=0.001 n=9+8)
Mandelbrot200-8 8.94ms ± 0% 8.94ms ± 0% +0.02% (p=0.000 n=9+9)
GoParse-8 8.69ms ± 0% 8.54ms ± 1% -1.69% (p=0.000 n=8+10)
RegexpMatchEasy0_32-8 227ns ± 1% 228ns ± 1% +0.48% (p=0.016 n=10+9)
RegexpMatchEasy0_1K-8 1.92µs ± 0% 1.63µs ± 0% -15.08% (p=0.000 n=10+9)
RegexpMatchEasy1_32-8 256ns ± 0% 251ns ± 0% -2.19% (p=0.000 n=10+9)
RegexpMatchEasy1_1K-8 2.38µs ± 0% 2.09µs ± 0% -12.49% (p=0.000 n=10+9)
RegexpMatchMedium_32-8 352ns ± 0% 354ns ± 0% +0.39% (p=0.002 n=10+9)
RegexpMatchMedium_1K-8 106µs ± 0% 106µs ± 0% -0.05% (p=0.005 n=10+9)
RegexpMatchHard_32-8 5.92µs ± 0% 5.89µs ± 0% -0.40% (p=0.000 n=9+8)
RegexpMatchHard_1K-8 180µs ± 0% 179µs ± 0% -0.14% (p=0.000 n=10+9)
Revcomp-8 1.20s ± 0% 1.13s ± 0% -6.29% (p=0.000 n=9+8)
Template-8 159ms ± 1% 154ms ± 1% -3.14% (p=0.000 n=9+10)
TimeParse-8 800ns ± 3% 769ns ± 1% -3.91% (p=0.000 n=10+10)
TimeFormat-8 826ns ± 2% 817ns ± 2% -1.04% (p=0.050 n=10+10)
[Geo mean] 145µs 143µs -1.79%
Change-Id: I5fc42087cee9b54ea414f8ef6d6d020b80eb5985
Reviewed-on: https://go-review.googlesource.com/42172
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
MOVSD is properly handled but its encoding test wasn't enabled. Enable
it.
For reference this was found with a little tool I wrote [1] to explore
which instructions are missing or not tested in the go obj package and
assembler:
"which SSE2 instructions aren't tested? And don't list instructions
which can take MMX operands"
$ x86db-gogen list --extension SSE2 --not-tested --not-mmx
CLFLUSH mem [m: np 0f ae /7] WILLAMETTE,SSE2
MOVSD xmmreg,xmmreg [rm: f2 0f 10 /r] WILLAMETTE,SSE2
MOVSD xmmreg,xmmreg [mr: f2 0f 11 /r] WILLAMETTE,SSE2
MOVSD mem64,xmmreg [mr: f2 0f 11 /r] WILLAMETTE,SSE2
MOVSD xmmreg,mem64 [rm: f2 0f 10 /r] WILLAMETTE,SSE2
(CLFLUSH was introduced with SSE2, but has its own CPUID bit)
[1] https://github.com/dlespiau/x86db
Change-Id: Ic3af3028cb8d4f02e53fdebb9b30fb311f4ee454
Reviewed-on: https://go-review.googlesource.com/42814
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
As discussion in issue #19141, the addend should be the third
argument of MULA. This patch fixes it in both the front end
and the back end of the assembler. And also tests are added to
the encoding test.
Fixes#19141
Change-Id: Idbc6f338b8fdfcad97a135f27a98c5b375b27d43
Reviewed-on: https://go-review.googlesource.com/42028
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This CL modifies how MOV[DWHB] instructions that store a constant to
memory are assembled to avoid them clobbering the condition code
(flags). It also modifies zeroAuto to use MOVD instructions instead of
CLEAR (which is assembled as XC).
MOV[DWHB]storeconst ops also no longer clobbers flags.
Note: this CL modifies the assembler so that it can no longer handle
immediates outside the range of an int16 or offsets from SB, which
reflects what the machine instructions support. The compiler doesn't
need this capability any more and I don't think this affects any existing
assembly, but it is easy to workaround if it does.
Fixes#20187.
Change-Id: Ie54947ff38367bd6a19962bf1a6d0296a4accffb
Reviewed-on: https://go-review.googlesource.com/42179
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
ANDPS, like all others PS (Packed Single precision floats) instructions,
need Ym: they don't use the 0x66 prefix.
From the manual:
NP 0F 54 /r ANDPS xmm1, xmm2/m128
NP meaning, quoting the manual:
NP - Indicates the use of 66/F2/F3 prefixes (beyond those already part
of the instructions opcode) are not allowed with the instruction.
And indeed, the same instruction prefixed by 0x66 is ANDPD.
Updates #14069
Change-Id: If312a6f1e77113ab8c0febe66bdb1b4171e41e0a
Reviewed-on: https://go-review.googlesource.com/42090
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The generated test cases had their arguments reversed, putting them back
in order makes those tests pass.
CMPPS SRC, DEST, CC
Change-Id: Ie15021edc533d5681a6a78d10d88b665e3de9017
Reviewed-on: https://go-review.googlesource.com/42097
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The current code treats condition as special register and write
its raw data directly into instruction.
The fix converts the raw data into correct condition encoding.
Also fix the operand catogery of FCCMP.
Add tests to cover all cases.
Change-Id: Ib194041bd9017dd0edbc241564fe983082ac616b
Reviewed-on: https://go-review.googlesource.com/41511
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Taken from the Intel Software Development Manual (of course, in the line
below it's ADC DST, SRC; The opposite of the commit subject).
12 /r ADC r8, r/m8
We need 0x12 for the corresponding ytab line, not 0x10.
{Ymb, Ynone, Yrb, Zm_r, 1},
Updates #14069
Change-Id: Id37cbd0c581c9988c2de355efa908956278e2189
Reviewed-on: https://go-review.googlesource.com/41857
Reviewed-by: Keith Randall <khr@golang.org>
The assembler reordered the operands of some instructions to put the
first operand into From3. Unfortunately this meant that when the
instructions were printed the operands were in a different order than
the assembler would expect as input. For example, 'MVC $8, (R1), (R2)'
would be printed as 'MVC (R1), $8, (R2)'.
Originally this was done to ensure that From contained the source
memory operand. The current compiler no longer requires this and so
this CL simply makes all instructions use the standard order for
operands: From, Reg, From3 and finally To.
Fixes#18295
Change-Id: Ib2b5ec29c647ca7a995eb03dc78f82d99618b092
Reviewed-on: https://go-review.googlesource.com/40299
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
There were only two versions, 0 and 1,
and the only user of version 1 was the assembler,
to indicate that a symbol was static.
Rename LSym.Version to Static,
and add it to LSym.Attributes.
Simplify call-sites.
Passes toolstash-check.
Change-Id: Iabd39918f5019cce78f381d13f0481ae09f3871f
Reviewed-on: https://go-review.googlesource.com/41201
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This updates PPC64.rules to include rules to generate rotates
for ADD, OR, XOR operators that combine two opposite shifts
that sum to 32 or 64.
To support this change opcodes for ROTL and ROTLW were added to
be used like the rotldi and rotlwi extended mnemonics.
This provides the following improvement in sha3:
BenchmarkPermutationFunction-8 302.83 376.40 1.24x
BenchmarkSha3_512_MTU-8 98.64 121.92 1.24x
BenchmarkSha3_384_MTU-8 136.80 168.30 1.23x
BenchmarkSha3_256_MTU-8 169.21 211.29 1.25x
BenchmarkSha3_224_MTU-8 179.76 221.19 1.23x
BenchmarkShake128_MTU-8 212.87 263.23 1.24x
BenchmarkShake256_MTU-8 196.62 245.60 1.25x
BenchmarkShake256_16x-8 163.57 194.37 1.19x
BenchmarkShake256_1MiB-8 199.02 248.74 1.25x
BenchmarkSha3_512_1MiB-8 106.55 133.13 1.25x
Fixes#20030
Change-Id: I484c56f48395d32f53ff3ecb3ac6cb8191cfee44
Reviewed-on: https://go-review.googlesource.com/40992
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Now only cmd/asm and cmd/compile depend on cmd/internal/obj. Changing
the assembler backends no longer requires reinstalling cmd/link or
cmd/addr2line.
There's also now one canonical definition of the object file format in
cmd/internal/objabi/doc.go, with a warning to update all three
implementations.
objabi is still something of a grab bag of unrelated code (e.g., flag
and environment variable handling probably belong in a separate "tool"
package), but this is still progress.
Fixes#15165.
Fixes#20026.
Change-Id: Ic4b92fac7d0d35438e0d20c9579aad4085c5534c
Reviewed-on: https://go-review.googlesource.com/40972
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Automated refactoring using github.com/mdempsky/unbed (to rewrite
s.Foo to s.FuncInfo.Foo) and then gorename (to rename the FuncInfo
field to just Func).
Passes toolstash-check -all.
Change-Id: I802c07a1239e0efea058a91a87c5efe12170083a
Reviewed-on: https://go-review.googlesource.com/40670
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>