This change applies x86avxgen output.
https://go-review.googlesource.com/c/arch/+/66972
As an effect, many new AVX instructions are now available.
One of the side-effects of this patch is
sorted AXXX (A-enum) constants.
Some AVX1/2 instructions still not added due to:
1. x86.csv V0.2 does not list them;
2. partially because of (1), test suite does not contain tests for
these instructions.
Change-Id: I90430d773974ca5c995d6950d90e2c62ec88ef47
Reviewed-on: https://go-review.googlesource.com/75490
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
BFC (Bit Field Clear) and BFI (Bit Field Insert) were
introduced in ARMv6T2, and the compiler can use them
to do further optimization.
Change-Id: I5a3fbcd2c2400c9bf4b939da6366c854c744c27f
Reviewed-on: https://go-review.googlesource.com/72891
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
To improve readability when exported fields are removed,
forbid the printer from emitting an empty line before the first comment
in a const, var, or type block.
Also, when printing the "Has filtered or unexported fields." message,
add an empty line before it to separate the message from the struct
or interfact contents.
Before the change:
<<<
type NamedArg struct {
// Name is the name of the parameter placeholder.
//
// If empty, the ordinal position in the argument list will be
// used.
//
// Name must omit any symbol prefix.
Name string
// Value is the value of the parameter.
// It may be assigned the same value types as the query
// arguments.
Value interface{}
// contains filtered or unexported fields
}
>>>
After the change:
<<<
type NamedArg struct {
// Name is the name of the parameter placeholder.
//
// If empty, the ordinal position in the argument list will be
// used.
//
// Name must omit any symbol prefix.
Name string
// Value is the value of the parameter.
// It may be assigned the same value types as the query
// arguments.
Value interface{}
// contains filtered or unexported fields
}
>>>
Fixes#18264
Change-Id: I9fe17ca39cf92fcdfea55064bd2eaa784ce48c88
Reviewed-on: https://go-review.googlesource.com/71990
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Current AllowedOpCodes is 1024, which is not enough for modern x86.
Changed limit to 2048 (though AVX512 will exceed this).
Additional Z-cases and ytab tables are added to make it possible
to handle missing AVX1 and AVX2 instructions.
This CL is required by x86avxgen to work properly:
https://go-review.googlesource.com/c/arch/+/66972
Change-Id: I290214bbda554d2cba53349f50dcd34014fe4cee
Reviewed-on: https://go-review.googlesource.com/70650
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Add support for ADX cpuid bit detection and all instructions,
implied by that bit (ADOX/ADCX). They are useful for rsa and math/big in
general.
Change-Id: Idaa93303ead48fd18b9b3da09b3e79de2f7e2193
Reviewed-on: https://go-review.googlesource.com/74850
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
The counter part, writeInt in cmd/internal/obj, writes int64s.
So the reader side should also read int64s. This may cause a
larger range of values being accepted, some of which should
not be that large. This is probably ok: for example, for
size/index/length, the very large value (due to corruption)
may be well past the end and causes other errors. And we did
not do much bound check anyway.
One exmaple where this matters is ARM32's object file. For one
type of relocation it encodes the instruction into Reloc.Add
field (which itself may be problematic and worth fix) and the
instruction encoding overflows int32, causing ARM32 object
file being rejected by goobj (and so objdump and nm) before.
Unskip ARM32 object file tests in goobj, nm, and objdump.
Updates #19811.
Change-Id: Ia46c2b68df5f1c5204d6509ceab6416ad6372315
Reviewed-on: https://go-review.googlesource.com/69010
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Dsymutil, an utility used on macOS when externally linking executables,
does not support base address selector entries in debug_ranges.
To work around this deficiency this commit removes base address
selectors from debug_ranges and emits instead a list composed only of
compile unit relative addresses.
A new type of relocation is introduced, R_ADDRCUOFF, similar to
R_ADDROFF, that relocates an address to its offset from the low_pc of
the symbol's compile unit.
Fixes#21945
Change-Id: Ie991f9bc1afda2b49ac5d734eb41c37d3a37e554
Reviewed-on: https://go-review.googlesource.com/72371
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
This CL changes the go command to base all its rebuilding decisions
on the content of the files being processed and not their file system
modification times. It also eliminates the special handling of release
toolchains, which were previously considered always up-to-date
because modification time order could not be trusted when unpacking
a pre-built release.
The go command previously tracked "build IDs" as a backup to
modification times, to catch changes not reflected in modification times.
For example, if you remove one .go file in a package with multiple .go
files, there is no modification time remaining in the system that indicates
that the installed package is out of date. The old build ID was the hash
of a list of file names and a few other factors, expected to change if
those factors changed.
This CL moves to using this kind of build ID as the only way to
detect staleness, making sure that the build ID hash includes all
possible factors that need to influence the rebuild decision.
One such factor is the compiler flags. As of this CL, if you run
go build -gcflags -N cmd/gofmt
you will get a gofmt where every package is built with -N,
regardless of what may or may not be installed already.
Another such factor is the linker flags. As of this CL, if you run
go install myprog
go install -ldflags=-s myprog
the second go install will now correctly build a new myprog with
the updated linker flags. (Previously the installed myprog appeared
up-to-date, because the ldflags were not included in the build ID.)
Because we have more precise information we can also validate whether
the target of a "go test -c" operation is already the right binary and
therefore can avoid a rebuild.
This CL sets us up for having a more general build artifact cache,
maybe even a step toward not having a pkg directory with .a files,
but this CL does not take that step. For now the result of go install
is the same as it ever was; we just do a better job of what needs to
be installed.
This CL does slow down builds a small amount by reading all the
dependent source files in full. (The go command already read the
beginning of every dependent source file to discover build tags
and imports.) On my MacBook Pro, before this CL all.bash takes
3m58s, while after this CL and a few optimizations stacked above it
all.bash takes 4m28s. Given that CL 73850 cut 1m43s off the all.bash
time earlier today, we can afford adding 30s back for now.
More optimizations are planned that should make the go command
more efficient than it was even before this CL.
Fixes#15799.
Fixes#18369.
Fixes#19340.
Fixes#21477.
Change-Id: I10d7ca0e31ca3f58aabb9b1f11e2e3d9d18f0bc9
Reviewed-on: https://go-review.googlesource.com/73212
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
If the go install doesn't use the same flags as the main build
it can overwrite the installed standard library, leading to
flakiness and slow future tests.
Force uses of 'go install' etc to propagate $GO_GCFLAGS
or disable them entirely, to avoid problems.
As I understand it, the main place this happens is the ssacheck builder.
If there are other uses that need to run some of the now-disabled
tests we can reenable fixed tests in followup CLs.
Change-Id: Ib860a253539f402f8a96a3c00ec34f0bbf137c9a
Reviewed-on: https://go-review.googlesource.com/74470
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Adds the following s390x test under mask (immediate) instructions:
TMHH
TMHL
TMLH
TMLL
These are useful for testing bits and are already used in the math package.
Change-Id: Idffb3f83b238dba76ac1e42ac6b0bf7f1d11bea2
Reviewed-on: https://go-review.googlesource.com/41092
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This change adds three new instructions:
- LPDFR: load positive (math.Abs(x))
- LNDFR: load negative (-math.Abs(x))
- CPSDR: copy sign (math.Copysign(x, y))
By making use of GPR <-> FPR moves we can now compile math.Abs and
math.Copysign to these instructions using SSA rules.
This CL also adds new rules to merge address generation into combined
load operations. This makes GPR <-> FPR move matching more reliable.
name old time/op new time/op delta
Copysign 1.85ns ± 0% 1.40ns ± 1% -24.65% (p=0.000 n=8+10)
Abs 1.58ns ± 1% 0.73ns ± 1% -53.64% (p=0.000 n=10+10)
The geo mean improvement for all math package benchmarks was 4.6%.
Change-Id: I0cec35c5c1b3fb45243bf666b56b57faca981bc9
Reviewed-on: https://go-review.googlesource.com/73950
Run-TryBot: Michael Munday <mike.munday@ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
This adds support for math Abs, Copysign to be instrinsics on ppc64x.
New instruction FCPSGN is added to generate fcpsgn. Some new
rules are added to improve the int<->float conversions that are
generated mainly due to the Float64bits and Float64frombits in
the math package. PPC64.rules is also modified as suggested
in the review for CL 63290.
Improvements:
benchmark old ns/op new ns/op delta
BenchmarkAbs-16 1.12 0.69 -38.39%
BenchmarkCopysign-16 1.30 0.93 -28.46%
BenchmarkNextafter32-16 9.34 8.05 -13.81%
BenchmarkFrexp-16 8.81 7.60 -13.73%
Others that used Copysign also saw smaller improvements.
I attempted to make this work using rules since that
seems to be preferred, but due to the use of Float64bits and
Float64frombits in these functions, several rules had to be added and
even then not all cases were matched. Using rules became too
complicated and seemed too fragile for these.
Updates #21390
Change-Id: Ia265da9a18355e08000818a4fba1a40e9e031995
Reviewed-on: https://go-review.googlesource.com/67130
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
New relocation flavor R_DWARFFILEREF, to be applied to DWARF attribute
values that correspond to file references (ex: DW_AT_decl_file,
DW_AT_call_file). The LSym for this relocation is the file itself; the
linker replaces the relocation target with the index of the specified
file in the line table's file section.
Note: for testing purposes this patch changes the DWARF function
subprogram DIE abbrev to include DW_AT_decl_file (allowed by DWARF
but not especially useful) so as to have a way to test this
functionality. This attribute will be removed once there are other
file reference attributes (coming as part of inlining support).
Change-Id: Icf676beb60fcc33f06d78e747ef717532daaa3ba
Reviewed-on: https://go-review.googlesource.com/73330
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
1. Move AXXX constants (A-enumeration) from "a.out.go" to "aenum.go"
2. Move VEX-encoded optabs from "asm6.go" to "vex_optabs.go"
Also run "go generate" over aenum.go. This explains diff in "anames.go".
Initialization of opindex is split into 2 loops:
one for `vexOptab`, second for `optab`.
Rationale:
when VEX instructions are generated with current structure,
asm6.go is modified, which can lead to merge conflicts and
larger diffs than desired. Same for a.out.go.
This change makes x86avxgen usage possible:
https://go-review.googlesource.com/c/arch/+/66972
Change-Id: Id9eefcf5ccf0a89440e5d01bcb80926a8163b41d
Reviewed-on: https://go-review.googlesource.com/70630
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
The addressing mode of global variable was missing, whereas the
compiler may make use of it, causing "illegal combination" error.
This CL adds support of that addressing mode.
Fixes#22390.
Change-Id: Ic8eade31aba73e6fb895f758ee7f277f8f1832ef
Reviewed-on: https://go-review.googlesource.com/72610
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
This adds a whitelisted subset of compiler flags to the DW_AT_producer
DWARF attribute of each package compilation unit DIE. This is common
practice in DWARF and can help debuggers determine the quality of the
produced debugging information.
Fixes#22168.
Change-Id: I1b994ef2262aa9b88b68eb6e883695d1103acc58
Reviewed-on: https://go-review.googlesource.com/71430
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Set DW_AT_variable_parameter on DW_TAG_formal_parameters that are
actually return values. variable_parameter is supposed to indicate inout
parameters, but Go doesn't really have those, and DWARF doesn't have
explicit support for multiple return values. This seems to be the best
compromise, especially since the implementation of the two is very
similar -- both are stack slots.
Fixes#21100
Change-Id: Icebabc92b7b397e0aa00a7237478cce84ad1a670
Reviewed-on: https://go-review.googlesource.com/71670
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Current suffix check is based on instruction, which is not very
accurate. For example, "MOVW.S R1, R2" is valid, but
"MOVW.S $0xaaaaaaaa, R1" and "MOVW.P CPSR, R9" are not.
This patch fixes the above kinds of issues by checking suffix
based on []optab. And also more test cases are added.
fixes#20509
Change-Id: Ibad91be72c78eefa719412a83b4d44370d2202a8
Reviewed-on: https://go-review.googlesource.com/70910
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
On ARM, STREX does not permit the same register used as both the
source and the destination. Reject the bad instruction.
The assembler also accepted special cases
STREX R0, (R1) as STREX R0, (R1), R0
STREX (R1), R0 as STREX R0, (R1), R0
both are illegal. Remove this special case as well.
For STREXD, check that the destination is not source, and not
source+1. Also check that the source register is even numbered,
as required by the architecture's manual.
Fixes#22268.
Change-Id: I6bfde86ae692d8f1d35bd0bd7aac0f8a11ce8e22
Reviewed-on: https://go-review.googlesource.com/71190
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Some ARM64-specific instructions (such as SIMD instructions) are not supported.
This patch adds support for the following:
1. Extended register, e.g.:
ADD Rm.<ext>[<<amount], Rn, Rd
<ext> can have the following values:
UXTB, UXTH, UXTW, UXTX, SXTB, SXTH, SXTW and SXTX
2. Arrangement for SIMD instructions, e.g.:
VADDP Vm.<T>, Vn.<T>, Vd.<T>
<T> can have the following values:
B8, B16, H4, H8, S2, S4 and D2
3. Width specifier and element index for SIMD instructions, e.g.:
VMOV Vn.<T>[index], Rd // MOV(to general register)
<T> can have the following values:
S and D
4. Register List, e.g.:
VLD1 (Rn), [Vt1.<T>, Vt2.<T>, Vt3.<T>]
5. Register offset variant, e.g.:
VLD1.P (Rn)(Rm), [Vt1.<T>, Vt2.<T>] // Rm is the post-index register
6. Go assembly for ARM64 reference manual
new added instructions are required to have according explanation items in
the manual and items for existed instructions will be added incrementally
For more information about the refinement background, please refer to the
discussion (https://groups.google.com/forum/#!topic/golang-dev/rWgDxCrL4GU)
This patch only adds syntax and doesn't break any assembly that already exists.
Change-Id: I34e90b7faae032820593a0e417022c354a882008
Reviewed-on: https://go-review.googlesource.com/41654
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
When we split separate packages into separate compilation units, we
lost PC range information because it was no longer contiguous. This
brings it back by constructing proper per-package PC range tables.
Change-Id: Id0ab5187e08ac5d13b3d3794977bfc857a56224f
Reviewed-on: https://go-review.googlesource.com/69974
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Currently, the linker generates one huge DWARF compilation unit for
the entire Go binary. This commit creates a separate compilation unit
and line table per Go package.
We temporarily lose compilation unit PC range information, since it's
now discontiguous, so harder to emit. We'll bring it back in the next
commit.
Beyond being "more traditional", this has various technical
advantages:
* It should speed up line table lookup, since that requires a
sequential scan of the line table. With this change, a debugger can
first locate the per-package line table and then scan only that line
table.
* Once we emit compilation unit PC ranges again, this should also
speed up various other debugger reverse PC lookups.
* It puts us in a good position to move more DWARF generation into the
compiler, which could produce at least the CU header, per-function
line table fragments, and per-function frame unwinding info that the
linker could simply paste together.
* It will let us record a per-package compiler command-line flags
(#22168).
Change-Id: Ibac642890984636b3ef1d4b37fe97f4453c2cc84
Reviewed-on: https://go-review.googlesource.com/69973
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In order to improve the line numbering for debuggers,
it's necessary to trace lines through compilation.
This makes it (much) easier to follow.
The format of the last column of the ssa.html output was
also changed to reduce the spamminess of the file name,
which is usually the same and makes it far harder to read
instructions and line numbers, and to make it wider and also
able to break words when wrapping (long path names still
can push off the end otherwise; side-to-side scrolling was
tried but was more annoying than the occasional wrapped
line).
Sample output now, where [...] is elision for sake of making
the CL character-counter happy -- and the (##) line numbers
are rendered in italics and a smaller font (11 point) under
control of a CSS class "line-number".
genssa
# /Users/drchase/[...]/ssa/testdata/hist.go
00000 (35) TEXT "".main(SB)
00001 (35) FUNCDATA $0, gclocals·7be4bb[...]1e8b(SB)
00002 (35) FUNCDATA $1, gclocals·9ab98a[...]4568(SB)
v920 00003 (36) LEAQ ""..autotmp_31-640(SP), DI
v858 00004 (36) XORPS X0, X0
v6 00005 (36) LEAQ -48(DI), DI
v6 00006 (36) DUFFZERO $277
v576 00007 (36) LEAQ ""..autotmp_31-640(SP), AX
v10 00008 (36) TESTB AX, (AX)
b1 00009 (36) JMP 10
and from an earlier phase:
b18: ← b17
v242 (47) = Copy <mem> v238
v243 (47) = VarKill <mem> {.autotmp_16} v242
v244 (48) = Addr <**bufio.Scanner> {scanner} v2
v245 (48) = Load <*bufio.Scanner> v244 v243
[...]
v279 (49) = Store <mem> {int64} v277 v276 v278
v280 (49) = Addr <*error> {.autotmp_18} v2
v281 (49) = Load <error> v280 v279
v282 (49) = Addr <*error> {err} v2
v283 (49) = VarDef <mem> {err} v279
v284 (49) = Store <mem> {error} v282 v281 v283
v285 (47) = VarKill <mem> {.autotmp_18} v284
v286 (47) = VarKill <mem> {.autotmp_17} v285
v287 (50) = Addr <*error> {err} v2
v288 (50) = Load <error> v287 v286
v289 (50) = NeqInter <bool> v288 v51
If v289 → b21 b22 (line 50)
Change-Id: I3f46310918f965761f59e6f03ea53067237c28a8
Reviewed-on: https://go-review.googlesource.com/69591
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
On Windows, not closing f keeps us from being able to remove it.
Change-Id: Id4cb709b6ce0b30485b87364a9f0e6e71d2782bd
Reviewed-on: https://go-review.googlesource.com/70070
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
If the stack frame is too large, abort immediately.
We used to generate code first, then abort.
In issue 22200, generating code raised a panic
so we got an ICE instead of an error message.
Change the max frame size to 1GB (from 2GB).
Stack frames between 1.1GB and 2GB didn't used to work anyway,
the pcln table generation would have failed and generated an ICE.
Fixes#22200
Change-Id: I1d918ab27ba6ebf5c87ec65d1bccf973f8c8541e
Reviewed-on: https://go-review.googlesource.com/69810
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This CL does a few things.
1. It moves the existing "read a build ID" code out of the go command
and into cmd/internal/buildid.
2. It adds new code there to "write a build ID".
3. It adds better tests.
4. It encapsulates cmd/internal/buildid into a new standalone program
"go tool buildid".
The go command is going to use the new "write a build ID" functionality
in a future CL. Adding the separate "go tool buildid" gives "go build -x"
a printable command to explain what it is doing in that new step.
(This is similar to the go command printing "go tool pack" commands
equivalent to the actions it is taking, even though it's not invoking pack
directly.) Keeping go build -x honest means that other build systems can
potentially keep up with the go command.
Change-Id: I01c0a66e30a80fa7254e3f2879283d3cd7aa03b4
Reviewed-on: https://go-review.googlesource.com/69053
Reviewed-by: David Crawshaw <crawshaw@golang.org>
This change fixes the implementation of Data Cache instructions for
ppc64x, allowing non-zero hint field values.
Change-Id: I454aac9293d069a4817ee574d5809fa1799b3216
Reviewed-on: https://go-review.googlesource.com/68670
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Thanks to Christopher Nelson for spearheading the effort.
Fixes#11058
Change-Id: Icafabac8dc697626ff1bd943cc577b0b1cc6b349
Reviewed-on: https://go-review.googlesource.com/69091
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Also add -V=full to print a unique identifier of the specific tool being invoked.
This will be used for content-based staleness.
Also sort and clean up a few of the flag doc comments.
Change-Id: I786fe50be0b8e5f77af809d8d2dab721185c2abd
Reviewed-on: https://go-review.googlesource.com/68590
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
"VEX.vvvv" field (VSR, VEX-specified-register) made explicit
in Optab encoding.
vexNDS, vexNDD, vexDDS and vexNOVSR do nothing,
this change does not produce any noticeable effect.
Rationale behind this change:
- keep more information inside optab entries
- make encodings match SDM more closely
- one less special rule to keep in mind
Pvex optabs are updated based on the Intel SDM descriptions.
Unused VEX combinations are removed;
it is problematic to choose VSR combinations for them
without actual Optabs that use them.
The origin of this idea can be found in:
https://go-review.googlesource.com/#/c/arch/+/66972/
Change-Id: I54634a72b44d61f4b924a1e45f2240aab7384dc2
Reviewed-on: https://go-review.googlesource.com/67890
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Adds last missing SSE4 instruction.
Also introduces additional ytab set 'yextractps'.
See https://golang.org/cl/57470 that adds other SSE4 instructions
but skips this one due to 'yextractps'.
To make EXTRACTPS less "sloppy", Yu2 oclass added to forbid
usage of invalid offset values in immediate operand.
Part of the mission to add missing amd64 SSE4 instructions to Go asm.
Change-Id: I0e67e3497054f53257dd8eb4c6268da5118b4853
Reviewed-on: https://go-review.googlesource.com/57490
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
Reviewed-by: Russ Cox <rsc@golang.org>
C_PPAUTO was matching offsets that is a multiple 8. But this
condition is dropped in CL 55610, causing unaligned offset
between 256 and 504 mistakenly matched to some classes, e.g.
C_UAUTO8K. This CL restores this condition, also fixes an
error that C_PPAUTO shouldn't match C_PSAUTO, because the
latter is not guaranteed to be multiple of 8. C_PPAUTO_8 is
unnecessary, removed.
Fixes#21992.
Change-Id: I75d5a0e5f5dc3dae335721fbec1bbcd4a3b862f2
Reviewed-on: https://go-review.googlesource.com/65730
Reviewed-by: David Chase <drchase@google.com>
The following instructions were introduced in ARMv6, and the compiler
can do more optimization with them.
1. "MOVBS Rm@>i, Rd": rotates Rm 0/8/16/24 bits, does signed
byte extension to word, and writes the result to Rd.
2. "MOVHS Rm@>i, Rd": rotates Rm 0/8/16/24 bits, does signed
halfword extension to word, and writes the result to Rd.
3. "MOVBU Rm@>i, Rd": rotates Rm 0/8/16/24 bits, does unsigned
byte extension to word, and writes the result to Rd.
4. "MOVHU Rm@>i, Rd": rotates Rm 0/8/16/24 bits, does unsigned
half-word extension to word, and writes the result to Rd.
5. "XTAB Rm@>i, Rn, Rd": rotates Rm 0/8/16/24 bits, does signed
byte extension to word, adds Rn, and writes the result to Rd.
6. "XTAH Rm@>i, Rn, Rd": rotates Rm 0/8/16/24 bits, does signed
half-word extension to word, adds Rn, and writes the result to Rd.
7. "XTABU Rm@>i, Rn, Rd": rotates Rm 0/8/16/24 bits, does unsigned
byte extension to word, adds Rn, and writes the result to Rd.
8. "XTAHU Rm@>i, Rn, Rd": rotates Rm 0/8/16/24 bits, does unsigned
half-word extension to word, adds Rn, and writes the result to Rd.
Change-Id: I4306d7ebac93015d7e2e40d307f2c4271c03f466
Reviewed-on: https://go-review.googlesource.com/65790
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The information that's used to generate DWARF location lists is very
ssa.Value centric; it uses Values as start and end coordinates to define
ranges. That mostly works fine, but control flow instructions don't come
from Values, so the ranges couldn't cover them.
Control flow instructions are generated when the SSA representation is
converted to assembly, so that's the best place to extend the ranges
to cover them. (Before that, there's nothing to refer to, and afterward
the boundaries between blocks have been lost.) That requires block
information in the debugInfo type, which then flows down to make
everything else awkward. On the plus side, there's a little less copying
slices around than there used to be, so it should be a little faster.
Previously, the ranges for empty blocks were not very meaningful. That
was fine, because they had no Values to cover, so no debug information
was generated for them. But they do have control flow instructions
(that's why they exist) and so now it's important that the information
be correct. Introduce two sentinel values, BlockStart and BlockEnd, that
denote the boundary of a block, even if the block is empty. BlockEnd
replaces the previous SurvivedBlock flag.
There's one more problem: the last instruction in the function will be a
control flow instruction, so any live ranges need to be extended past
it. But there's no instruction after it to use as the end of the range.
Instead, leave the EndProg field of those ranges as nil and fix it up to
point to past the end of the assembled text at the very last moment.
Change-Id: I81f884020ff36fd6fe8d7888fc57c99412c4245b
Reviewed-on: https://go-review.googlesource.com/63010
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
After https://golang.org/cl/64793, we started to include Mach-O object
files which don't have symbol table into cgo archive.
However, toolchains didn't handle those files yet.
Fixes#21959
Change-Id: Ibb2f6492f1fa59368f2dfd4cff19783997539875
Reviewed-on: https://go-review.googlesource.com/65170
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This CL also make cmd/nm accept PE object file.
Fixes#21706
Change-Id: I4a528b7d53da1082e61523ebeba02c4c514a43a7
Reviewed-on: https://go-review.googlesource.com/64890
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
https://golang.org/cl/5822049 introduced the idea of linking together
all the cgo objects with -r, while also linking against -lgcc. This
was to fix http://golang.org/issue/3261: cgo code that requires libgcc
would break when using internal linking.
This approach introduced https://golang.org/issue/9510: multiple
different cgo packages could include the same libgcc object, leading
to a multiple definition error during the final link. That problem was
fixed by https://golang.org/cl/16741, as modified by
https://golang.org/cl/16993, which did the link against libgcc only
during the final link.
After https://golang.org/cl/16741, and, on Windows, the later
https://golang.org/cl/26670, ld -r no longer does anything useful.
So, remove it.
Doing this revealed that running ld -r on Darwin simplifies some
relocs by making them specific to a symbol rather than a section.
Correct the handling of unsigned relocations in internal linking mode
by offsetting by the symbol value. This only really comes up when
using the internal linker with C code that initializes a variable to
the address of a local constant, such as a C string (as in const char
*s = "str";). This change does not affect the normal case of external
linking, where the Add field is ignored. The test case is
misc/cgo/test/issue6612.go in internal linking mode.
The cmd/internal/goobj test can now see an external object with no
symbol table; fix it to not crash in that case.
Change-Id: I15e5b7b5a8f48136bc14bf4e1c4c473d5eb58062
Reviewed-on: https://go-review.googlesource.com/64793
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
When rewriting loads and stores accessing global variables to use the
GOT we were making use of REGTMP (R10). Unfortunately loads and stores
with large offsets (larger than 20-bits) were also using REGTMP,
causing it to be clobbered and subsequently a segmentation fault.
This can be fixed by using REGTMP2 (R11) for the rewrite. This is fine
because REGTMP2 only has a couple of uses in the assembler (division,
high multiplication and storage-to-storage instructions). We didn't
use REGTMP2 originally because it used to be used more frequently,
in particular for stores of constants to memory. However we have now
eliminated those uses.
This was found while writing a test case for CL 63030. That test case
is included in this CL.
Change-Id: I13956f1f3ca258a7c8a7ff0a7570d2848adf7f68
Reviewed-on: https://go-review.googlesource.com/65011
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Rematerializable ops can be inserted after the flagalloc phase,
they must therefore not clobber flags. This CL adds a check to
ensure this doesn't happen and fixes the instances where it
does currently.
amd64: ADDQconst and ADDLconst were recently changed to be
rematerializable in CL 54393 (only in tip, not 1.9). That change
has been reverted.
s390x: MOVDaddr could clobber flags when using dynamic linking due
to a ADD with immediate instruction. Change the code generation to
use LA/LAY instead.
Fixes#21080.
Change-Id: Ia85c882afa2a820a309e93775354b3169ec6d034
Reviewed-on: https://go-review.googlesource.com/63030
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
These are the last instructions missing to complete SSE3 support.
For reference what was missing was found by a tool [1]:
$ x86db-gogen list --extension SSE3 --not-known
ADDSUBPD xmmreg,xmmrm [rm: 66 0f d0 /r] PRESCOTT,SSE3,SO
ADDSUBPS xmmreg,xmmrm [rm: f2 0f d0 /r] PRESCOTT,SSE3,SO
[1] https://github.com/dlespiau/x86dbFixes#20293
Change-Id: Ib5a91bf64dcc5282cdb60eae740ae52b4db16ebd
Reviewed-on: https://go-review.googlesource.com/42990
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
This change makes it easier to express instructions
with arbitrary number of operands.
Rationale: previous approach with operand "hiding" does
not scale well, AVX and especially AVX512 have many
instructions with 3+ operands.
x86 asm backend is updated to handle up to 6 explicit operands.
It also fixes issue with 4-th immediate operand type checks.
All `ytab` tables are updated accordingly.
Changes to non-x86 backends only include these patterns:
`p.From3 = X` => `p.SetFrom3(X)`
`p.From3.X = Y` => `p.GetFrom3().X = Y`
Over time, other backends can adapt Prog.RestArgs
and reduce the amount of workarounds.
-- Performance --
x/benchmark/build:
$ benchstat upstream.bench patched.bench
name old time/op new time/op delta
Build-48 21.7s ± 2% 21.8s ± 2% ~ (p=0.218 n=10+10)
name old binary-size new binary-size delta
Build-48 10.3M ± 0% 10.3M ± 0% ~ (all equal)
name old build-time/op new build-time/op delta
Build-48 21.7s ± 2% 21.8s ± 2% ~ (p=0.218 n=10+10)
name old build-peak-RSS-bytes new build-peak-RSS-bytes delta
Build-48 145MB ± 5% 148MB ± 5% ~ (p=0.218 n=10+10)
name old build-user+sys-time/op new build-user+sys-time/op delta
Build-48 21.0s ± 2% 21.2s ± 2% ~ (p=0.075 n=10+10)
Microbenchmark shows a slight slowdown.
name old time/op new time/op delta
AMD64asm-4 49.5ms ± 1% 49.9ms ± 1% +0.67% (p=0.001 n=23+15)
func BenchmarkAMD64asm(b *testing.B) {
for i := 0; i < b.N; i++ {
TestAMD64EndToEnd(nil)
TestAMD64Encoder(nil)
}
}
Change-Id: I4f1d37b5c2c966da3f2127705ccac9bff0038183
Reviewed-on: https://go-review.googlesource.com/63490
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
The compiler replaces any path of the form /path/to/goroot/src/net/port.go
with GOROOT/src/net/port.go so that the same object file is
produced if the GOROOT is moved. It was skipping this transformation
for any absolute path into the GOROOT that came from //line directives,
such as those generated by cmd/cgo.
Fixes#21373Fixes#21720Fixes#21825
Change-Id: I2784c701b4391cfb92e23efbcb091a84957d61dd
Reviewed-on: https://go-review.googlesource.com/63693
Run-TryBot: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>