var p *[2]uint32 = ...
p[0] = 0
p[1] = 0
When we combine these two 32-bit stores into a single 64-bit store,
use the line number of the first store, not the second one.
This differs from the default behavior because usually with the combining
that the compiler does, we use the line number of the last instruction
in the combo (e.g. load+add, we use the line number of the add).
This is the same behavior that gcc does in C (picking the line
number of the first of a set of combined stores).
Change-Id: Ie70bf6151755322d33ecd50e4d9caf62f7881784
Reviewed-on: https://go-review.googlesource.com/c/go/+/521678
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
It is one less dependent load away, and right next to another
field in the itab we also load as part of the type switch or
type assert.
Change-Id: If7aaa7814c47bd79a6c7ed4232ece0bc1d63550e
Reviewed-on: https://go-review.googlesource.com/c/go/+/533117
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This is the last of the getitab users to receive a cache.
We should now no longer see getitab (and callees) in profiles.
Hopefully.
Change-Id: I2ed72b9943095bbe8067c805da7f08e00706c98c
Reviewed-on: https://go-review.googlesource.com/c/go/+/531055
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Convert expand calls into a smaller number of focused
recursive rewrites, and rely on an enhanced version of
"decompose" to clean up afterwards.
Debugging information seems to emerge intact.
Change-Id: Ic46da4207e3a4da5c8e2c47b637b0e35abbe56bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/507295
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
That way we don't need to call into the runtime for every
type assertion (to an interface type).
name old time/op new time/op delta
TypeAssert-24 3.78ns ± 3% 1.00ns ± 1% -73.53% (p=0.000 n=10+8)
Change-Id: I0ba308aaf0f24a5495b4e13c814d35af0c58bfde
Reviewed-on: https://go-review.googlesource.com/c/go/+/529316
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
That way we don't need to call into the runtime when the type being
switched on has been seen many times before.
The cache is just a hash table of a sample of all the concrete types
that have been switched on at that source location. We record the
matching case number and the resulting itab for each concrete input
type.
The caches seldom get large. The only two in a run of all.bash that
get more than 100 entries, even with the sampling rate set to 1, are
test/fixedbugs/issue29264.go, with 101
test/fixedbugs/issue29312.go, with 254
Both happen at the type switch in fmt.(*pp).handleMethods, perhaps
unsurprisingly.
name old time/op new time/op delta
SwitchInterfaceTypePredictable-24 25.8ns ± 2% 2.5ns ± 3% -90.43% (p=0.000 n=10+10)
SwitchInterfaceTypeUnpredictable-24 37.5ns ± 2% 11.2ns ± 1% -70.02% (p=0.000 n=10+10)
Change-Id: I4961ac9547b7f15b03be6f55cdcb972d176955eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/526658
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@google.com>
For type switches where the targets are interface types,
call into the runtime once instead of doing a sequence
of assert* calls.
name old time/op new time/op delta
SwitchInterfaceTypePredictable-24 26.6ns ± 1% 25.8ns ± 2% -2.86% (p=0.000 n=10+10)
SwitchInterfaceTypeUnpredictable-24 39.3ns ± 1% 37.5ns ± 2% -4.57% (p=0.000 n=10+10)
Not super helpful by itself, but this code organization allows
followon CLs that add caching to the lookups.
Change-Id: I7967f85a99171faa6c2550690e311bea8b54b01c
Reviewed-on: https://go-review.googlesource.com/c/go/+/526657
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Add a new form of RLDC which maps directly to the ISA definition
of rldc: RLDC Rs, $sh, $mb, Ra. This is used to generate mask
constants described below.
Using MOVD $-1, Rx; RLDC Rx, $sh, $mb, Rx, any mask constant
can be generated. A mask is a contiguous series of 1 bits, which
may wrap.
Change-Id: Ifcaae1114080ad58b5fdaa3e5fc9019e2051f282
Reviewed-on: https://go-review.googlesource.com/c/go/+/531120
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Check for shifted 16b constants, and transform them to avoid the load
penalty. This should be much faster than loading, and reduce binary
size by reducing the constant pool size.
Change-Id: I6834e08be7ca88e3b77449d226d08d199db84299
Reviewed-on: https://go-review.googlesource.com/c/go/+/531119
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
This sequence can show up in the lowering pass on PPC64. If it
makes it to the latelower pass, it will cause an error because
it looks like it can be turned into RLDICL, but -1 isn't an
accepted mask.
Also, print more debug info if panic is called from
encodePPC64RotateMask.
Fixes#62698
Change-Id: I0f3322e2205357abe7fc28f96e05e3f7ad65567c
Reviewed-on: https://go-review.googlesource.com/c/go/+/529195
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
sparse conditional constant propagation can discover optimization
opportunities that cannot be found by just combining constant folding
and constant propagation and dead code elimination separately.
This is a re-submit of PR#59575, which fix a broken dominance relationship caught by ssacheck
Updates https://github.com/golang/go/issues/59399
Change-Id: I57482dee38f8e80a610aed4f64295e60c38b7a47
GitHub-Last-Rev: 830016f24e
GitHub-Pull-Request: golang/go#60469
Reviewed-on: https://go-review.googlesource.com/c/go/+/498795
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Generate RLDIC[LR] instead of MOVD mask, Rx; AND Rx, Ry, Rz.
This helps reduce code size, and reduces the latency caused
by the constant load.
Similarly, for smaller-than-register values, truncate constants
which exceed the range of the value's type to avoid needing to
load a constant.
Change-Id: I6019684795eb8962d4fd6d9585d08b17c15e7d64
Reviewed-on: https://go-review.googlesource.com/c/go/+/515576
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
For large interface -> concrete type switches, we can use a jump
table on some bits of the type hash instead of a binary search on
the type hash.
name old time/op new time/op delta
SwitchTypePredictable-24 1.99ns ± 2% 1.78ns ± 5% -10.87% (p=0.000 n=10+10)
SwitchTypeUnpredictable-24 11.0ns ± 1% 9.1ns ± 2% -17.55% (p=0.000 n=7+9)
Change-Id: Ida4768e5d62c3ce1c2701288b72664aaa9e64259
Reviewed-on: https://go-review.googlesource.com/c/go/+/521497
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
This lets us combine more write barriers, getting rid of some of the
test+branch and gcWriteBarrier* calls.
With the new write barriers, it's easy to add a few non-pointer writes
to the set of values written.
We allow up to 2 non-pointer writes between pointer writes. This is enough
for, for example, adjacent slice fields.
Fixes#62126
Change-Id: I872d0fa9cc4eb855e270ffc0223b39fde1723c4b
Reviewed-on: https://go-review.googlesource.com/c/go/+/521498
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
This CL adds FMADDS,FMSUBS,FNMADDS,FNMSUBS SSA support for riscv
Change-Id: I1e7dd322b46b9e0f4923dbba256303d69ed12066
Reviewed-on: https://go-review.googlesource.com/c/go/+/506616
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: M Zhuo <mzh@golangcn.org>
If we're not using the upper bits, don't bother issuing a
sign/zero extension operation.
For arm64, after CL 520916 which fixed a correctness bug with
extensions but as a side effect leaves many unnecessary ones
still in place.
Change-Id: I5f4fe4efbf2e9f80969ab5b9a6122fb812dc2ec0
Reviewed-on: https://go-review.googlesource.com/c/go/+/521496
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Stop using BTSconst and friends when ORLconst can be used instead.
OR can be issued by more function units than BTS can, so it could
lead to better IPC. OR might take a few more bytes to encode, but
not a lot more.
Still use BTSconst for cases where the constant otherwise wouldn't
fit and would require a separate movabs instruction to materialize
the constant. This happens when setting bits 31-63 of 64-bit targets.
Add BTS-to-memory operations so we don't need to load/bts/store.
Fixes#61694
Change-Id: I00379608df8fb0167cb01466e97d11dec7c1596c
Reviewed-on: https://go-review.googlesource.com/c/go/+/515755
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Fixes#61629
This reduce the pressure on regalloc because then the loop only keep alive
one value (the iterator) instead of the iterator and the upper bound since
the comparison now acts against an immediate, often zero which can be skipped.
This optimize things like:
for i := 0; i < n; i++ {
Or a range over a slice where the index is not used:
for _, v := range someSlice {
Or the new range over int from #61405:
for range n {
It is hit in 975 unique places while doing ./make.bash.
Change-Id: I5facff8b267a0b60ea3c1b9a58c4d74cdb38f03f
Reviewed-on: https://go-review.googlesource.com/c/go/+/512935
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
s==s is always true for strings. This comes up in NaN testing in
generic code, where we want x==x to compile completely away except for
float types.
Fixes#60777
Change-Id: I3ce054b5121354de2f9751b010fb409f148cb637
Reviewed-on: https://go-review.googlesource.com/c/go/+/503795
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
If we load 2 values and then store those 2 loaded values, we can likely
perform that operation with a single wider load and store.
Fixes#60709
Change-Id: Ifc5f92c2f1b174c6ed82a69070f16cec6853c770
Reviewed-on: https://go-review.googlesource.com/c/go/+/502295
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
(ANDCCconst [y] (MOV.*reg x)) should only be merged when zero
extending. Otherwise, sign bits are lost on negative values.
(ANDCCconst [0xFF] (MOVBreg x)) should be simplified to a zero
extension of x. Likewise for the MOVHreg variant.
Fixes#61297
Change-Id: I04e4fd7dc6a826e870681f37506620d48393698b
Reviewed-on: https://go-review.googlesource.com/c/go/+/508775
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
According to RISCV manual 11.6:
FMADD x,y,z computes x*y+z and
FNMADD x,y,z => -x*y-z
FMSUB x,y,z => x*y-z
FNMSUB x,y,z => -x*y+z respectively
However our implement of SSA convert FMADD -x,y,z to FNMADD x,y,z which
is wrong and should be convert to FNMSUB according to manual.
Change-Id: Ib297bc83824e121fd7dda171ed56ea9694a4e575
Reviewed-on: https://go-review.googlesource.com/c/go/+/506575
Run-TryBot: M Zhuo <mzh@golangcn.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Michael Munday <mike.munday@lowrisc.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Don't use the line number of the argument itself, as that may be from
arbitrarily earlier in the function.
Fixes#60673
Change-Id: Ifc0a2aaae221a256be3a4b0b2e04849bae4b79d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/502656
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Some testdir tests fail if GOEXPERIMENT=cgocheck2 is set. Fix this by
skipping these tests.
Change-Id: I58d4ef0cceb86bcf93220b4a44de9b9dc4879b16
Reviewed-on: https://go-review.googlesource.com/c/go/+/499675
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
sparse conditional constant propagation can discover optimization opportunities that cannot be found by just combining constant folding and constant propagation and dead code elimination separately.
Updates #59399
Change-Id: Ia954e906480654a6f0cc065d75b5912f96f36b2e
GitHub-Last-Rev: 90fc02db99
GitHub-Pull-Request: golang/go#59575
Reviewed-on: https://go-review.googlesource.com/c/go/+/483875
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
In Go 1.17, cmd/compile gained the ability to inline calls to
functions that contain function literals (aka "closures"). This was
implemented by duplicating the function literal body and emitting a
second LSym, because in general it might be optimized better than the
original function literal.
However, the second LSym was named simply as any other function
literal appearing literally in the enclosing function would be named.
E.g., if f has a closure "f.funcX", and f is inlined into g, we would
create "g.funcY" (N.B., X and Y need not be the same.). Users then
have no idea this function originally came from f.
With this CL, the inlined call stack is incorporated into the clone
LSym's name: instead of "g.funcY", it's named "g.f.funcY".
In the future, it seems desirable to arrange for the clone's name to
appear exactly as the original name, so stack traces remain the same
as when -l or -d=inlfuncswithclosures are used. But it's unclear
whether the linker supports that today, or whether any downstream
tooling would be confused by this.
Updates #60324.
Change-Id: Ifad0ccef7e959e72005beeecdfffd872f63982f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/497137
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Be more liberal about expanding the OR tree. Handle any tree shape
instead of a fully left or right associative tree.
Also remove tail feature, it isn't ever needed.
Change-Id: If16bebef94b952a604d6069e9be3d9129994cb6f
Reviewed-on: https://go-review.googlesource.com/c/go/+/494056
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Ryan Berger <ryanbberger@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
This modifies some existing rules to allow more prefixed instructions
to be generated when using GOPPC64=power10. Some rules also check
if PCRel is available, which is currently supported for linux/ppc64le
and linux/ppc64 (internal linking only).
Prior to p10, DS-offset loads and stores had a 16 bit size limit for
the offset field. If the offset of the data for load or store was
beyond this range then an indexed load or store would be selected by
the rules.
In p10 the assembler can generate prefixed instructions in this case,
but does not if an indexed instruction was selected during the lowering
pass.
This allows many more cases to use prefixed loads or stores, reducing
function sizes and improving performance in some cases where the code
change happens in key loops.
For example in strconv BenchmarkAppendQuoteRune before:
12c5e4: 15 00 10 06 pla r10,1425660
12c5e8: fc c0 40 39
12c5ec: 00 00 6a e8 ld r3,0(r10)
12c5f0: 10 00 aa e8 ld r5,16(r10)
After this change:
12a828: 15 00 10 04 pld r3,1433272
12a82c: b8 de 60 e4
12a830: 15 00 10 04 pld r5,1433280
12a834: c0 de a0 e4
Performs better in the second case.
A testcase was added to verify that the rules correctly select a load or
store based on the offset and whether power10 or earlier.
Change-Id: I4335fed0bd9b8aba8a4f84d69b89f819cc464846
Reviewed-on: https://go-review.googlesource.com/c/go/+/477398
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Archana Ravindar <aravind5@in.ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
The effect and motivation is for the test to be selected when doing
'go test cmd' and not when doing 'go test std' since it's primarily
about testing the Go compiler and linker. Other than that, it's run
by all.bash and 'go test std cmd' as before.
For #56844.
Fixes#60059.
Change-Id: I2d499af013f9d9b8761fdf4573f8d27d80c1fccf
Reviewed-on: https://go-review.googlesource.com/c/go/+/493876
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Adds rules that rewrites statements such as ~P&~Q as ~(P|Q) and ~P|~Q as ~(P&Q), removing an extraneous instruction.
Change-Id: Icedb97df741680ddf9799df79df78657173aa500
GitHub-Last-Rev: f22e2350c9
GitHub-Pull-Request: golang/go#60018
Reviewed-on: https://go-review.googlesource.com/c/go/+/493175
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Stefan M <st3f4nm4d4@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Thanks to the recent addition of the memcombine pass, the
ppc64 ports now have the memcombine optimizations. Previously
in PPC64.rules, the memcombine rules were only added for
ppc64le targets due to the significant increase in size of
the rewritePPC64.go file when those rules were added. The
ppc64 and ppc64le rules had to be different because of the
byte order due to endianness differences.
This enables the memcombine tests to be run on ppc64 as well
as ppc64le.
Change-Id: I4081e2d94617a1b66541d536c0c2662e266c9c1e
Reviewed-on: https://go-review.googlesource.com/c/go/+/492615
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Lynn Boger <laboger@linux.vnet.ibm.com>
Add more rules to ensure that order doesn't matter.
Add memequal 0 rule.
Try to use a constant argument to memequal when one is available.
Fixes#59684
Change-Id: I36e85ffbd949396ed700ed6e8ec2bc3ae013f5d2
Reviewed-on: https://go-review.googlesource.com/c/go/+/485535
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Currently, the prove pass can get knowledge from some specific logic
operators only before the CFG is explored, which means that the bounds
information of the branch will be ignored.
This CL updates the facts table by the logic operators in every
branch. Combined with the branch information, this will be helpful for
BCE in some circumstances.
Fixes#57243
Change-Id: I0bd164f1b47804ccfc37879abe9788740b016fd5
Reviewed-on: https://go-review.googlesource.com/c/go/+/419555
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Under the right conditions we can optimize cmp comparisons to cmn
comparisons, such as:
func foo(a, b int) int {
var c int
if a + b < 0 {
c = 1
}
return c
}
Previously it's compiled as:
ADD R1, R0, R1
CMP $0, R1
CSET LT, R0
With this CL it's compiled as:
CMN R1, R0
CSET MI, R0
Here we need to pay attention to the overflow situation of a+b, the MI
flag means N==1, which doesn't honor the overflow flag V, its value
depends only on the sign of the result. So it has the same semantic of
the Go code, so it's correct.
Similarly, this CL also optimizes the case of >= comparison
using the PL conditional flag.
Change-Id: I47179faba5b30cca84ea69bafa2ad5241bf6dfba
Reviewed-on: https://go-review.googlesource.com/c/go/+/476116
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>