This adds four new deductions to the prove pass, all related to adding
or subtracting one from a value. This is the first hint of actual
arithmetic relations in the prove pass.
The most effective of these is
x-1 >= w && x > min ⇒ x > w
This helps eliminate bounds checks in code like
if x > 0 {
// do something with s[x-1]
}
Altogether, these deductions prove an additional 260 branches in std
and cmd. Furthermore, they will let us eliminate some tricky
compiler-inserted panics in the runtime that are interfering with
static analysis.
Fixes#23354.
Change-Id: I7088223e0e0cd6ff062a75c127eb4bb60e6dce02
Reviewed-on: https://go-review.googlesource.com/87480
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
This adds a few simple deductions to the prove pass' fact table to
derive unsigned concrete limits from signed concrete limits where
possible.
This tweak lets the pass prove 70 additional branch conditions in std
and cmd.
This is based on a comment from the recently-deleted factsTable.get:
"// TODO: also use signed data if lim.min >= 0".
Change-Id: Ib4340249e7733070f004a0aa31254adf5df8a392
Reviewed-on: https://go-review.googlesource.com/87479
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
Reviewed-by: Keith Randall <khr@golang.org>
Currently the prove pass uses implication queries. For each block, it
collects the set of branch conditions leading to that block, and
queries this fact table for whether any of these facts imply the
block's own branch condition (or its inverse). This works remarkably
well considering it doesn't do any deduction on these facts, but it
has various downsides:
1. It requires an implementation both of adding facts to the table and
determining implications. These are very nearly duals of each
other, but require separate implementations. Likewise, the process
of asserting facts of dominating branch conditions is very nearly
the dual of the process of querying implied branch conditions.
2. It leads to less effective use of derived facts. For example, the
prove pass currently derives facts about the relations between len
and cap, but can't make use of these unless a branch condition is
in the exact form of a derived fact. If one of these derived facts
contradicts another fact, it won't notice or make use of this.
This CL changes the approach of the prove pass to instead use
*contradiction* instead of implication. Rather than ever querying a
branch condition, it simply adds branch conditions to the fact table.
If this leads to a contradiction (specifically, it makes the fact set
unsatisfiable), that branch is impossible and can be cut. As a result,
1. We can eliminate the code for determining implications
(factsTable.get disappears entirely). Also, there is now a single
implementation of visiting and asserting branch conditions, since
we don't have to flip them around to treat them as facts in one
place and queries in another.
2. Derived facts can be used effectively. It doesn't matter *why* the
fact table is unsatisfiable; a contradiction in any of the facts is
enough.
3. As an added benefit, it's now quite easy to avoid traversing beyond
provably-unreachable blocks. In contrast, the current
implementation always visits all blocks.
The prove pass already has nearly all of the mechanism necessary to
compute unsatisfiability, which means this both simplifies the code
and makes it more powerful.
The only complication is that the current implication procedure has a
hack for dealing with the 0 <= Args[0] condition of OpIsInBounds and
OpIsSliceInBounds. We replace this with asserting the appropriate fact
when we process one of these conditions. This seems much cleaner
anyway, and works because we can now take advantage of derived facts.
This has no measurable effect on compiler performance.
Effectiveness:
There is exactly one condition in all of std and cmd that this fails
to prove that the old implementation could: (int64(^uint(0)>>1) < x)
in encoding/gob. This can never be true because x is an int, and it's
basically coincidence that the old code gets this. (For example, it
fails to prove the similar (x < ^int64(^uint(0)>>1)) condition that
immediately precedes it, and even though the conditions are logically
unrelated, it wouldn't get the second one if it hadn't first processed
the first!)
It does, however, prove a few dozen additional branches. These come
from facts that are added to the fact table about the relations
between len and cap. These were almost never queried directly before,
but could lead to contradictions, which the unsat-based approach is
able to use.
There are exactly two branches in std and cmd that this implementation
proves in the *other* direction. This sounds scary, but is okay
because both occur in already-unreachable blocks, so it doesn't matter
what we chose. Because the fact table logic is sound but incomplete,
it fails to prove that the block isn't reachable, even though it is
able to prove that both outgoing branches are impossible. We could
turn these blocks into BlockExit blocks, but it doesn't seem worth the
trouble of the extra proof effort for something that happens twice in
all of std and cmd.
Tests:
This CL updates test/prove.go to change the expected messages because
it can no longer give a "reason" why it proved or disproved a
condition. It also adds a new test of a branch it couldn't prove
before.
It mostly guts test/sliceopt.go, removing everything related to slice
bounds optimizations and moving a few relevant tests to test/prove.go.
Much of this test is actually unreachable. The new prove pass figures
this out and doesn't try to prove anything about the unreachable
parts. The output on the unreachable parts is already suspect because
anything can be proved at that point, so it's really just a regression
test for an algorithm the compiler no longer uses.
This is a step toward fixing #23354. That issue is quite easy to fix
once we can use derived facts effectively.
Change-Id: Ia48a1b9ee081310579fe474e4a61857424ff8ce8
Reviewed-on: https://go-review.googlesource.com/87478
Reviewed-by: Keith Randall <khr@golang.org>
Currently we expand comparison with small constant strings into len check
and a sequence of byte comparisons. Generate 16/32/64-bit comparisons,
instead of bytewise on 386 and amd64. Also increase limits on what is
considered small constant string.
Shaves ~30kb (0.5%) from go executable.
This also updates test/prove.go to keep test case valid.
Change-Id: I99ae8871a1d00c96363c6d03d0b890782fa7e1d9
Reviewed-on: https://go-review.googlesource.com/38776
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
When we discover a relation x <= len(s), also discover the relation
x <= cap(s). That way, in situations like:
a := s[x:] // tests 0 <= x <= len(s)
b := s[:x] // tests 0 <= x <= cap(s)
the second check can be eliminated.
Fixes#16813
Change-Id: Ifc037920b6955e43bac1a1eaf6bac63a89cfbd44
Reviewed-on: https://go-review.googlesource.com/33633
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
Reviewed-by: David Chase <drchase@google.com>
CSE opportunities were being missed for commutative ops. We used to
order the args of commutative ops (by arg ID) once at the start of CSE.
But that may not be enough.
i1 = (Load ptr mem)
i2 = (Load ptr mem)
x1 = (Add i1 j)
x2 = (Add i2 j)
Equivalent commutative ops x1 and x2 may not get their args ordered in
the same way because because at the start of CSE, we don't know that
the i values will be CSEd. If x1 and x2 get opposite orders we won't
CSE them.
Instead, (re)order the args of commutative operations by their
equivalence class IDs each time we partition an equivalence class.
Change-Id: Ic609fa83b85299782a5e85bf93dc6023fccf4b0c
Reviewed-on: https://go-review.googlesource.com/33632
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
Adapt old test for prove's bounds check elimination.
Added missing rule to generic rules that lead to differences
between 32 and 64 bit platforms on sliceopt test.
Added debugging to prove.go that was helpful-to-necessary to
discover that missing rule.
Lowered debugging level on prove.go from 3 to 1; no idea
why it was previously 3.
Change-Id: I09de206aeb2fced9f2796efe2bfd4a59927eda0c
Reviewed-on: https://go-review.googlesource.com/23290
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Unroll s == "ab" to
len(s) == 2 && s[0] == 'a' && s[1] == 'b'
This generates faster and shorter code
by avoiding a runtime call.
Do something similar for !=.
The cutoff length is 6. This was chosen empirically
by examining binary sizes on arm, arm64, 386, and amd64
using the SSA backend.
For all architectures examined, 4, 5, and 6 were
the ideal cutoff, with identical binary sizes.
The distribution of constant string equality sizes
during 'go build -a std' is:
40.81% 622 len 0
14.11% 215 len 4
9.45% 144 len 1
7.81% 119 len 3
7.48% 114 len 5
5.12% 78 len 7
4.13% 63 len 2
3.54% 54 len 8
2.69% 41 len 6
1.18% 18 len 10
0.85% 13 len 9
0.66% 10 len 14
0.59% 9 len 17
0.46% 7 len 11
0.26% 4 len 12
0.20% 3 len 19
0.13% 2 len 13
0.13% 2 len 15
0.13% 2 len 16
0.07% 1 len 20
0.07% 1 len 23
0.07% 1 len 33
0.07% 1 len 36
A cutoff of length 6 covers most of the cases.
Benchmarks on amd64 comparing a string to a constant of length 3:
Cmp/1same-8 4.78ns ± 6% 0.94ns ± 9% -80.26% (p=0.000 n=20+20)
Cmp/1diffbytes-8 6.43ns ± 6% 0.96ns ±11% -85.13% (p=0.000 n=20+20)
Cmp/3same-8 4.71ns ± 5% 1.28ns ± 5% -72.90% (p=0.000 n=20+20)
Cmp/3difffirstbyte-8 6.33ns ± 7% 1.27ns ± 7% -79.90% (p=0.000 n=20+20)
Cmp/3difflastbyte-8 6.34ns ± 8% 1.26ns ± 9% -80.13% (p=0.000 n=20+20)
The change to the prove test preserves the
existing intent of the test. When the string was
short, there was a new "proved in bounds" report
that referred to individual byte comparisons.
Change-Id: I593ac303b0d11f275672090c5c786ea0c6b8da13
Reviewed-on: https://go-review.googlesource.com/26758
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
func f(a, b bool) bool {
return a || b
}
is now a single instructions (excluding loading and unloading the arguments):
v10 = ORB <bool> v11 v12 : AX
Change-Id: Iff63399410cb46909f4318ea1c3f45a029f4aa5e
Reviewed-on: https://go-review.googlesource.com/21872
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Handle this case:
if 0 <= i && i < len(a) {
use a[i]
}
Shaves about 5k from pkg/tools/linux_amd64/*.
Change-Id: I6675ff49aa306b0d241b074c5738e448204cd981
Reviewed-on: https://go-review.googlesource.com/21431
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
There are 5293 loop in the main go repository.
A survey of the top most common for loops:
18 for __k__ := 0; i < len(sa.Addr); i++ {
19 for __k__ := 0; ; i++ {
19 for __k__ := 0; i < 16; i++ {
25 for __k__ := 0; i < length; i++ {
30 for __k__ := 0; i < 8; i++ {
49 for __k__ := 0; i < len(s); i++ {
67 for __k__ := 0; i < n; i++ {
376 for __k__ := range __slice__ {
685 for __k__, __v__ := range __slice__ {
2074 for __, __v__ := range __slice__ {
The algorithm to find induction variables handles all cases
with an upper limit. It currently doesn't find related induction
variables such as c * ind or c + ind.
842 out of 22954 bound checks are removed for src/make.bash.
1957 out of 42952 bounds checks are removed for src/all.bash.
Things to do in follow-up CLs:
* Find the associated pointer for `for _, v := range a {}`
* Drop the NilChecks on the pointer.
* Replace the implicit induction variable by a loop over the pointer
Generated garbage can be reduced if we share the sdom between passes.
% benchstat old.txt new.txt
name old time/op new time/op delta
Template 337ms ± 3% 333ms ± 3% ~ (p=0.258 n=9+9)
GoTypes 1.11s ± 2% 1.10s ± 2% ~ (p=0.912 n=10+10)
Compiler 5.25s ± 1% 5.29s ± 2% ~ (p=0.077 n=9+9)
MakeBash 33.5s ± 1% 34.1s ± 2% +1.85% (p=0.011 n=9+9)
name old alloc/op new alloc/op delta
Template 63.6MB ± 0% 63.9MB ± 0% +0.52% (p=0.000 n=10+9)
GoTypes 218MB ± 0% 219MB ± 0% +0.59% (p=0.000 n=10+9)
Compiler 978MB ± 0% 985MB ± 0% +0.69% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
Template 582k ± 0% 583k ± 0% +0.10% (p=0.000 n=10+10)
GoTypes 1.78M ± 0% 1.78M ± 0% +0.12% (p=0.000 n=10+10)
Compiler 7.68M ± 0% 7.69M ± 0% +0.05% (p=0.000 n=10+10)
name old text-bytes new text-bytes delta
HelloSize 581k ± 0% 581k ± 0% -0.08% (p=0.000 n=10+10)
CmdGoSize 6.40M ± 0% 6.39M ± 0% -0.08% (p=0.000 n=10+10)
name old data-bytes new data-bytes delta
HelloSize 3.66k ± 0% 3.66k ± 0% ~ (all samples are equal)
CmdGoSize 134k ± 0% 134k ± 0% ~ (all samples are equal)
name old bss-bytes new bss-bytes delta
HelloSize 126k ± 0% 126k ± 0% ~ (all samples are equal)
CmdGoSize 149k ± 0% 149k ± 0% ~ (all samples are equal)
name old exe-bytes new exe-bytes delta
HelloSize 947k ± 0% 946k ± 0% -0.01% (p=0.000 n=10+10)
CmdGoSize 9.92M ± 0% 9.91M ± 0% -0.06% (p=0.000 n=10+10)
Change-Id: Ie74bdff46fd602db41bb457333d3a762a0c3dc4d
Reviewed-on: https://go-review.googlesource.com/20517
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
Find comparisons to constants and propagate that information
down the dominator tree. Use it to resolve other constant
comparisons on the same variable.
So if we know x >= 7, then a x > 4 condition must return true.
This change allows us to use "_ = b[7]" hints to eliminate bounds checks.
Fixes#14900
Change-Id: Idbf230bd5b7da43de3ecb48706e21cf01bf812f7
Reviewed-on: https://go-review.googlesource.com/21008
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
* Refacts a bit saving and restoring parents restrictions
* Shaves ~100k from pkg/tools/linux_amd64,
but most of the savings come from the rewrite rules.
* Improves on the following artificial test case:
func f1(a4 bool, a6 bool) bool {
return a6 || (a6 || (a6 || a4)) || (a6 || (a4 || a6 || (false || a6)))
}
Change-Id: I714000f75a37a3a6617c6e6834c75bd23674215f
Reviewed-on: https://go-review.googlesource.com/20306
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
* It does very simple bounds checking elimination. E.g.
removes the second check in for i := range a { a[i]++; a[i++]; }
* Improves on the following redundant expression:
return a6 || (a6 || (a6 || a4)) || (a6 || (a4 || a6 || (false || a6)))
* Linear in the number of block edges.
I patched in CL 12960 that does bounds, nil and constant propagation
to make sure this CL is not just redundant. Size of pkg/tool/linux_amd64/*
(excluding compile which is affected by this change):
With IsInBounds and IsSliceInBounds
-this -12960 92285080
+this -12960 91947416
-this +12960 91978976
+this +12960 91923088
Gain is ~110% of 12960.
Without IsInBounds and IsSliceInBounds (older run)
-this -12960 95515512
+this -12960 95492536
-this +12960 95216920
+this +12960 95204440
Shaves 22k on its own.
* Can we handle IsInBounds better with this? In
for i := range a { a[i]++; } the bounds checking at a[i]
is not eliminated.
Change-Id: I98957427399145fb33693173fd4d5a8d71c7cc20
Reviewed-on: https://go-review.googlesource.com/19710
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>