1
0
mirror of https://github.com/golang/go synced 2024-11-12 00:20:22 -07:00
Commit Graph

9 Commits

Author SHA1 Message Date
Giovanni Bajo
e0d37a33ab cmd/compile: teach prove to handle expressions like len(s)-delta
When a loop has bound len(s)-delta, findIndVar detected it and
returned len(s) as (conservative) upper bound. This little lie
allowed loopbce to drop bound checks.

It is obviously more generic to teach prove about relations like
x+d<w for non-constant "w"; we already handled the case for
constant "w", so we just want to learn that if d<0, then x+d<w
proves that x<w.

To be able to remove the code from findIndVar, we also need
to teach prove that len() and cap() are always non-negative.

This CL allows to prove 633 more checks in cmd+std. Most
of them are cases where the code was already testing before
accessing a slice but the compiler didn't know it. For instance,
take strings.HasSuffix:

    func HasSuffix(s, suffix string) bool {
        return len(s) >= len(suffix) && s[len(s)-len(suffix):] == suffix
    }

When suffix is a literal string, the compiler now understands
that the explicit check is enough to not emit a slice check.

I also found a loopbce test that was incorrectly
written to detect an overflow but had a off-by-one (on the
conservative side), so it unexpectly passed with this CL; I
changed it to really trigger the overflow as intended.

Change-Id: Ib5abade337db46b8811425afebad4719b6e46c4a
Reviewed-on: https://go-review.googlesource.com/105635
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-04-29 09:38:32 +00:00
Giovanni Bajo
6d379add0f cmd/compile: in prove, detect loops with negative increments
To be effective, this also requires being able to relax constraints
on min/max bound inclusiveness; they are now exposed through a flags,
and prove has been updated to handle it correctly.

Change-Id: I3490e54461b7b9de8bc4ae40d3b5e2fa2d9f0556
Reviewed-on: https://go-review.googlesource.com/104041
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-04-29 09:38:18 +00:00
Giovanni Bajo
980fdb8dd5 cmd/compile: improve testing of induction variables
Test both minimum and maximum bound, and prepare
formatting for more advanced tests (inclusive / esclusive bounds).

Change-Id: Ibe432916d9c938343bc07943798bc9709ad71845
Reviewed-on: https://go-review.googlesource.com/104040
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-29 09:38:09 +00:00
Giovanni Bajo
7ec25d0acf cmd/compile: implement loop BCE in prove
Reuse findIndVar to discover induction variables, and then
register the facts we know about them into the facts table
when entering the loop block.

Moreover, handle "x+delta > w" while updating the facts table,
to be able to prove accesses to slices with constant offsets
such as slice[i-10].

Change-Id: I2a63d050ed58258136d54712ac7015b25c893d71
Reviewed-on: https://go-review.googlesource.com/104038
Run-TryBot: Giovanni Bajo <rasky@develer.com>
Reviewed-by: David Chase <drchase@google.com>
2018-04-29 09:37:35 +00:00
David Chase
0b6fbaae6e cmd/compile: make loop guard+rotate conditional on GOEXPERIMENT
Loops of the form "for i,e := range" needed to have their
condition rotated to the "bottom" for the preemptible loops
GOEXPERIMENT, but this caused a performance regression
because it degraded bounds check removal.  For now, make
the loop rotation/guarding conditional on the experiment.

Fixes #20711.
Updates #10958.

Change-Id: Icfba14cb3b13a910c349df8f84838cf4d9d20cf6
Reviewed-on: https://go-review.googlesource.com/46410
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-06-21 22:07:33 +00:00
David Chase
d71f36b5aa cmd/compile: check loop rescheduling with stack bound, not counter
After benchmarking with a compiler modified to have better
spill location, it became clear that this method of checking
was actually faster on (at least) two different architectures
(ppc64 and amd64) and it also provides more timely interruption
of loops.

This change adds a modified FOR loop node "FORUNTIL" that
checks after executing the loop body instead of before (i.e.,
always at least once).  This ensures that a pointer past the
end of a slice or array is not made visible to the garbage
collector.

Without the rescheduling checks inserted, the restructured
loop from this  change apparently provides a 1% geomean
improvement on PPC64 running the go1 benchmarks; the
improvement on AMD64 is only 0.12%.

Inserting the rescheduling check exposed some peculiar bug
with the ssa test code for s390x; this was updated based on
initial code actually generated for GOARCH=s390x to use
appropriate OpArg, OpAddr, and OpVarDef.

NaCl is disabled in testing.

Change-Id: Ieafaa9a61d2a583ad00968110ef3e7a441abca50
Reviewed-on: https://go-review.googlesource.com/36206
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-08 18:52:12 +00:00
philhofer
379567aad1 cmd/compile/ssa: more aggressive constant folding
Add rewrite rules that canonicalize the location
of constants in expressions, and fold conststants
that appear in operations that can be trivially
reassociated.

After this change, the compiler constant-folds
expressions like "4 + x - 1" and "4 & x & 1"

Benchmarks affected on darwin/amd64:

name                     old time/op    new time/op    delta
FmtFprintfInt-8            82.1ns ± 1%    81.7ns ± 1%  -0.46%  (p=0.023 n=8+9)
FmtFprintfIntInt-8          122ns ± 2%     120ns ± 2%  -1.48%  (p=0.047 n=10+10)
FmtManyArgs-8               493ns ± 0%     486ns ± 1%  -1.37%  (p=0.000 n=8+10)
Gzip-8                      230ms ± 0%     229ms ± 1%  -0.46%  (p=0.001 n=10+9)
HTTPClientServer-8         74.5µs ± 1%    73.7µs ± 1%  -1.11%  (p=0.000 n=10+10)
JSONDecode-8               51.7ms ± 0%    51.9ms ± 1%  +0.42%  (p=0.017 n=10+9)
RegexpMatchEasy0_32-8      82.6ns ± 1%    81.7ns ± 0%  -1.02%  (p=0.000 n=9+8)
RegexpMatchMedium_32-8      121ns ± 1%     120ns ± 1%  -1.48%  (p=0.001 n=10+10)
Revcomp-8                   426ms ± 1%     400ms ± 1%  -6.16%  (p=0.000 n=10+10)
TimeFormat-8                330ns ± 1%     327ns ± 0%  -0.82%  (p=0.000 n=10+10)

name                     old speed      new speed      delta
Gzip-8                   84.4MB/s ± 0%  84.8MB/s ± 1%  +0.47%  (p=0.001 n=10+9)
JSONDecode-8             37.6MB/s ± 0%  37.4MB/s ± 1%  -0.42%  (p=0.016 n=10+9)
RegexpMatchEasy0_32-8     387MB/s ± 1%   392MB/s ± 0%  +1.06%  (p=0.000 n=9+8)
RegexpMatchMedium_32-8   8.21MB/s ± 1%  8.34MB/s ± 1%  +1.58%  (p=0.000 n=10+9)
Revcomp-8                 597MB/s ± 1%   636MB/s ± 1%  +6.57%  (p=0.000 n=10+10)

Change-Id: Ie37ff91605b76a984a8400dfd1e34f50bf61c864
Reviewed-on: https://go-review.googlesource.com/37290
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-28 20:25:33 +00:00
Alexandru Moșoi
6c6089b3fd cmd/compile: bce when max and limit are consts
Removes 49 more bound checks in make.bash. For example:

var a[100]int
for i := 0; i < 50; i++ {
  use a[i+25]
}

Change-Id: I85e0130ee5d07f0ece9b17044bba1a2047414ce7
Reviewed-on: https://go-review.googlesource.com/21379
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-11 16:01:22 +00:00
Alexandru Moșoi
b91cc53033 cmd/compile/internal/ssa: BCE for induction variables
There are 5293 loop in the main go repository.
A survey of the top most common for loops:

     18 for __k__ := 0; i < len(sa.Addr); i++ {
     19 for __k__ := 0; ; i++ {
     19 for __k__ := 0; i < 16; i++ {
     25 for __k__ := 0; i < length; i++ {
     30 for __k__ := 0; i < 8; i++ {
     49 for __k__ := 0; i < len(s); i++ {
     67 for __k__ := 0; i < n; i++ {
    376 for __k__ := range __slice__ {
    685 for __k__, __v__ := range __slice__ {
   2074 for __, __v__ := range __slice__ {

The algorithm to find induction variables handles all cases
with an upper limit. It currently doesn't find related induction
variables such as c * ind or c + ind.

842 out of 22954 bound checks are removed for src/make.bash.
1957 out of 42952 bounds checks are removed for src/all.bash.

Things to do in follow-up CLs:
* Find the associated pointer for `for _, v := range a {}`
* Drop the NilChecks on the pointer.
* Replace the implicit induction variable by a loop over the pointer

Generated garbage can be reduced if we share the sdom between passes.

% benchstat old.txt new.txt
name       old time/op     new time/op     delta
Template       337ms ± 3%      333ms ± 3%    ~             (p=0.258 n=9+9)
GoTypes        1.11s ± 2%      1.10s ± 2%    ~           (p=0.912 n=10+10)
Compiler       5.25s ± 1%      5.29s ± 2%    ~             (p=0.077 n=9+9)
MakeBash       33.5s ± 1%      34.1s ± 2%  +1.85%          (p=0.011 n=9+9)

name       old alloc/op    new alloc/op    delta
Template      63.6MB ± 0%     63.9MB ± 0%  +0.52%         (p=0.000 n=10+9)
GoTypes        218MB ± 0%      219MB ± 0%  +0.59%         (p=0.000 n=10+9)
Compiler       978MB ± 0%      985MB ± 0%  +0.69%        (p=0.000 n=10+10)

name       old allocs/op   new allocs/op   delta
Template        582k ± 0%       583k ± 0%  +0.10%        (p=0.000 n=10+10)
GoTypes        1.78M ± 0%      1.78M ± 0%  +0.12%        (p=0.000 n=10+10)
Compiler       7.68M ± 0%      7.69M ± 0%  +0.05%        (p=0.000 n=10+10)

name       old text-bytes  new text-bytes  delta
HelloSize       581k ± 0%       581k ± 0%  -0.08%        (p=0.000 n=10+10)
CmdGoSize      6.40M ± 0%      6.39M ± 0%  -0.08%        (p=0.000 n=10+10)

name       old data-bytes  new data-bytes  delta
HelloSize      3.66k ± 0%      3.66k ± 0%    ~     (all samples are equal)
CmdGoSize       134k ± 0%       134k ± 0%    ~     (all samples are equal)

name       old bss-bytes   new bss-bytes   delta
HelloSize       126k ± 0%       126k ± 0%    ~     (all samples are equal)
CmdGoSize       149k ± 0%       149k ± 0%    ~     (all samples are equal)

name       old exe-bytes   new exe-bytes   delta
HelloSize       947k ± 0%       946k ± 0%  -0.01%        (p=0.000 n=10+10)
CmdGoSize      9.92M ± 0%      9.91M ± 0%  -0.06%        (p=0.000 n=10+10)

Change-Id: Ie74bdff46fd602db41bb457333d3a762a0c3dc4d
Reviewed-on: https://go-review.googlesource.com/20517
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
2016-04-01 09:37:58 +00:00