qbit/go - go - Tape:neT

qbit/go

mirror of https://github.com/golang/go synced 2024-11-05 16:06:10 -07:00

Author	SHA1	Message	Date
David Chase	d71f36b5aa	cmd/compile: check loop rescheduling with stack bound, not counter After benchmarking with a compiler modified to have better spill location, it became clear that this method of checking was actually faster on (at least) two different architectures (ppc64 and amd64) and it also provides more timely interruption of loops. This change adds a modified FOR loop node "FORUNTIL" that checks after executing the loop body instead of before (i.e., always at least once). This ensures that a pointer past the end of a slice or array is not made visible to the garbage collector. Without the rescheduling checks inserted, the restructured loop from this change apparently provides a 1% geomean improvement on PPC64 running the go1 benchmarks; the improvement on AMD64 is only 0.12%. Inserting the rescheduling check exposed some peculiar bug with the ssa test code for s390x; this was updated based on initial code actually generated for GOARCH=s390x to use appropriate OpArg, OpAddr, and OpVarDef. NaCl is disabled in testing. Change-Id: Ieafaa9a61d2a583ad00968110ef3e7a441abca50 Reviewed-on: https://go-review.googlesource.com/36206 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-03-08 18:52:12 +00:00
philhofer	379567aad1	cmd/compile/ssa: more aggressive constant folding Add rewrite rules that canonicalize the location of constants in expressions, and fold conststants that appear in operations that can be trivially reassociated. After this change, the compiler constant-folds expressions like "4 + x - 1" and "4 & x & 1" Benchmarks affected on darwin/amd64: name old time/op new time/op delta FmtFprintfInt-8 82.1ns ± 1% 81.7ns ± 1% -0.46% (p=0.023 n=8+9) FmtFprintfIntInt-8 122ns ± 2% 120ns ± 2% -1.48% (p=0.047 n=10+10) FmtManyArgs-8 493ns ± 0% 486ns ± 1% -1.37% (p=0.000 n=8+10) Gzip-8 230ms ± 0% 229ms ± 1% -0.46% (p=0.001 n=10+9) HTTPClientServer-8 74.5µs ± 1% 73.7µs ± 1% -1.11% (p=0.000 n=10+10) JSONDecode-8 51.7ms ± 0% 51.9ms ± 1% +0.42% (p=0.017 n=10+9) RegexpMatchEasy0_32-8 82.6ns ± 1% 81.7ns ± 0% -1.02% (p=0.000 n=9+8) RegexpMatchMedium_32-8 121ns ± 1% 120ns ± 1% -1.48% (p=0.001 n=10+10) Revcomp-8 426ms ± 1% 400ms ± 1% -6.16% (p=0.000 n=10+10) TimeFormat-8 330ns ± 1% 327ns ± 0% -0.82% (p=0.000 n=10+10) name old speed new speed delta Gzip-8 84.4MB/s ± 0% 84.8MB/s ± 1% +0.47% (p=0.001 n=10+9) JSONDecode-8 37.6MB/s ± 0% 37.4MB/s ± 1% -0.42% (p=0.016 n=10+9) RegexpMatchEasy0_32-8 387MB/s ± 1% 392MB/s ± 0% +1.06% (p=0.000 n=9+8) RegexpMatchMedium_32-8 8.21MB/s ± 1% 8.34MB/s ± 1% +1.58% (p=0.000 n=10+9) Revcomp-8 597MB/s ± 1% 636MB/s ± 1% +6.57% (p=0.000 n=10+10) Change-Id: Ie37ff91605b76a984a8400dfd1e34f50bf61c864 Reviewed-on: https://go-review.googlesource.com/37290 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-28 20:25:33 +00:00
Alexandru Moșoi	6c6089b3fd	cmd/compile: bce when max and limit are consts Removes 49 more bound checks in make.bash. For example: var a[100]int for i := 0; i < 50; i++ { use a[i+25] } Change-Id: I85e0130ee5d07f0ece9b17044bba1a2047414ce7 Reviewed-on: https://go-review.googlesource.com/21379 Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-11 16:01:22 +00:00
Alexandru Moșoi	b91cc53033	cmd/compile/internal/ssa: BCE for induction variables There are 5293 loop in the main go repository. A survey of the top most common for loops: 18 for __k__ := 0; i < len(sa.Addr); i++ { 19 for __k__ := 0; ; i++ { 19 for __k__ := 0; i < 16; i++ { 25 for __k__ := 0; i < length; i++ { 30 for __k__ := 0; i < 8; i++ { 49 for __k__ := 0; i < len(s); i++ { 67 for __k__ := 0; i < n; i++ { 376 for __k__ := range __slice__ { 685 for __k__, __v__ := range __slice__ { 2074 for __, __v__ := range __slice__ { The algorithm to find induction variables handles all cases with an upper limit. It currently doesn't find related induction variables such as c * ind or c + ind. 842 out of 22954 bound checks are removed for src/make.bash. 1957 out of 42952 bounds checks are removed for src/all.bash. Things to do in follow-up CLs: * Find the associated pointer for `for _, v := range a {}` * Drop the NilChecks on the pointer. * Replace the implicit induction variable by a loop over the pointer Generated garbage can be reduced if we share the sdom between passes. % benchstat old.txt new.txt name old time/op new time/op delta Template 337ms ± 3% 333ms ± 3% ~ (p=0.258 n=9+9) GoTypes 1.11s ± 2% 1.10s ± 2% ~ (p=0.912 n=10+10) Compiler 5.25s ± 1% 5.29s ± 2% ~ (p=0.077 n=9+9) MakeBash 33.5s ± 1% 34.1s ± 2% +1.85% (p=0.011 n=9+9) name old alloc/op new alloc/op delta Template 63.6MB ± 0% 63.9MB ± 0% +0.52% (p=0.000 n=10+9) GoTypes 218MB ± 0% 219MB ± 0% +0.59% (p=0.000 n=10+9) Compiler 978MB ± 0% 985MB ± 0% +0.69% (p=0.000 n=10+10) name old allocs/op new allocs/op delta Template 582k ± 0% 583k ± 0% +0.10% (p=0.000 n=10+10) GoTypes 1.78M ± 0% 1.78M ± 0% +0.12% (p=0.000 n=10+10) Compiler 7.68M ± 0% 7.69M ± 0% +0.05% (p=0.000 n=10+10) name old text-bytes new text-bytes delta HelloSize 581k ± 0% 581k ± 0% -0.08% (p=0.000 n=10+10) CmdGoSize 6.40M ± 0% 6.39M ± 0% -0.08% (p=0.000 n=10+10) name old data-bytes new data-bytes delta HelloSize 3.66k ± 0% 3.66k ± 0% ~ (all samples are equal) CmdGoSize 134k ± 0% 134k ± 0% ~ (all samples are equal) name old bss-bytes new bss-bytes delta HelloSize 126k ± 0% 126k ± 0% ~ (all samples are equal) CmdGoSize 149k ± 0% 149k ± 0% ~ (all samples are equal) name old exe-bytes new exe-bytes delta HelloSize 947k ± 0% 946k ± 0% -0.01% (p=0.000 n=10+10) CmdGoSize 9.92M ± 0% 9.91M ± 0% -0.06% (p=0.000 n=10+10) Change-Id: Ie74bdff46fd602db41bb457333d3a762a0c3dc4d Reviewed-on: https://go-review.googlesource.com/20517 Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>	2016-04-01 09:37:58 +00:00

4 Commits