1
0
mirror of https://github.com/golang/go synced 2024-11-23 15:30:05 -07:00
The Go programming language
Go to file
Ben Shi ebb77aa867 cmd/compile/internal/ssa: optimize arm64 with FNMULS/FNMULD
FNMULS&FNMULD are efficient arm64 instructions, which can be used
to improve FP performance. This CL use them to optimize pairs of neg-mul
operations.

Here are benchmark test results on Raspberry Pi 3 with ArchLinux.

1. A special test case gets about 15% improvement.
(https://github.com/benshi001/ugo1/blob/master/fpmul_test.go)
FPMul-4                     485µs ± 0%     410µs ± 0%  -15.49%  (p=0.000 n=26+23)

2. There is little regression in the go1 benchmark (excluding noise).
name                     old time/op    new time/op    delta
BinaryTree17-4              42.0s ± 3%     42.1s ± 2%    ~     (p=0.542 n=39+40)
Fannkuch11-4                33.3s ± 3%     32.9s ± 1%    ~     (p=0.200 n=40+32)
FmtFprintfEmpty-4           534ns ± 0%     534ns ± 0%    ~     (all equal)
FmtFprintfString-4         1.09µs ± 1%    1.09µs ± 0%    ~     (p=0.950 n=32+32)
FmtFprintfInt-4            1.14µs ± 0%    1.14µs ± 1%    ~     (p=0.571 n=32+31)
FmtFprintfIntInt-4         1.79µs ± 3%    1.76µs ± 0%  -1.42%  (p=0.004 n=40+34)
FmtFprintfPrefixedInt-4    2.17µs ± 0%    2.17µs ± 0%    ~     (p=0.073 n=31+34)
FmtFprintfFloat-4          3.33µs ± 3%    3.28µs ± 0%  -1.46%  (p=0.001 n=40+34)
FmtManyArgs-4              7.28µs ± 6%    7.19µs ± 0%    ~     (p=0.641 n=40+33)
GobDecode-4                96.5ms ± 4%    96.5ms ± 9%    ~     (p=0.214 n=40+40)
GobEncode-4                79.5ms ± 0%    80.7ms ± 4%  +1.51%  (p=0.000 n=34+40)
Gzip-4                      4.53s ± 4%     4.56s ± 4%  +0.60%  (p=0.000 n=40+40)
Gunzip-4                    451ms ± 3%     442ms ± 0%  -1.93%  (p=0.000 n=40+32)
HTTPClientServer-4          530µs ± 1%     535µs ± 1%  +0.88%  (p=0.000 n=39+39)
JSONEncode-4                214ms ± 4%     211ms ± 0%    ~     (p=0.059 n=40+31)
JSONDecode-4                865ms ± 5%     864ms ± 4%  -0.06%  (p=0.003 n=40+40)
Mandelbrot200-4            52.0ms ± 3%    52.1ms ± 3%    ~     (p=0.556 n=40+40)
GoParse-4                  43.1ms ± 8%    42.1ms ± 0%    ~     (p=0.083 n=40+33)
RegexpMatchEasy0_32-4      1.02µs ± 3%    1.02µs ± 4%  +0.06%  (p=0.020 n=40+40)
RegexpMatchEasy0_1K-4      3.90µs ± 0%    3.96µs ± 3%  +1.58%  (p=0.000 n=31+40)
RegexpMatchEasy1_32-4       967ns ± 4%     981ns ± 3%  +1.40%  (p=0.000 n=40+40)
RegexpMatchEasy1_1K-4      6.41µs ± 4%    6.43µs ± 3%    ~     (p=0.386 n=40+40)
RegexpMatchMedium_32-4     1.76µs ± 3%    1.78µs ± 3%  +1.08%  (p=0.000 n=40+40)
RegexpMatchMedium_1K-4      561µs ± 0%     562µs ± 0%  +0.09%  (p=0.003 n=34+31)
RegexpMatchHard_32-4       31.5µs ± 2%    31.1µs ± 4%  -1.17%  (p=0.000 n=30+40)
RegexpMatchHard_1K-4        960µs ± 3%     950µs ± 4%  -1.02%  (p=0.016 n=40+40)
Revcomp-4                   7.79s ± 7%     7.79s ± 4%    ~     (p=0.859 n=40+40)
Template-4                  889ms ± 6%     872ms ± 3%  -1.86%  (p=0.025 n=40+31)
TimeParse-4                4.80µs ± 0%    4.89µs ± 3%  +1.71%  (p=0.001 n=31+40)
TimeFormat-4               4.70µs ± 1%    4.78µs ± 3%  +1.57%  (p=0.000 n=33+40)
[Geo mean]                  710µs          709µs       -0.13%

name                     old speed      new speed      delta
GobDecode-4              7.96MB/s ± 4%  7.96MB/s ± 9%    ~     (p=0.174 n=40+40)
GobEncode-4              9.65MB/s ± 0%  9.51MB/s ± 4%  -1.45%  (p=0.000 n=34+40)
Gzip-4                   4.29MB/s ± 4%  4.26MB/s ± 4%  -0.59%  (p=0.000 n=40+40)
Gunzip-4                 43.0MB/s ± 3%  43.9MB/s ± 0%  +1.90%  (p=0.000 n=40+32)
JSONEncode-4             9.09MB/s ± 4%  9.22MB/s ± 0%    ~     (p=0.429 n=40+31)
JSONDecode-4             2.25MB/s ± 5%  2.25MB/s ± 4%    ~     (p=0.278 n=40+40)
GoParse-4                1.35MB/s ± 7%  1.37MB/s ± 0%    ~     (p=0.071 n=40+25)
RegexpMatchEasy0_32-4    31.5MB/s ± 3%  31.5MB/s ± 4%  -0.08%  (p=0.018 n=40+40)
RegexpMatchEasy0_1K-4     263MB/s ± 0%   259MB/s ± 3%  -1.51%  (p=0.000 n=31+40)
RegexpMatchEasy1_32-4    33.1MB/s ± 4%  32.6MB/s ± 3%  -1.38%  (p=0.000 n=40+40)
RegexpMatchEasy1_1K-4     160MB/s ± 4%   159MB/s ± 3%    ~     (p=0.364 n=40+40)
RegexpMatchMedium_32-4    565kB/s ± 3%   562kB/s ± 2%    ~     (p=0.208 n=40+40)
RegexpMatchMedium_1K-4   1.82MB/s ± 0%  1.82MB/s ± 0%  -0.27%  (p=0.000 n=34+31)
RegexpMatchHard_32-4     1.02MB/s ± 3%  1.03MB/s ± 4%  +1.04%  (p=0.000 n=32+40)
RegexpMatchHard_1K-4     1.07MB/s ± 4%  1.08MB/s ± 4%  +0.94%  (p=0.003 n=40+40)
Revcomp-4                32.6MB/s ± 7%  32.6MB/s ± 4%    ~     (p=0.965 n=40+40)
Template-4               2.18MB/s ± 6%  2.22MB/s ± 3%  +1.83%  (p=0.020 n=40+31)
[Geo mean]               7.77MB/s       7.78MB/s       +0.16%

3. There is little change in the compilecmp benchmark (excluding noise).
name        old time/op       new time/op       delta
Template          2.37s ± 3%        2.35s ± 4%    ~     (p=0.529 n=10+10)
Unicode           1.38s ± 8%        1.36s ± 5%    ~     (p=0.247 n=10+10)
GoTypes           8.10s ± 2%        8.10s ± 2%    ~     (p=0.971 n=10+10)
Compiler          40.5s ± 4%        40.8s ± 1%    ~     (p=0.529 n=10+10)
SSA                115s ± 2%         115s ± 3%    ~     (p=0.684 n=10+10)
Flate             1.45s ± 5%        1.46s ± 3%    ~     (p=0.796 n=10+10)
GoParser          1.86s ± 4%        1.84s ± 2%    ~     (p=0.095 n=9+10)
Reflect           5.11s ± 2%        5.13s ± 2%    ~     (p=0.315 n=10+10)
Tar               2.22s ± 3%        2.23s ± 1%    ~     (p=0.299 n=9+7)
XML               2.72s ± 3%        2.72s ± 3%    ~     (p=0.912 n=10+10)
[Geo mean]        5.03s             5.02s       -0.21%

name        old user-time/op  new user-time/op  delta
Template          2.92s ± 2%        2.89s ± 1%    ~     (p=0.247 n=10+10)
Unicode           1.71s ± 5%        1.69s ± 4%    ~     (p=0.393 n=10+10)
GoTypes           9.78s ± 2%        9.76s ± 2%    ~     (p=0.631 n=10+10)
Compiler          49.1s ± 2%        49.1s ± 1%    ~     (p=0.796 n=10+10)
SSA                144s ± 1%         144s ± 2%    ~     (p=0.796 n=10+10)
Flate             1.74s ± 2%        1.73s ± 3%    ~     (p=0.842 n=10+9)
GoParser          2.23s ± 3%        2.25s ± 2%    ~     (p=0.143 n=10+10)
Reflect           5.93s ± 3%        5.98s ± 2%    ~     (p=0.211 n=10+9)
Tar               2.65s ± 2%        2.69s ± 3%  +1.51%  (p=0.010 n=9+10)
XML               3.25s ± 2%        3.21s ± 1%  -1.24%  (p=0.035 n=10+9)
[Geo mean]        6.07s             6.07s       -0.08%

name        old text-bytes    new text-bytes    delta
HelloSize         641kB ± 0%        641kB ± 0%    ~     (all equal)

name        old data-bytes    new data-bytes    delta
HelloSize        9.46kB ± 0%       9.46kB ± 0%    ~     (all equal)

name        old bss-bytes     new bss-bytes     delta
HelloSize         125kB ± 0%        125kB ± 0%    ~     (all equal)

name        old exe-bytes     new exe-bytes     delta
HelloSize        1.24MB ± 0%       1.24MB ± 0%    ~     (all equal)

Change-Id: Id095d998c380eef929755124084df02446a6b7c1
Reviewed-on: https://go-review.googlesource.com/92555
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-02-14 15:22:05 +00:00
.github all: restore changes from faulty merge/revert 2018-02-12 20:13:59 +00:00
api syscall: support Getwd on all BSDs 2018-02-13 15:41:19 +00:00
doc doc/articles/wiki: highlight the use of _ warning 2018-02-14 04:54:37 +00:00
lib/time all: use HTTPS for iana.org links 2018-02-13 18:36:48 +00:00
misc misc/cgo/testcshared: increase sleep in TestUnexportedSymbols 2018-02-14 15:03:29 +00:00
src cmd/compile/internal/ssa: optimize arm64 with FNMULS/FNMULD 2018-02-14 15:22:05 +00:00
test cmd/compile: fix constant folding of right shifts 2018-02-14 00:03:36 +00:00
.gitattributes .gitattributes: prevent all magic line ending changes 2014-12-12 23:14:54 +00:00
.gitignore .gitignore: ignore src/cmd/dist/dist 2017-10-28 21:55:49 +00:00
AUTHORS A+C: update late Go 1.10 contributors 2018-01-06 04:52:00 +00:00
CONTRIBUTING.md all: restore changes from faulty merge/revert 2018-02-12 20:13:59 +00:00
CONTRIBUTORS A+C: update late Go 1.10 contributors 2018-01-06 04:52:00 +00:00
favicon.ico website: recreate 16px and 32px favicon 2016-08-25 15:43:32 +00:00
LICENSE doc: revert copyright date to 2009 2016-06-01 22:40:04 +00:00
PATENTS
README.md all: restore changes from faulty merge/revert 2018-02-12 20:13:59 +00:00
robots.txt

The Go Programming Language

Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.

Gopher image Gopher image by Renee French, licensed under Creative Commons 3.0 Attributions license.

Our canonical Git repository is located at https://go.googlesource.com/go. There is a mirror of the repository at https://github.com/golang/go.

Unless otherwise noted, the Go source files are distributed under the BSD-style license found in the LICENSE file.

Download and Install

Binary Distributions

Official binary distributions are available at https://golang.org/dl/.

After downloading a binary release, visit https://golang.org/doc/install or load doc/install.html in your web browser for installation instructions.

Install From Source

If a binary distribution is not available for your combination of operating system and architecture, visit https://golang.org/doc/install/source or load doc/install-source.html in your web browser for source installation instructions.

Contributing

Go is the work of hundreds of contributors. We appreciate your help!

To contribute, please read the contribution guidelines: https://golang.org/doc/contribute.html

Note that the Go project uses the issue tracker for bug reports and proposals only. See https://golang.org/wiki/Questions for a list of places to ask questions about the Go language.