mirror of https://github.com/golang/go synced 2024-11-18 17:44:47 -07:00
go/src
Ben Shi aaf73c6d1e cmd/compile: optimize ARM64 with shifted register indexed load/store
ARM64 supports efficient instructions that combine shift, addition, and
load/store, such as "MOVD (R0)(R1<<3), R2" and "MOVWU R6, (R4)(R1<<2)".
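
As an illustration, the kind of Go source that can map onto these addressing modes is plain slice indexing, where the element address is base + index<<log2(element size). The sketch below is hypothetical: the function names are made up for this example, and whether the compiler selects the shifted register indexed form for this exact code, and with which registers, is an assumption that depends on the compiler version and on inlining.

```go
package main

import "fmt"

// loadIndexed reads the i-th 8-byte element of xs. The element address is
// base + i<<3, the shape that a load like "MOVD (R0)(R1<<3), R2" encodes.
// (Hypothetical helper; instruction selection is not guaranteed.)
func loadIndexed(xs []uint64, i int) uint64 {
	return xs[i]
}

// storeIndexed writes a 4-byte element of xs. The element address is
// base + i<<2, the shape of the store form quoted in the commit message.
// (Hypothetical helper; instruction selection is not guaranteed.)
func storeIndexed(xs []uint32, i int, v uint32) {
	xs[i] = v
}

func main() {
	a := []uint64{10, 20, 30}
	b := make([]uint32, 3)
	storeIndexed(b, 1, 7)
	fmt.Println(loadIndexed(a, 2), b[1]) // prints "30 7"
}
```

To see what is actually emitted, the generated assembly can be inspected with "GOARCH=arm64 go build -gcflags=-S".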

This CL optimizes the compiler to emit such efficient instructions. Test data
follows.

1. binary size before/after
binary                 size change
pkg/linux_arm64        +80.1KB
pkg/tool/linux_arm64   +121.9KB
go                     -4.3KB
gofmt                  -64KB

2. go1 benchmark
There is a big improvement for the test case Fannkuch11, and a slight
improvement for some others, excluding noise.

name                     old time/op    new time/op    delta
BinaryTree17-4              43.9s ± 2%     44.0s ± 2%     ~     (p=0.820 n=30+30)
Fannkuch11-4                30.6s ± 2%     24.5s ± 3%  -19.93%  (p=0.000 n=25+30)
FmtFprintfEmpty-4           500ns ± 0%     499ns ± 0%   -0.11%  (p=0.000 n=23+25)
FmtFprintfString-4         1.03µs ± 0%    1.04µs ± 3%     ~     (p=0.065 n=29+30)
FmtFprintfInt-4            1.15µs ± 3%    1.15µs ± 4%   -0.56%  (p=0.000 n=30+30)
FmtFprintfIntInt-4         1.80µs ± 5%    1.82µs ± 0%     ~     (p=0.094 n=30+24)
FmtFprintfPrefixedInt-4    2.17µs ± 5%    2.20µs ± 0%     ~     (p=0.100 n=30+23)
FmtFprintfFloat-4          3.08µs ± 3%    3.09µs ± 4%     ~     (p=0.123 n=30+30)
FmtManyArgs-4              7.41µs ± 4%    7.17µs ± 1%   -3.26%  (p=0.000 n=30+23)
GobDecode-4                93.7ms ± 0%    94.7ms ± 4%     ~     (p=0.685 n=24+30)
GobEncode-4                78.7ms ± 7%    77.1ms ± 0%     ~     (p=0.729 n=30+23)
Gzip-4                      4.01s ± 0%     3.97s ± 5%   -1.11%  (p=0.037 n=24+30)
Gunzip-4                    389ms ± 4%     384ms ± 0%     ~     (p=0.155 n=30+23)
HTTPClientServer-4          536µs ± 1%     537µs ± 1%     ~     (p=0.236 n=30+30)
JSONEncode-4                179ms ± 1%     182ms ± 6%     ~     (p=0.763 n=24+30)
JSONDecode-4                843ms ± 0%     839ms ± 6%   -0.42%  (p=0.003 n=25+30)
Mandelbrot200-4            46.5ms ± 0%    46.5ms ± 0%   +0.02%  (p=0.000 n=26+26)
GoParse-4                  44.3ms ± 6%    43.3ms ± 0%     ~     (p=0.067 n=30+27)
RegexpMatchEasy0_32-4      1.07µs ± 7%    1.07µs ± 4%     ~     (p=0.835 n=30+30)
RegexpMatchEasy0_1K-4      5.51µs ± 0%    5.49µs ± 0%   -0.35%  (p=0.000 n=23+26)
RegexpMatchEasy1_32-4      1.01µs ± 0%    1.02µs ± 4%   +0.96%  (p=0.014 n=24+30)
RegexpMatchEasy1_1K-4      7.43µs ± 0%    7.18µs ± 0%   -3.41%  (p=0.000 n=23+24)
RegexpMatchMedium_32-4     1.78µs ± 0%    1.81µs ± 4%   +1.47%  (p=0.012 n=23+30)
RegexpMatchMedium_1K-4      547µs ± 1%     542µs ± 3%   -0.90%  (p=0.003 n=24+30)
RegexpMatchHard_32-4       30.4µs ± 0%    29.7µs ± 0%   -2.15%  (p=0.000 n=19+23)
RegexpMatchHard_1K-4        913µs ± 0%     915µs ± 6%   +0.25%  (p=0.012 n=24+30)
Revcomp-4                   6.32s ± 1%     6.42s ± 4%     ~     (p=0.342 n=25+30)
Template-4                  868ms ± 6%     878ms ± 6%   +1.15%  (p=0.000 n=30+30)
TimeParse-4                4.57µs ± 4%    4.59µs ± 3%   +0.65%  (p=0.010 n=29+30)
TimeFormat-4               4.51µs ± 0%    4.50µs ± 0%   -0.27%  (p=0.000 n=27+24)
[Geo mean]                  695µs          689µs        -0.92%
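
For reference, the delta column is the relative change from the old mean to the new mean. A minimal sketch of that arithmetic follows, using the Fannkuch11-4 values as printed above. The helper name deltaPercent is hypothetical, and because the table shows rounded means, recomputing other rows from the printed values may differ from the reported delta in the last digits.

```go
package main

import "fmt"

// deltaPercent returns the relative change from oldT to newT, in percent.
// A negative result for time/op means the new build is faster.
// (Hypothetical helper for illustration only.)
func deltaPercent(oldT, newT float64) float64 {
	return (newT - oldT) / oldT * 100
}

func main() {
	// Fannkuch11-4: 30.6s before, 24.5s after (rounded values from the table).
	fmt.Printf("%.2f%%\n", deltaPercent(30.6, 24.5)) // prints "-19.93%"
}
```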

name                     old speed      new speed      delta
GobDecode-4              8.19MB/s ± 0%  8.12MB/s ± 4%     ~     (p=0.680 n=24+30)
GobEncode-4              9.76MB/s ± 7%  9.96MB/s ± 0%     ~     (p=0.616 n=30+23)
Gzip-4                   4.84MB/s ± 0%  4.89MB/s ± 4%   +1.16%  (p=0.030 n=24+30)
Gunzip-4                 49.9MB/s ± 4%  50.6MB/s ± 0%     ~     (p=0.162 n=30+23)
JSONEncode-4             10.9MB/s ± 1%  10.7MB/s ± 6%     ~     (p=0.575 n=24+30)
JSONDecode-4             2.30MB/s ± 0%  2.32MB/s ± 5%   +0.72%  (p=0.003 n=22+30)
GoParse-4                1.31MB/s ± 6%  1.34MB/s ± 0%   +2.26%  (p=0.002 n=30+27)
RegexpMatchEasy0_32-4    30.0MB/s ± 6%  30.0MB/s ± 4%     ~     (p=1.000 n=30+30)
RegexpMatchEasy0_1K-4     186MB/s ± 0%   187MB/s ± 0%   +0.35%  (p=0.000 n=23+26)
RegexpMatchEasy1_32-4    31.8MB/s ± 0%  31.5MB/s ± 4%   -0.92%  (p=0.012 n=25+30)
RegexpMatchEasy1_1K-4     138MB/s ± 0%   143MB/s ± 0%   +3.53%  (p=0.000 n=23+24)
RegexpMatchMedium_32-4    560kB/s ± 0%   553kB/s ± 4%   -1.19%  (p=0.005 n=23+30)
RegexpMatchMedium_1K-4   1.87MB/s ± 0%  1.89MB/s ± 3%   +1.04%  (p=0.002 n=24+30)
RegexpMatchHard_32-4     1.05MB/s ± 0%  1.08MB/s ± 0%   +2.40%  (p=0.000 n=19+23)
RegexpMatchHard_1K-4     1.12MB/s ± 0%  1.12MB/s ± 5%   +0.12%  (p=0.006 n=25+30)
Revcomp-4                40.2MB/s ± 1%  39.6MB/s ± 4%     ~     (p=0.242 n=25+30)
Template-4               2.24MB/s ± 6%  2.21MB/s ± 6%   -1.15%  (p=0.000 n=30+30)
[Geo mean]               7.87MB/s       7.91MB/s        +0.44%

Change-Id: If374cb7abf83537aa0a176f73c0f736f7800db03
Reviewed-on: https://go-review.googlesource.com/108735
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-27 20:02:05 +00:00
archive archive/zip: prevent writing data for a directory 2018-04-26 15:57:06 +00:00
bufio bufio: document ReadFrom/WriteTo calls to underlying methods 2018-03-28 22:21:52 +00:00
builtin builtin: improve docs for make slice 2017-11-18 01:48:52 +00:00
bytes
cmd cmd/compile: optimize ARM64 with shifted register indexed load/store 2018-04-27 20:02:05 +00:00
compress compress/flate: optimize huffSym 2018-04-17 22:37:49 +00:00
container container/heap: fix comments style 2018-04-11 20:11:09 +00:00
context context: avoid defer in the cancelCtx.Err method 2018-04-15 21:35:53 +00:00
crypto crypto/md5: unnecessary conversion 2018-04-24 15:49:43 +00:00
database/sql database/sql: remove unnecessary else conditions 2018-04-19 18:57:52 +00:00
debug debug/elf: add riscv64 relocations 2018-04-18 13:19:31 +00:00
encoding encoding/base64: fix format error 2018-04-25 20:22:16 +00:00
errors
expvar all: use strings.Builder instead of bytes.Buffer where appropriate 2018-03-26 23:05:53 +00:00
flag flag: correct zero values when printing defaults 2018-04-01 20:17:22 +00:00
fmt fmt: make %v doc for compound objects consistent 2018-04-17 23:47:44 +00:00
go go/types: fix format errors 2018-04-25 20:22:06 +00:00
hash
html text/template: copy Decl field when copying PipeNode 2018-04-10 14:26:58 +00:00
image
index/suffixarray
internal os: os: make Stat("*.txt") fail on windows 2018-04-27 10:04:48 +00:00
io io/ioutil: change TempFile prefix to a pattern 2018-04-12 20:00:25 +00:00
log
math math: add a testcase for Mod and Remainder respectively 2018-04-17 03:17:22 +00:00
mime mime: add wasm architecture 2018-04-13 20:20:12 +00:00
net net: add support for splice(2) in (*TCPConn).ReadFrom on Linux 2018-04-24 14:14:56 +00:00
os os: os: make Stat("*.txt") fail on windows 2018-04-27 10:04:48 +00:00
path path/filepath: fix Win32 tests missing 'chcp' 2018-04-26 18:25:15 +00:00
plugin
reflect reflect: define MyBuffer more locally in TestImplicitMapConversion 2018-04-18 12:47:39 +00:00
regexp regexp: use sync.Pool to cache regexp.machine objects 2018-04-03 16:03:19 +00:00
runtime cmd/compile: add softfloat support to mips64{,le} 2018-04-27 14:50:17 +00:00
sort sort: fix typo in comment 2018-04-22 22:32:11 +00:00
strconv strconv: make code formatting more consistent in doc.go 2018-03-19 12:53:16 +00:00
strings strings: clarify Replacer's replacement order 2018-04-26 15:11:58 +00:00
sync sync: hide test of misuse of Cond from vet 2018-04-25 02:49:46 +00:00
syscall syscall: 32-bit MIPS splice system call returns int, not int64 2018-04-26 17:08:53 +00:00
testing testing: fix typo mistake 2018-04-27 13:29:12 +00:00
text text/template: improve comment example in doc 2018-04-19 09:21:51 +00:00
time time: increase test coverage for Time.Sub 2018-04-16 21:14:40 +00:00
unicode
unsafe
vendor/golang_org/x net/http: omit forbidden Trailer headers from response 2018-04-16 17:44:41 +00:00
all.bash
all.bat
all.rc
androidtest.bash
bootstrap.bash
buildall.bash
clean.bash
clean.bat
clean.rc
cmp.bash
iostest.bash
make.bash
make.bat
Make.dist
make.rc
naclmake.bash nacl*.bash: pass flags to make.bash 2018-02-14 17:09:31 +00:00
nacltest.bash
race.bash
race.bat
run.bash src/run.bash: remove some trailing whitespace 2018-04-01 16:12:47 +00:00
run.bat
run.rc