mirror of https://github.com/golang/go synced 2024-11-23 15:00:03 -07:00
go/src/internal
erifan01 de28555c0b internal/bytealg: optimize Equal on arm64
Currently the 16-byte loop chunk16_loop is implemented with NEON instructions LD1, VMOV and VCMEQ.
Implementing this loop with the scalar instructions LDP and CMP instead reduces the number of clock cycles.
For strings whose length is between 4 and 15 bytes, load the last 8 or 4 bytes at a time to reduce
the number of comparisons.
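
The idea can be sketched in portable Go. This is only an illustration, not the arm64 assembly from this change: equalSketch, load64 and load32 are hypothetical names, and the encoding/binary loads stand in for the LDP and scalar load instructions. The 16-byte loop compares two 64-bit words per side, and the 4 to 15 byte tail compares the first and the last 8 (or 4) bytes, letting the two windows overlap.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// load64 and load32 are hypothetical helpers standing in for scalar loads.
func load64(p []byte) uint64 { return binary.LittleEndian.Uint64(p) }
func load32(p []byte) uint32 { return binary.LittleEndian.Uint32(p) }

// equalSketch reports whether a and b hold the same bytes.
// It is an illustrative sketch, not the runtime's implementation.
func equalSketch(a, b []byte) bool {
	if len(a) != len(b) {
		return false
	}
	// 16-byte chunks: two 64-bit loads per side, compared as scalars.
	for len(a) >= 16 {
		if load64(a) != load64(b) || load64(a[8:]) != load64(b[8:]) {
			return false
		}
		a, b = a[16:], b[16:]
	}
	n := len(a) // fewer than 16 bytes remain
	switch {
	case n >= 8:
		// First 8 and last 8 bytes; the windows overlap when n < 16.
		return load64(a) == load64(b) && load64(a[n-8:]) == load64(b[n-8:])
	case n >= 4:
		// First 4 and last 4 bytes; the windows overlap when n < 8.
		return load32(a) == load32(b) && load32(a[n-4:]) == load32(b[n-4:])
	}
	// 0 to 3 bytes: compare one byte at a time.
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(equalSketch([]byte("internal/bytealg"), []byte("internal/bytealg")))
	fmt.Println(equalSketch([]byte("gopher"), []byte("gophe_")))
}
```

Because the overlapping windows re-check a few bytes rather than branching on every remaining length, a 15-byte input needs two comparisons instead of a chain of 8-, 4-, 2- and 1-byte steps.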

Benchmarks:
name                 old time/op    new time/op    delta
Equal/0-8              5.51ns ± 0%    5.84ns ±14%     ~     (p=0.246 n=7+8)
Equal/1-8              10.5ns ± 0%    10.5ns ± 0%     ~     (all equal)
Equal/6-8              14.0ns ± 0%    12.5ns ± 0%  -10.71%  (p=0.000 n=8+8)
Equal/9-8              13.5ns ± 0%    12.5ns ± 0%   -7.41%  (p=0.000 n=8+8)
Equal/15-8             15.5ns ± 0%    12.5ns ± 0%  -19.35%  (p=0.000 n=8+8)
Equal/16-8             14.0ns ± 0%    13.0ns ± 0%   -7.14%  (p=0.000 n=8+8)
Equal/20-8             16.5ns ± 0%    16.0ns ± 0%   -3.03%  (p=0.000 n=8+8)
Equal/32-8             16.5ns ± 0%    15.3ns ± 0%   -7.27%  (p=0.000 n=8+8)
Equal/4K-8              552ns ± 0%     553ns ± 0%     ~     (p=0.315 n=8+8)
Equal/4M-8             1.13ms ±23%    1.20ms ±27%     ~     (p=0.442 n=8+8)
Equal/64M-8            32.9ms ± 0%    32.6ms ± 0%   -1.15%  (p=0.000 n=8+8)
CompareBytesEqual-8    12.0ns ± 0%    12.0ns ± 0%     ~     (all equal)

Change-Id: If317ecdcc98e31883d37fd7d42b113b548c5bd2a
Reviewed-on: https://go-review.googlesource.com/112496
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
2018-09-12 19:56:55 +00:00
..
bytealg internal/bytealg: optimize Equal on arm64 2018-09-12 19:56:55 +00:00
cpu runtime: replace sys.CacheLineSize by corresponding internal/cpu const and vars 2018-08-24 18:28:25 +00:00
nettrace
poll internal/poll: handle zero-byte write in FD.WriteTo 2018-09-09 04:42:15 +00:00
race
singleflight net: don't let cancelation of a DNS lookup affect another lookup 2018-03-16 13:39:38 +00:00
syscall internal/syscall/unix: remove unnecessary empty.s 2018-08-28 14:05:21 +00:00
testenv all: skip unsupported tests for js/wasm 2018-04-30 19:39:18 +00:00
testlog
trace all: fix typos detected by github.com/client9/misspell 2018-08-23 15:54:07 +00:00