1
0
mirror of https://github.com/golang/go synced 2024-11-14 08:00:22 -07:00
Commit Graph

3 Commits

Author SHA1 Message Date
Cherry Zhang
a92a77c56f cmd/internal/obj/arm64: fix handling of unaligned offset between 256 and 504
C_PPAUTO was matching offsets that is a multiple 8. But this
condition is dropped in CL 55610, causing unaligned offset
between 256 and 504 mistakenly matched to some classes, e.g.
C_UAUTO8K. This CL restores this condition, also fixes an
error that C_PPAUTO shouldn't match C_PSAUTO, because the
latter is not guaranteed to be multiple of 8. C_PPAUTO_8 is
unnecessary, removed.

Fixes #21992.

Change-Id: I75d5a0e5f5dc3dae335721fbec1bbcd4a3b862f2
Reviewed-on: https://go-review.googlesource.com/65730
Reviewed-by: David Chase <drchase@google.com>
2017-10-05 22:28:17 +00:00
Cherry Zhang
6464e5dc4b cmd/compile: do not fold offset into load/store for args on ARM64
Args may be not at 8-byte aligned offset to SP. When the stack
frame is large, folding the offset of args may cause large
unaligned offsets that does not fit in a machine instruction on
ARM64. Therefore disable folding offsets for args.

This has small performance impact (see below). A better fix would
be letting the assembler backend fix up the offset by loading it
into a register if it doesn't fit into an instruction. And the
compiler can simply generate large load/stores with offset. Since
in most of the cases the offset is aligned or the stack frame is
small, it can fit in an instruction and no fixup is needed. But
this is too complicated for Go 1.8.

name                     old time/op    new time/op    delta
BinaryTree17-8              8.30s ± 0%     8.31s ± 0%    ~     (p=0.579 n=10+10)
Fannkuch11-8                6.14s ± 0%     6.18s ± 0%  +0.53%  (p=0.000 n=9+10)
FmtFprintfEmpty-8           117ns ± 0%     117ns ± 0%    ~     (all equal)
FmtFprintfString-8          196ns ± 0%     197ns ± 0%  +0.72%  (p=0.000 n=10+10)
FmtFprintfInt-8             204ns ± 0%     205ns ± 0%  +0.49%  (p=0.000 n=9+10)
FmtFprintfIntInt-8          302ns ± 0%     307ns ± 1%  +1.46%  (p=0.000 n=10+10)
FmtFprintfPrefixedInt-8     329ns ± 2%     326ns ± 0%    ~     (p=0.083 n=10+10)
FmtFprintfFloat-8           540ns ± 0%     542ns ± 0%  +0.46%  (p=0.000 n=8+7)
FmtManyArgs-8              1.20µs ± 1%    1.19µs ± 1%  -1.02%  (p=0.000 n=10+10)
GobDecode-8                17.3ms ± 1%    17.8ms ± 0%  +2.75%  (p=0.000 n=10+7)
GobEncode-8                15.3ms ± 1%    15.4ms ± 0%  +0.57%  (p=0.004 n=9+10)
Gzip-8                      789ms ± 0%     803ms ± 0%  +1.78%  (p=0.000 n=9+10)
Gunzip-8                    128ms ± 0%     130ms ± 0%  +1.73%  (p=0.000 n=10+9)
HTTPClientServer-8          202µs ± 6%     201µs ±10%    ~     (p=0.739 n=10+10)
JSONEncode-8               42.0ms ± 0%    42.1ms ± 0%  +0.19%  (p=0.028 n=10+9)
JSONDecode-8                159ms ± 0%     161ms ± 0%  +1.05%  (p=0.000 n=9+10)
Mandelbrot200-8            10.1ms ± 0%    10.1ms ± 0%  -0.07%  (p=0.000 n=10+9)
GoParse-8                  8.46ms ± 1%    8.61ms ± 1%  +1.77%  (p=0.000 n=10+10)
RegexpMatchEasy0_32-8       227ns ± 1%     226ns ± 0%  -0.35%  (p=0.001 n=10+9)
RegexpMatchEasy0_1K-8      1.63µs ± 0%    1.63µs ± 0%  -0.13%  (p=0.000 n=10+9)
RegexpMatchEasy1_32-8       250ns ± 0%     249ns ± 0%  -0.40%  (p=0.001 n=8+9)
RegexpMatchEasy1_1K-8      2.07µs ± 0%    2.08µs ± 0%  +0.05%  (p=0.027 n=9+9)
RegexpMatchMedium_32-8      350ns ± 0%     350ns ± 0%    ~     (p=0.412 n=9+8)
RegexpMatchMedium_1K-8      104µs ± 0%     104µs ± 0%  +0.31%  (p=0.000 n=10+7)
RegexpMatchHard_32-8       5.82µs ± 0%    5.82µs ± 0%    ~     (p=0.937 n=9+9)
RegexpMatchHard_1K-8        176µs ± 0%     176µs ± 0%  +0.03%  (p=0.000 n=9+8)
Revcomp-8                   1.36s ± 1%     1.37s ± 1%    ~     (p=0.218 n=10+10)
Template-8                  151ms ± 1%     156ms ± 1%  +3.21%  (p=0.000 n=10+10)
TimeParse-8                 737ns ± 0%     758ns ± 2%  +2.74%  (p=0.000 n=10+10)
TimeFormat-8                801ns ± 2%     789ns ± 1%  -1.51%  (p=0.000 n=10+10)
[Geo mean]                  142µs          143µs       +0.50%

Fixes #19137.

Change-Id: Ib8a21ea98c0ffb2d282a586535b213cc163e1b67
Reviewed-on: https://go-review.googlesource.com/37251
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-02-21 19:39:08 +00:00
Cherry Zhang
3557d54609 cmd/compile: check both syms when folding address into load/store on ARM64
The rules for folding addresses into load/stores checks sym1 is
not on stack (because the stack offset is not known at that point).
But sym1 could be nil, which invalidates the check. Check merged
sym instead.

Fixes #19137.

Change-Id: I8574da22ced1216bb5850403d8f08ec60a8d1005
Reviewed-on: https://go-review.googlesource.com/37145
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-02-17 21:23:24 +00:00