Alberto Donizetti
644b2dafc2
test/codegen: add copyright headers to new codegen files
...
Change-Id: I9fe6572d1043ef9ee09c0925059ded554ad24c6b
Reviewed-on: https://go-review.googlesource.com/98215
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-02 20:13:13 +00:00
Michael Fraenkel
5b071bfa88
cmd/compile: convert type during finishcompare
...
When recursively calling walkexpr, r.Type is still the untyped value.
It then sometimes recursively calls finishcompare, which complains that
you can't compare the resulting expression to that untyped value.
Updates #23834.
Change-Id: I6b7acd3970ceaff8da9216bfa0ae24aca5dee828
Reviewed-on: https://go-review.googlesource.com/97856
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-03-02 19:48:23 +00:00
Than McIntosh
9b95611e38
cmd/compile: add DWARF register mappings for ARM64.
...
Add DWARF register mappings for ARM64, so that the architecture becomes
usable with "-dwarflocationlists". [NB: I've plugged in a set of
numbers from the doc, but this will require additional manual testing.]
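A register mapping of this kind can be sketched as below. The numbering follows the published AArch64 DWARF ABI (X0-X30 are 0-30, SP is 31, V0-V31 are 64-95); the type and function names are hypothetical and are not the compiler's actual tables.

```go
package main

import "fmt"

// regClass distinguishes the register files in this sketch (hypothetical type).
type regClass int

const (
	general regClass = iota // X0..X30
	vector                  // V0..V31
)

// dwarfSP is the DWARF number of the stack pointer on AArch64.
const dwarfSP = 31

// dwarfRegNum returns the DWARF register number for register n of class c.
// ok is false when n is out of range for that class.
func dwarfRegNum(c regClass, n int) (num int, ok bool) {
	switch {
	case c == general && n >= 0 && n <= 30:
		return n, true
	case c == vector && n >= 0 && n <= 31:
		return 64 + n, true
	}
	return 0, false
}

func main() {
	fmt.Println(dwarfRegNum(general, 29))
	fmt.Println(dwarfRegNum(vector, 0))
	fmt.Println(dwarfSP)
}
```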
Change-Id: Id9aa63857bc8b4f5c825f49274101cf372e9e856
Reviewed-on: https://go-review.googlesource.com/82515
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-03-02 19:40:29 +00:00
Alessandro Arzilli
eca41af012
cmd/link: fix up debug_range for dsymutil (revert CL 72371)
...
Dsymutil, a utility used on macOS when externally linking executables,
does not support base address selector entries in debug_ranges.
CL 73271 worked around this problem by removing base address selectors
and emitting CU-relative relocations for each list entry.
This commit, as an optimization, reintroduces the base address
selectors and changes the linker to remove them again, but only when it
knows that it will have to invoke the external linker on macOS.
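As a rough illustration of what removing base address selectors involves, the sketch below rewrites a range list so that every entry is relative to the compile unit's base. The types and the function name are hypothetical, and plain integers stand in for the relocations the linker would actually emit.

```go
package main

import "fmt"

// rangeEntry is one (begin, end) pair from a .debug_ranges list.
// Hypothetical type for illustration; not the linker's representation.
type rangeEntry struct{ begin, end uint64 }

// baseSelector marks a base address selection entry: begin holds the
// largest representable address and end carries the new base.
const baseSelector = ^uint64(0)

// stripBaseSelectors rewrites list so that every entry is an offset from
// cuBase (the compile unit's low PC) and no selector entries remain.
func stripBaseSelectors(list []rangeEntry, cuBase uint64) []rangeEntry {
	base := cuBase
	out := make([]rangeEntry, 0, len(list))
	for _, e := range list {
		if e.begin == baseSelector {
			base = e.end // later entries are offsets from this base
			continue
		}
		if e.begin == 0 && e.end == 0 {
			out = append(out, e) // end-of-list terminator is kept as-is
			break
		}
		delta := base - cuBase
		out = append(out, rangeEntry{e.begin + delta, e.end + delta})
	}
	return out
}

func main() {
	in := []rangeEntry{
		{baseSelector, 0x2000}, // select base 0x2000
		{0x10, 0x20},           // covers [0x2010, 0x2020)
		{0, 0},                 // end of list
	}
	for _, e := range stripBaseSelectors(in, 0x1000) {
		fmt.Printf("%#x %#x\n", e.begin, e.end)
	}
}
```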
Compilecmp comparing master with a branch that has scope tracking
always enabled:
completed 15 of 15, estimated time remaining 0s (eta 2:43PM)
name old time/op new time/op delta
Template 272ms ± 8% 257ms ± 5% -5.33% (p=0.000 n=15+14)
Unicode 124ms ± 7% 122ms ± 5% ~ (p=0.210 n=14+14)
GoTypes 873ms ± 3% 870ms ± 5% ~ (p=0.856 n=15+13)
Compiler 4.49s ± 2% 4.49s ± 5% ~ (p=0.982 n=14+14)
SSA 11.8s ± 4% 11.8s ± 3% ~ (p=0.653 n=15+15)
Flate 163ms ± 6% 164ms ± 9% ~ (p=0.914 n=14+15)
GoParser 203ms ± 6% 202ms ±10% ~ (p=0.571 n=14+14)
Reflect 547ms ± 7% 542ms ± 4% ~ (p=0.914 n=15+14)
Tar 244ms ± 7% 237ms ± 3% -2.80% (p=0.002 n=14+13)
XML 289ms ± 6% 289ms ± 5% ~ (p=0.839 n=14+14)
[Geo mean] 537ms 531ms -1.10%
name old user-time/op new user-time/op delta
Template 360ms ± 4% 341ms ± 7% -5.16% (p=0.000 n=14+14)
Unicode 189ms ±11% 190ms ± 8% ~ (p=0.844 n=15+15)
GoTypes 1.13s ± 4% 1.14s ± 7% ~ (p=0.582 n=15+14)
Compiler 5.34s ± 2% 5.40s ± 4% +1.19% (p=0.036 n=11+13)
SSA 14.7s ± 2% 14.7s ± 3% ~ (p=0.602 n=15+15)
Flate 211ms ± 7% 214ms ± 8% ~ (p=0.252 n=14+14)
GoParser 267ms ±12% 266ms ± 2% ~ (p=0.837 n=15+11)
Reflect 706ms ± 4% 701ms ± 3% ~ (p=0.213 n=14+12)
Tar 331ms ± 9% 320ms ± 5% -3.30% (p=0.025 n=15+14)
XML 378ms ± 4% 373ms ± 6% ~ (p=0.253 n=14+15)
[Geo mean] 704ms 700ms -0.58%
name old alloc/op new alloc/op delta
Template 38.0MB ± 0% 38.4MB ± 0% +1.12% (p=0.000 n=15+15)
Unicode 28.8MB ± 0% 28.8MB ± 0% +0.17% (p=0.000 n=15+15)
GoTypes 112MB ± 0% 114MB ± 0% +1.47% (p=0.000 n=15+15)
Compiler 465MB ± 0% 473MB ± 0% +1.71% (p=0.000 n=15+15)
SSA 1.48GB ± 0% 1.53GB ± 0% +3.07% (p=0.000 n=15+15)
Flate 24.3MB ± 0% 24.7MB ± 0% +1.67% (p=0.000 n=15+15)
GoParser 30.7MB ± 0% 31.0MB ± 0% +1.15% (p=0.000 n=12+15)
Reflect 76.3MB ± 0% 77.1MB ± 0% +0.97% (p=0.000 n=15+15)
Tar 39.2MB ± 0% 39.6MB ± 0% +0.91% (p=0.000 n=15+15)
XML 41.5MB ± 0% 42.0MB ± 0% +1.29% (p=0.000 n=15+15)
[Geo mean] 77.5MB 78.6MB +1.35%
name old allocs/op new allocs/op delta
Template 385k ± 0% 387k ± 0% +0.51% (p=0.000 n=15+15)
Unicode 342k ± 0% 343k ± 0% +0.10% (p=0.000 n=14+15)
GoTypes 1.19M ± 0% 1.19M ± 0% +0.62% (p=0.000 n=15+15)
Compiler 4.51M ± 0% 4.54M ± 0% +0.50% (p=0.000 n=14+15)
SSA 12.2M ± 0% 12.4M ± 0% +1.12% (p=0.000 n=14+15)
Flate 234k ± 0% 236k ± 0% +0.60% (p=0.000 n=15+15)
GoParser 318k ± 0% 320k ± 0% +0.60% (p=0.000 n=15+15)
Reflect 974k ± 0% 977k ± 0% +0.27% (p=0.000 n=15+15)
Tar 395k ± 0% 397k ± 0% +0.37% (p=0.000 n=14+15)
XML 404k ± 0% 407k ± 0% +0.53% (p=0.000 n=15+15)
[Geo mean] 794k 798k +0.52%
name old text-bytes new text-bytes delta
HelloSize 680kB ± 0% 680kB ± 0% ~ (all equal)
name old data-bytes new data-bytes delta
HelloSize 9.62kB ± 0% 9.62kB ± 0% ~ (all equal)
name old bss-bytes new bss-bytes delta
HelloSize 125kB ± 0% 125kB ± 0% ~ (all equal)
name old exe-bytes new exe-bytes delta
HelloSize 1.11MB ± 0% 1.13MB ± 0% +1.85% (p=0.000 n=15+15)
Change-Id: I61c98ba0340cb798034b2bb55e3ab3a58ac1cf23
Reviewed-on: https://go-review.googlesource.com/98075
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-03-02 19:33:44 +00:00
Heschi Kreinick
9dc351beba
cmd/compile/internal/ssa: batch up all zero-width instructions
...
When generating location lists, batch up changes for all zero-width
instructions, not just phis. This prevents the creation of location list
entries that don't actually cover any instructions.
This isn't perfect because of the caveats in the prior CL (Copy is
zero-width sometimes) but in practice this seems to fix all of the empty
lists in std.
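The batching idea can be sketched on a toy model: location changes made by zero-width values are held back until the next value that emits code, so no entry ever starts and ends at the same PC. The types below are hypothetical stand-ins for SSA values and location list entries, and the model tracks a single variable only.

```go
package main

import "fmt"

// value is a stand-in for an SSA value (hypothetical type).
type value struct {
	width int    // bytes of machine code emitted; 0 for zero-width ops
	loc   string // new location of the tracked variable, "" if unchanged
}

// entry is a location list entry covering the half-open range [start, end).
type entry struct {
	start, end int
	loc        string
}

// buildList walks the values in order and commits a pending location
// change only when a value that emits code is reached.
func buildList(vals []value) []entry {
	var out []entry
	pc, curStart := 0, 0
	cur, pending := "", ""
	for _, v := range vals {
		if v.loc != "" {
			pending = v.loc // remember only the latest change
		}
		if v.width == 0 {
			continue // zero-width: keep batching
		}
		if pending != "" && pending != cur {
			if cur != "" {
				out = append(out, entry{curStart, pc, cur})
			}
			cur, curStart = pending, pc
		}
		pending = ""
		pc += v.width
	}
	if cur != "" {
		out = append(out, entry{curStart, pc, cur})
	}
	return out
}

func main() {
	vals := []value{{0, "AX"}, {0, "BX"}, {4, ""}, {0, "CX"}, {4, ""}}
	fmt.Println(buildList(vals))
}
```

In this model the two leading zero-width changes collapse into one entry for "BX"; an unbatched version would first emit an entry for "AX" that covers no instructions.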
Change-Id: Ice4a9ade36b6b24ca111d1494c414eec96e5af25
Reviewed-on: https://go-review.googlesource.com/97958
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-03-02 18:55:56 +00:00
Heschi Kreinick
caa1b4afbd
cmd/compile/internal/ssa: note zero-width Ops
...
Add a bool to opInfo to indicate if an Op never results in any
instructions. This is a conservative approximation: some operations,
like Copy, may or may not generate code depending on their arguments.
I built the list by reading each arch's ssaGenValue function. Hopefully
I got them all.
Change-Id: I130b251b65f18208294e129bb7ddc3f91d57d31d
Reviewed-on: https://go-review.googlesource.com/97957
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-02 18:55:45 +00:00
Zhou Peng
b77aad0891
runtime: fix typo, func comments should start with function name
...
Change-Id: I289af4884583537639800e37928c22814d38cba9
Reviewed-on: https://go-review.googlesource.com/98115
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
2018-03-02 12:03:30 +00:00
Alessandro Arzilli
3fca7306f4
cmd/compile: optimize scope tracking
...
1. Detect and remove the markers of lexical scopes that don't contain
any variables early in noder, instead of waiting until the end of DWARF
generation.
This saves memory by never allocating some of the markers and optimizes
some of the algorithms that depend on the number of scopes.
2. Assign scopes to Progs by doing, for each Prog, a binary search over
the markers array. This is faster than sorting the Prog list because
there are fewer markers than there are Progs.
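Step 2 can be sketched with the standard library's binary search. The marker type and the convention that scope 0 is the outermost scope are assumptions made for this illustration, not the compiler's data structures.

```go
package main

import (
	"fmt"
	"sort"
)

// marker records that, from offset pc onward, code belongs to scope.
// Hypothetical type; markers are assumed to be sorted by pc.
type marker struct {
	pc    int
	scope int
}

// scopeAt returns the scope of the last marker at or before pc, or 0
// (taken here to mean the outermost scope) if no marker precedes pc.
func scopeAt(marks []marker, pc int) int {
	// Find the first marker strictly after pc; the one before it applies.
	i := sort.Search(len(marks), func(i int) bool { return marks[i].pc > pc })
	if i == 0 {
		return 0
	}
	return marks[i-1].scope
}

func main() {
	marks := []marker{{pc: 8, scope: 1}, {pc: 24, scope: 2}, {pc: 40, scope: 1}}
	for _, pc := range []int{0, 8, 30, 64} {
		fmt.Printf("pc=%d scope=%d\n", pc, scopeAt(marks, pc))
	}
}
```

Each lookup costs O(log m) for m markers, so assigning scopes to n Progs costs O(n log m) without reordering the Prog list.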
completed 15 of 15, estimated time remaining 0s (eta 2:30PM)
name old time/op new time/op delta
Template 274ms ± 5% 260ms ± 6% -4.91% (p=0.000 n=15+15)
Unicode 126ms ± 5% 127ms ± 9% ~ (p=0.856 n=13+15)
GoTypes 861ms ± 5% 857ms ± 4% ~ (p=0.595 n=15+15)
Compiler 4.11s ± 4% 4.12s ± 5% ~ (p=1.000 n=15+15)
SSA 10.7s ± 2% 10.9s ± 4% +2.01% (p=0.002 n=14+14)
Flate 163ms ± 4% 166ms ± 9% ~ (p=0.134 n=14+15)
GoParser 203ms ± 4% 205ms ± 6% ~ (p=0.461 n=15+15)
Reflect 544ms ± 5% 549ms ± 4% ~ (p=0.174 n=15+15)
Tar 249ms ± 9% 245ms ± 6% ~ (p=0.285 n=15+15)
XML 286ms ± 4% 291ms ± 5% ~ (p=0.081 n=15+15)
[Geo mean] 528ms 529ms +0.14%
name old user-time/op new user-time/op delta
Template 358ms ± 7% 354ms ± 5% ~ (p=0.242 n=14+15)
Unicode 189ms ±11% 191ms ±10% ~ (p=0.438 n=15+15)
GoTypes 1.15s ± 4% 1.14s ± 3% ~ (p=0.405 n=15+15)
Compiler 5.36s ± 6% 5.35s ± 5% ~ (p=0.588 n=15+15)
SSA 14.6s ± 3% 15.0s ± 4% +2.58% (p=0.000 n=15+15)
Flate 214ms ±12% 216ms ± 8% ~ (p=0.539 n=15+15)
GoParser 267ms ± 6% 270ms ± 5% ~ (p=0.569 n=15+15)
Reflect 712ms ± 5% 709ms ± 4% ~ (p=0.894 n=15+15)
Tar 329ms ± 8% 330ms ± 5% ~ (p=0.974 n=14+15)
XML 371ms ± 3% 381ms ± 5% +2.85% (p=0.002 n=13+15)
[Geo mean] 705ms 709ms +0.62%
name old alloc/op new alloc/op delta
Template 38.0MB ± 0% 38.4MB ± 0% +1.27% (p=0.000 n=15+14)
Unicode 28.8MB ± 0% 28.8MB ± 0% +0.16% (p=0.000 n=15+14)
GoTypes 112MB ± 0% 114MB ± 0% +1.64% (p=0.000 n=15+15)
Compiler 465MB ± 0% 474MB ± 0% +1.91% (p=0.000 n=15+15)
SSA 1.48GB ± 0% 1.53GB ± 0% +3.32% (p=0.000 n=15+15)
Flate 24.3MB ± 0% 24.8MB ± 0% +1.77% (p=0.000 n=14+15)
GoParser 30.7MB ± 0% 31.1MB ± 0% +1.27% (p=0.000 n=15+15)
Reflect 76.3MB ± 0% 77.1MB ± 0% +1.03% (p=0.000 n=15+15)
Tar 39.2MB ± 0% 39.6MB ± 0% +1.02% (p=0.000 n=13+15)
XML 41.5MB ± 0% 42.1MB ± 0% +1.45% (p=0.000 n=15+15)
[Geo mean] 77.5MB 78.7MB +1.48%
name old allocs/op new allocs/op delta
Template 385k ± 0% 387k ± 0% +0.54% (p=0.000 n=15+15)
Unicode 342k ± 0% 343k ± 0% +0.10% (p=0.000 n=15+15)
GoTypes 1.19M ± 0% 1.19M ± 0% +0.64% (p=0.000 n=14+15)
Compiler 4.51M ± 0% 4.54M ± 0% +0.53% (p=0.000 n=15+15)
SSA 12.2M ± 0% 12.4M ± 0% +1.16% (p=0.000 n=15+15)
Flate 234k ± 0% 236k ± 0% +0.63% (p=0.000 n=14+15)
GoParser 318k ± 0% 320k ± 0% +0.63% (p=0.000 n=15+15)
Reflect 974k ± 0% 977k ± 0% +0.28% (p=0.000 n=15+15)
Tar 395k ± 0% 397k ± 0% +0.38% (p=0.000 n=15+13)
XML 404k ± 0% 407k ± 0% +0.55% (p=0.000 n=15+15)
[Geo mean] 794k 799k +0.55%
name old text-bytes new text-bytes delta
HelloSize 680kB ± 0% 680kB ± 0% ~ (all equal)
name old data-bytes new data-bytes delta
HelloSize 9.62kB ± 0% 9.62kB ± 0% ~ (all equal)
name old bss-bytes new bss-bytes delta
HelloSize 125kB ± 0% 125kB ± 0% ~ (all equal)
name old exe-bytes new exe-bytes delta
HelloSize 1.11MB ± 0% 1.12MB ± 0% +1.11% (p=0.000 n=15+15)
Change-Id: I95a0173ee28c52be1a4851d2a6e389529e74bf28
Reviewed-on: https://go-review.googlesource.com/95396
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-03-02 10:38:41 +00:00
Tobias Klauser
1023b016d5
syscall: fix nil pointer dereference in Select on linux/{arm64,mips64x}
...
The timeout parameter might be nil; don't dereference it
unconditionally.
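The shape of such a guard is sketched below, assuming the wrapper converts a *Timeval into a *Timespec before making the underlying call. The types are redeclared locally so the sketch runs on any platform, and toTimespec is a hypothetical helper, not the syscall package's code.

```go
package main

import "fmt"

// Timeval and Timespec mirror the shape of the syscall types.
type Timeval struct{ Sec, Usec int64 }
type Timespec struct{ Sec, Nsec int64 }

// toTimespec converts an optional timeout. A nil timeout (block
// indefinitely) must stay nil instead of being dereferenced.
func toTimespec(tv *Timeval) *Timespec {
	if tv == nil {
		return nil
	}
	return &Timespec{Sec: tv.Sec, Nsec: tv.Usec * 1000}
}

func main() {
	fmt.Println(toTimespec(nil) == nil)
	fmt.Println(*toTimespec(&Timeval{Sec: 1, Usec: 500}))
}
```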
Fixes #24189
Change-Id: I03e6a1ab74fe30322ce6bcfd3d6c42130b6d61be
Reviewed-on: https://go-review.googlesource.com/97819
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-02 08:18:31 +00:00
Brad Fitzpatrick
1fadbc1a76
Revert "runtime: use bytes.IndexByte in findnull"
...
This reverts commit 7365fac2db.
Reason for revert: breaks the build on some architectures, reading unmapped pages?
Change-Id: I3a8c02dc0b649269faacea79ecd8213defa97c54
Reviewed-on: https://go-review.googlesource.com/97995
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-01 22:22:51 +00:00
Heschi Kreinick
f1fc9da316
cmd/link: fix up location lists for dsymutil
...
LLVM tools, particularly lldb and dsymutil, don't support base address
selection entries in location lists. When targeting GOOS=darwin,
have the linker translate location lists to CU-relative form
instead.
Technically, this isn't necessary when linking internally, as long as
nobody plans to use anything other than Delve to look at the DWARF. But
someone might want to use lldb, and it's really confusing when dwarfdump
shows gibberish for the location entries. The performance cost isn't
noticeable, so enable it even for internal linking.
Doing this in the linker is a little weird, but it was more expensive in
the compiler, probably because the compiler is much more stressful to
the GC. Also, if we decide to only do it for external linking, the
compiler can't see the link mode.
Benchmark before and after this commit on Mac with -dwarflocationlists=1:
name old time/op new time/op delta
StdCmd 21.3s ± 1% 21.3s ± 1% ~ (p=0.310 n=27+27)
Only StdCmd is relevant, because only StdCmd runs the linker. Whatever
the cost is here, it's not very large.
Change-Id: Ic8ef780d0e263230ce6aa3ca3a32fc9abd750b1e
Reviewed-on: https://go-review.googlesource.com/97956
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-03-01 22:06:03 +00:00
Heschi Kreinick
bff29f2d17
cmd/compile/internal/ssa: avoid accidental list ends
...
Some SSA values don't translate into any instructions. If a function
began with two of them, and both modified the storage of the same
variable, we'd end up with a location list entry that started and ended
at 0. That looks like an end-of-list entry, which would then confuse
downstream tools, particularly the fixup in the linker.
"Fix" this by changing the end of such entries to 1. Should be harmless,
since AFAIK we don't generate any 1-byte instructions. Later CLs will
reduce the frequency of these entries anyway.
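The workaround amounts to a one-line adjustment, sketched here with a hypothetical entry type: a pair of zeros is reserved as the end-of-list marker, so a real entry that would encode as (0, 0) has its end bumped to 1.

```go
package main

import "fmt"

// locEntry is the address pair of a location list entry (hypothetical type).
type locEntry struct{ start, end uint64 }

// avoidListEnd bumps the end of an entry that would otherwise encode as
// (0, 0), which readers interpret as the end of the list.
func avoidListEnd(e locEntry) locEntry {
	if e.start == 0 && e.end == 0 {
		e.end = 1
	}
	return e
}

func main() {
	fmt.Println(avoidListEnd(locEntry{0, 0}))
	fmt.Println(avoidListEnd(locEntry{0, 16}))
}
```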
Change-Id: I9b7e5e69f914244cc826fb9f4a6acfe2dc695f81
Reviewed-on: https://go-review.googlesource.com/97955
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-03-01 22:03:37 +00:00
Alessandro Arzilli
87736fc450
cmd/compile: fix dwarf ranges of inlined subroutine entries
...
DWARF ranges are half-open.
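Concretely, a range [low, high) includes low and excludes high, so the end of a range is the address just past its last instruction. A minimal sketch with a hypothetical pcRange type:

```go
package main

import "fmt"

// pcRange is a half-open address range [low, high) (hypothetical type).
type pcRange struct{ low, high uint64 }

// contains reports whether pc lies inside the range; high is excluded.
func (r pcRange) contains(pc uint64) bool {
	return r.low <= pc && pc < r.high
}

func main() {
	r := pcRange{low: 0x100, high: 0x120}
	fmt.Println(r.contains(0x100), r.contains(0x11f), r.contains(0x120))
}
```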
Fixes #23928
Change-Id: I71b3384d1bc2c65bd37ca8a02a0b7ff48fec3688
Reviewed-on: https://go-review.googlesource.com/94816
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-01 21:13:40 +00:00
Cherry Zhang
2baed3856d
cmd/asm: fix assembling return jump
...
In RET instruction, the operand is the return jump's target,
which should be put in Prog.To.
Add an action "buildrundir" to the test driver, which builds
(compile+assemble+link) the code in a directory and runs the
resulting binary.
Fixes #23838.
Change-Id: I7ebe7eda49024b40a69a24857322c5ca9c67babb
Reviewed-on: https://go-review.googlesource.com/94175
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-01 21:11:16 +00:00
Balaram Makam
213a75171d
runtime: improve arm64 memmove implementation
...
Improve runtime memmove_arm64.s by specializing for small copies and
processing 32 bytes per iteration for copies of 32 bytes or more.
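The general shape of that strategy can be sketched in Go. This is an illustration only: the real change is arm64 assembly, the size thresholds and the overlapping-load trick for 8 to 16 bytes are assumptions, and the sketch handles only non-overlapping buffers with len(dst) >= len(src).

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// copyForward copies src into dst, with dedicated paths for small sizes
// and a 32-bytes-per-iteration loop for larger ones.
func copyForward(dst, src []byte) {
	n := len(src)
	switch {
	case n == 0:
		return
	case n < 8:
		for i := 0; i < n; i++ {
			dst[i] = src[i]
		}
		return
	case n <= 16:
		// Two 8-byte moves, possibly overlapping, cover 8..16 bytes.
		head := binary.LittleEndian.Uint64(src)
		tail := binary.LittleEndian.Uint64(src[n-8:])
		binary.LittleEndian.PutUint64(dst, head)
		binary.LittleEndian.PutUint64(dst[n-8:], tail)
		return
	}
	i := 0
	for ; n-i >= 32; i += 32 {
		// Four 8-byte moves per iteration.
		for j := 0; j < 32; j += 8 {
			w := binary.LittleEndian.Uint64(src[i+j:])
			binary.LittleEndian.PutUint64(dst[i+j:], w)
		}
	}
	for ; i < n; i++ { // remaining tail, fewer than 32 bytes
		dst[i] = src[i]
	}
}

func main() {
	src := []byte("the quick brown fox jumps over the lazy dog")
	dst := make([]byte, len(src))
	copyForward(dst, src)
	fmt.Println(bytes.Equal(dst, src))
}
```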
Benchmark results of runtime/Memmove on Amberwing:
name old time/op new time/op delta
Memmove/0 7.61ns ± 0% 7.20ns ± 0% ~ (p=0.053 n=5+7)
Memmove/1 9.28ns ± 0% 8.80ns ± 0% -5.17% (p=0.000 n=4+8)
Memmove/2 9.65ns ± 0% 9.20ns ± 0% -4.68% (p=0.000 n=5+8)
Memmove/3 10.0ns ± 0% 9.2ns ± 0% -7.83% (p=0.000 n=5+8)
Memmove/4 10.6ns ± 0% 9.2ns ± 0% -13.21% (p=0.000 n=5+8)
Memmove/5 11.0ns ± 0% 9.2ns ± 0% -16.36% (p=0.000 n=5+8)
Memmove/6 12.4ns ± 0% 9.2ns ± 0% -25.81% (p=0.000 n=5+8)
Memmove/7 13.1ns ± 0% 9.2ns ± 0% -29.56% (p=0.000 n=5+8)
Memmove/8 9.10ns ± 1% 9.20ns ± 0% +1.08% (p=0.002 n=5+8)
Memmove/9 9.67ns ± 0% 9.20ns ± 0% -4.88% (p=0.000 n=5+8)
Memmove/10 10.4ns ± 0% 9.2ns ± 0% -11.54% (p=0.000 n=5+8)
Memmove/11 10.9ns ± 0% 9.2ns ± 0% -15.60% (p=0.000 n=5+8)
Memmove/12 11.5ns ± 0% 9.2ns ± 0% -20.00% (p=0.000 n=5+8)
Memmove/13 12.4ns ± 0% 9.2ns ± 0% -25.81% (p=0.000 n=5+8)
Memmove/14 13.1ns ± 0% 9.2ns ± 0% -29.77% (p=0.000 n=5+8)
Memmove/15 13.8ns ± 0% 9.2ns ± 0% -33.33% (p=0.000 n=5+8)
Memmove/16 9.70ns ± 0% 9.20ns ± 0% -5.19% (p=0.000 n=5+8)
Memmove/32 10.6ns ± 0% 9.2ns ± 0% -13.21% (p=0.000 n=4+8)
Memmove/64 13.4ns ± 0% 10.2ns ± 0% -23.88% (p=0.000 n=4+8)
Memmove/128 18.1ns ± 1% 13.2ns ± 0% -26.99% (p=0.000 n=5+8)
Memmove/256 25.2ns ± 0% 16.4ns ± 0% -34.92% (p=0.000 n=5+8)
Memmove/512 36.4ns ± 0% 22.8ns ± 0% -37.36% (p=0.000 n=5+8)
Memmove/1024 70.1ns ± 0% 36.8ns ±11% -47.49% (p=0.002 n=5+8)
Memmove/2048 121ns ± 0% 61ns ± 0% ~ (p=0.053 n=5+7)
Memmove/4096 224ns ± 0% 120ns ± 0% -46.43% (p=0.000 n=5+8)
MemmoveUnalignedDst/0 8.40ns ± 0% 8.00ns ± 0% -4.76% (p=0.000 n=5+8)
MemmoveUnalignedDst/1 9.87ns ± 1% 10.00ns ± 0% ~ (p=0.070 n=5+8)
MemmoveUnalignedDst/2 10.6ns ± 0% 10.4ns ± 0% -1.89% (p=0.000 n=5+8)
MemmoveUnalignedDst/3 10.8ns ± 0% 10.4ns ± 0% -3.70% (p=0.000 n=5+8)
MemmoveUnalignedDst/4 10.9ns ± 0% 10.3ns ± 0% ~ (p=0.053 n=5+7)
MemmoveUnalignedDst/5 11.5ns ± 0% 10.3ns ± 1% -10.22% (p=0.000 n=4+8)
MemmoveUnalignedDst/6 13.2ns ± 0% 10.4ns ± 1% -21.50% (p=0.000 n=5+8)
MemmoveUnalignedDst/7 13.7ns ± 0% 10.3ns ± 1% -24.64% (p=0.000 n=4+8)
MemmoveUnalignedDst/8 10.1ns ± 0% 10.4ns ± 0% +2.97% (p=0.002 n=5+8)
MemmoveUnalignedDst/9 10.7ns ± 0% 10.4ns ± 0% -2.80% (p=0.000 n=5+8)
MemmoveUnalignedDst/10 11.2ns ± 1% 10.4ns ± 0% -6.81% (p=0.000 n=5+8)
MemmoveUnalignedDst/11 11.6ns ± 0% 10.4ns ± 0% -10.34% (p=0.000 n=5+8)
MemmoveUnalignedDst/12 12.5ns ± 2% 10.4ns ± 0% -16.53% (p=0.000 n=5+8)
MemmoveUnalignedDst/13 13.7ns ± 0% 10.4ns ± 0% -24.09% (p=0.000 n=5+8)
MemmoveUnalignedDst/14 14.0ns ± 0% 10.4ns ± 0% -25.71% (p=0.000 n=5+8)
MemmoveUnalignedDst/15 14.6ns ± 0% 10.4ns ± 0% -28.77% (p=0.000 n=5+8)
MemmoveUnalignedDst/16 10.5ns ± 0% 10.4ns ± 0% -0.95% (p=0.000 n=5+8)
MemmoveUnalignedDst/32 12.4ns ± 0% 11.6ns ± 0% -6.05% (p=0.000 n=5+8)
MemmoveUnalignedDst/64 15.2ns ± 0% 12.3ns ± 0% -19.08% (p=0.000 n=5+8)
MemmoveUnalignedDst/128 18.7ns ± 0% 15.2ns ± 0% -18.72% (p=0.000 n=5+8)
MemmoveUnalignedDst/256 25.1ns ± 0% 18.6ns ± 0% -25.90% (p=0.000 n=5+8)
MemmoveUnalignedDst/512 37.8ns ± 0% 24.4ns ± 0% -35.45% (p=0.000 n=5+8)
MemmoveUnalignedDst/1024 74.6ns ± 0% 40.4ns ± 0% ~ (p=0.053 n=5+7)
MemmoveUnalignedDst/2048 133ns ± 0% 75ns ± 0% -43.91% (p=0.000 n=5+8)
MemmoveUnalignedDst/4096 247ns ± 0% 141ns ± 0% -42.91% (p=0.000 n=5+8)
MemmoveUnalignedSrc/0 8.40ns ± 0% 8.00ns ± 0% -4.76% (p=0.000 n=5+8)
MemmoveUnalignedSrc/1 9.81ns ± 0% 10.00ns ± 0% +1.98% (p=0.002 n=5+8)
MemmoveUnalignedSrc/2 10.5ns ± 0% 10.0ns ± 0% -4.76% (p=0.000 n=5+8)
MemmoveUnalignedSrc/3 10.7ns ± 1% 10.0ns ± 0% -6.89% (p=0.000 n=5+8)
MemmoveUnalignedSrc/4 11.3ns ± 0% 10.0ns ± 0% -11.50% (p=0.000 n=5+8)
MemmoveUnalignedSrc/5 11.6ns ± 0% 10.0ns ± 0% -13.79% (p=0.000 n=5+8)
MemmoveUnalignedSrc/6 13.6ns ± 0% 10.0ns ± 0% -26.47% (p=0.000 n=5+8)
MemmoveUnalignedSrc/7 14.4ns ± 0% 10.0ns ± 0% -30.75% (p=0.000 n=5+8)
MemmoveUnalignedSrc/8 9.87ns ± 1% 10.00ns ± 0% ~ (p=0.070 n=5+8)
MemmoveUnalignedSrc/9 10.4ns ± 0% 10.0ns ± 0% -3.85% (p=0.000 n=5+8)
MemmoveUnalignedSrc/10 11.2ns ± 0% 10.0ns ± 0% -10.71% (p=0.000 n=5+8)
MemmoveUnalignedSrc/11 11.8ns ± 0% 10.0ns ± 0% -15.25% (p=0.000 n=5+8)
MemmoveUnalignedSrc/12 12.1ns ± 0% 10.0ns ± 0% -17.36% (p=0.000 n=5+8)
MemmoveUnalignedSrc/13 13.6ns ± 0% 10.0ns ± 0% -26.47% (p=0.000 n=5+8)
MemmoveUnalignedSrc/14 14.7ns ± 0% 10.0ns ± 0% -31.79% (p=0.000 n=5+8)
MemmoveUnalignedSrc/15 14.4ns ± 0% 10.0ns ± 0% -30.56% (p=0.000 n=5+8)
MemmoveUnalignedSrc/16 11.0ns ± 0% 10.0ns ± 0% -9.09% (p=0.000 n=5+8)
MemmoveUnalignedSrc/32 11.5ns ± 0% 10.0ns ± 0% -13.04% (p=0.000 n=5+8)
MemmoveUnalignedSrc/64 14.9ns ± 0% 11.2ns ± 0% -24.83% (p=0.000 n=4+8)
MemmoveUnalignedSrc/128 19.5ns ± 0% 15.2ns ± 0% -22.05% (p=0.000 n=5+8)
MemmoveUnalignedSrc/256 27.3ns ± 2% 19.2ns ± 0% -29.62% (p=0.000 n=5+8)
MemmoveUnalignedSrc/512 40.4ns ± 0% 27.2ns ± 0% -32.67% (p=0.000 n=5+8)
MemmoveUnalignedSrc/1024 75.4ns ± 0% 44.4ns ± 0% -41.15% (p=0.000 n=5+8)
MemmoveUnalignedSrc/2048 131ns ± 0% 77ns ± 3% -41.56% (p=0.002 n=5+8)
MemmoveUnalignedSrc/4096 248ns ± 0% 145ns ± 0% -41.53% (p=0.000 n=5+8)
name old speed new speed delta
Memmove/1 108MB/s ± 0% 114MB/s ± 0% +5.37% (p=0.004 n=4+8)
Memmove/2 207MB/s ± 0% 217MB/s ± 0% +4.85% (p=0.002 n=5+8)
Memmove/3 301MB/s ± 0% 326MB/s ± 0% +8.45% (p=0.002 n=5+8)
Memmove/4 377MB/s ± 0% 435MB/s ± 0% +15.31% (p=0.004 n=4+8)
Memmove/5 455MB/s ± 0% 543MB/s ± 0% +19.46% (p=0.002 n=5+8)
Memmove/6 483MB/s ± 0% 652MB/s ± 0% +34.88% (p=0.003 n=5+7)
Memmove/7 537MB/s ± 0% 761MB/s ± 0% +41.71% (p=0.002 n=5+8)
Memmove/8 879MB/s ± 1% 869MB/s ± 0% -1.15% (p=0.000 n=5+7)
Memmove/9 931MB/s ± 0% 978MB/s ± 0% +5.05% (p=0.002 n=5+8)
Memmove/10 960MB/s ± 0% 1086MB/s ± 0% +13.13% (p=0.002 n=5+8)
Memmove/11 1.00GB/s ± 0% 1.20GB/s ± 0% +18.92% (p=0.003 n=5+7)
Memmove/12 1.04GB/s ± 0% 1.30GB/s ± 0% +25.40% (p=0.002 n=5+8)
Memmove/13 1.05GB/s ± 0% 1.41GB/s ± 0% +34.87% (p=0.002 n=5+8)
Memmove/14 1.07GB/s ± 0% 1.52GB/s ± 0% +42.14% (p=0.002 n=5+8)
Memmove/15 1.09GB/s ± 0% 1.63GB/s ± 0% +49.91% (p=0.002 n=5+8)
Memmove/16 1.65GB/s ± 0% 1.74GB/s ± 0% +5.40% (p=0.003 n=5+7)
Memmove/32 3.01GB/s ± 0% 3.48GB/s ± 0% +15.58% (p=0.003 n=5+7)
Memmove/64 4.76GB/s ± 0% 6.27GB/s ± 0% +31.75% (p=0.003 n=5+7)
Memmove/128 7.08GB/s ± 1% 9.69GB/s ± 0% +36.96% (p=0.002 n=5+8)
Memmove/256 10.2GB/s ± 0% 15.6GB/s ± 0% +53.58% (p=0.002 n=5+8)
Memmove/512 14.1GB/s ± 0% 22.4GB/s ± 0% +59.57% (p=0.003 n=5+7)
Memmove/1024 14.6GB/s ± 0% 27.9GB/s ±10% +91.00% (p=0.002 n=5+8)
Memmove/2048 16.9GB/s ± 0% 33.4GB/s ± 0% +98.32% (p=0.003 n=5+7)
Memmove/4096 18.3GB/s ± 0% 33.9GB/s ± 0% +85.80% (p=0.002 n=5+8)
MemmoveUnalignedDst/1 101MB/s ± 1% 100MB/s ± 0% ~ (p=0.586 n=5+8)
MemmoveUnalignedDst/2 189MB/s ± 0% 192MB/s ± 0% +1.82% (p=0.002 n=5+8)
MemmoveUnalignedDst/3 278MB/s ± 0% 288MB/s ± 0% +3.88% (p=0.003 n=5+7)
MemmoveUnalignedDst/4 368MB/s ± 0% 387MB/s ± 0% +5.41% (p=0.003 n=5+7)
MemmoveUnalignedDst/5 434MB/s ± 0% 484MB/s ± 0% +11.52% (p=0.002 n=5+8)
MemmoveUnalignedDst/6 454MB/s ± 0% 580MB/s ± 0% +27.62% (p=0.002 n=5+8)
MemmoveUnalignedDst/7 509MB/s ± 0% 677MB/s ± 0% +33.01% (p=0.002 n=5+8)
MemmoveUnalignedDst/8 792MB/s ± 0% 770MB/s ± 0% -2.77% (p=0.002 n=5+8)
MemmoveUnalignedDst/9 841MB/s ± 0% 866MB/s ± 0% +2.92% (p=0.002 n=5+8)
MemmoveUnalignedDst/10 896MB/s ± 0% 962MB/s ± 0% +7.35% (p=0.003 n=5+7)
MemmoveUnalignedDst/11 947MB/s ± 0% 1058MB/s ± 0% +11.80% (p=0.002 n=5+8)
MemmoveUnalignedDst/12 962MB/s ± 2% 1154MB/s ± 0% +19.97% (p=0.002 n=5+8)
MemmoveUnalignedDst/13 947MB/s ± 0% 1251MB/s ± 0% +32.08% (p=0.002 n=5+8)
MemmoveUnalignedDst/14 1.00GB/s ± 0% 1.35GB/s ± 0% +34.55% (p=0.002 n=5+8)
MemmoveUnalignedDst/15 1.03GB/s ± 0% 1.44GB/s ± 0% +40.50% (p=0.002 n=5+8)
MemmoveUnalignedDst/16 1.53GB/s ± 0% 1.54GB/s ± 0% +0.77% (p=0.002 n=5+8)
MemmoveUnalignedDst/32 2.58GB/s ± 0% 2.75GB/s ± 0% +6.52% (p=0.003 n=5+7)
MemmoveUnalignedDst/64 4.21GB/s ± 0% 5.19GB/s ± 0% +23.40% (p=0.004 n=5+6)
MemmoveUnalignedDst/128 6.86GB/s ± 0% 8.42GB/s ± 0% +22.78% (p=0.003 n=5+7)
MemmoveUnalignedDst/256 10.2GB/s ± 0% 13.8GB/s ± 0% +35.15% (p=0.002 n=5+8)
MemmoveUnalignedDst/512 13.5GB/s ± 0% 21.0GB/s ± 0% +54.90% (p=0.002 n=5+8)
MemmoveUnalignedDst/1024 13.7GB/s ± 0% 25.3GB/s ± 0% +84.61% (p=0.003 n=5+7)
MemmoveUnalignedDst/2048 15.3GB/s ± 0% 27.5GB/s ± 0% +79.52% (p=0.002 n=5+8)
MemmoveUnalignedDst/4096 16.5GB/s ± 0% 28.9GB/s ± 0% +74.74% (p=0.002 n=5+8)
MemmoveUnalignedSrc/1 102MB/s ± 0% 100MB/s ± 0% -2.02% (p=0.000 n=5+7)
MemmoveUnalignedSrc/2 191MB/s ± 0% 200MB/s ± 0% +4.78% (p=0.002 n=5+8)
MemmoveUnalignedSrc/3 279MB/s ± 0% 300MB/s ± 0% +7.45% (p=0.002 n=5+8)
MemmoveUnalignedSrc/4 354MB/s ± 0% 400MB/s ± 0% +13.10% (p=0.002 n=5+8)
MemmoveUnalignedSrc/5 431MB/s ± 0% 500MB/s ± 0% +16.02% (p=0.002 n=5+8)
MemmoveUnalignedSrc/6 441MB/s ± 0% 600MB/s ± 0% +36.03% (p=0.002 n=5+8)
MemmoveUnalignedSrc/7 485MB/s ± 0% 700MB/s ± 0% +44.29% (p=0.002 n=5+8)
MemmoveUnalignedSrc/8 811MB/s ± 1% 800MB/s ± 0% -1.36% (p=0.016 n=5+8)
MemmoveUnalignedSrc/9 864MB/s ± 0% 900MB/s ± 0% +4.07% (p=0.002 n=5+8)
MemmoveUnalignedSrc/10 893MB/s ± 0% 999MB/s ± 0% +11.97% (p=0.002 n=5+8)
MemmoveUnalignedSrc/11 932MB/s ± 0% 1099MB/s ± 0% +18.01% (p=0.002 n=5+8)
MemmoveUnalignedSrc/12 988MB/s ± 0% 1199MB/s ± 0% +21.35% (p=0.002 n=5+8)
MemmoveUnalignedSrc/13 955MB/s ± 0% 1299MB/s ± 0% +36.02% (p=0.002 n=5+8)
MemmoveUnalignedSrc/14 955MB/s ± 0% 1399MB/s ± 0% +46.52% (p=0.002 n=5+8)
MemmoveUnalignedSrc/15 1.04GB/s ± 0% 1.50GB/s ± 0% +44.18% (p=0.002 n=5+8)
MemmoveUnalignedSrc/16 1.45GB/s ± 0% 1.60GB/s ± 0% +10.14% (p=0.002 n=5+8)
MemmoveUnalignedSrc/32 2.78GB/s ± 0% 3.20GB/s ± 0% +15.16% (p=0.003 n=5+7)
MemmoveUnalignedSrc/64 4.30GB/s ± 0% 5.72GB/s ± 0% +32.90% (p=0.003 n=5+7)
MemmoveUnalignedSrc/128 6.57GB/s ± 0% 8.42GB/s ± 0% +28.06% (p=0.002 n=5+8)
MemmoveUnalignedSrc/256 9.39GB/s ± 1% 13.33GB/s ± 0% +41.96% (p=0.002 n=5+8)
MemmoveUnalignedSrc/512 12.7GB/s ± 0% 18.8GB/s ± 0% +48.53% (p=0.003 n=5+7)
MemmoveUnalignedSrc/1024 13.6GB/s ± 0% 23.0GB/s ± 0% +69.82% (p=0.002 n=5+8)
MemmoveUnalignedSrc/2048 15.6GB/s ± 0% 26.8GB/s ± 3% +71.37% (p=0.002 n=5+8)
MemmoveUnalignedSrc/4096 16.5GB/s ± 0% 28.2GB/s ± 0% +71.40% (p=0.002 n=5+8)
Fixes #22925
Change-Id: I38c1a9ad5c6e3f4f95fc521c4b7e3140b58b4737
Reviewed-on: https://go-review.googlesource.com/83799
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-01 20:34:11 +00:00
Josh Bleecher Snyder
7365fac2db
runtime: use bytes.IndexByte in findnull
...
bytes.IndexByte is heavily optimized.
Use it in findnull.
name old time/op new time/op delta
GoString-8 65.5ns ± 1% 40.2ns ± 1% -38.62% (p=0.000 n=19+19)
findnull is also used in gostringnocopy,
which is used in many hot spots in the runtime.
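A slice-based sketch of the idea follows. The runtime's findnull works on a bare *byte with no known length, so this bounded version (findNull is a hypothetical name) is a simplification and not the runtime code.

```go
package main

import (
	"bytes"
	"fmt"
)

// findNull returns the index of the first zero byte in b, or len(b) if
// b contains none.
func findNull(b []byte) int {
	if i := bytes.IndexByte(b, 0); i >= 0 {
		return i
	}
	return len(b)
}

func main() {
	fmt.Println(findNull([]byte("go\x00lang")))
	fmt.Println(findNull([]byte("golang")))
}
```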
Fixes #23830
Change-Id: I2e6cb279c7d8078f8844065de684cc3567fe89d7
Reviewed-on: https://go-review.googlesource.com/97523
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-01 20:34:07 +00:00
Chad Rosier
39fefa0709
cmd/compile/internal/ssa: combine consecutive BigEndian stores on arm64
...
This optimization mirrors that which is already implemented for AMD64. The
optimization specifically targets the binary.BigEndian.PutUint* functions.
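The store pattern in question looks like the hand-written function below, which writes one byte at a time, most significant byte first. Whether the compiler merges these into a single wider store depends on the compiler version and target; the sketch only checks that the result matches binary.BigEndian.PutUint32, and putUint32BE is a hypothetical name.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// putUint32BE stores v into b in big-endian order using four consecutive
// byte stores, the pattern a store-combining rule can recognize.
func putUint32BE(b []byte, v uint32) {
	_ = b[3] // one early bounds check covers all four stores
	b[0] = byte(v >> 24)
	b[1] = byte(v >> 16)
	b[2] = byte(v >> 8)
	b[3] = byte(v)
}

func main() {
	got := make([]byte, 4)
	want := make([]byte, 4)
	putUint32BE(got, 0xdeadbeef)
	binary.BigEndian.PutUint32(want, 0xdeadbeef)
	fmt.Println(bytes.Equal(got, want))
}
```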
encoding-binary results on Amberwing:
name old time/op new time/op delta
ReadSlice1000Int32s 9.83µs ± 2% 9.78µs ± 1% ~ (p=0.362 n=9+10)
ReadStruct 5.24µs ± 3% 5.19µs ± 2% ~ (p=0.285 n=10+10)
ReadInts 8.35µs ± 8% 8.44µs ± 3% ~ (p=0.323 n=10+10)
WriteInts 3.38µs ± 3% 3.44µs ±15% ~ (p=0.921 n=9+10)
WriteSlice1000Int32s 11.4µs ± 6% 10.2µs ± 4% -9.94% (p=0.000 n=10+10)
PutUint16 510ns ±12% 500ns ± 0% ~ (p=0.586 n=10+7)
PutUint32 530ns ±15% 490ns ±12% ~ (p=0.086 n=10+10)
PutUint64 550ns ± 0% 470ns ± 6% -14.52% (p=0.000 n=7+10)
LittleEndianPutUint16 500ns ± 0% 475ns ±16% ~ (p=0.120 n=7+10)
LittleEndianPutUint32 450ns ± 0% 517ns ±16% +14.81% (p=0.004 n=8+9)
LittleEndianPutUint64 550ns ± 0% 485ns ±13% -11.82% (p=0.000 n=8+10)
PutUvarint32 685ns ±12% 622ns ± 4% -9.17% (p=0.005 n=10+9)
PutUvarint64 735ns ± 9% 711ns ± 9% ~ (p=0.272 n=10+9)
[Geo mean] 1.47µs 1.42µs -3.87%
name old speed new speed delta
ReadSlice1000Int32s 407MB/s ± 2% 409MB/s ± 1% ~ (p=0.362 n=9+10)
ReadStruct 14.3MB/s ± 3% 14.4MB/s ± 2% ~ (p=0.250 n=10+10)
ReadInts 3.59MB/s ± 7% 3.56MB/s ± 4% ~ (p=0.340 n=10+10)
WriteInts 8.87MB/s ± 3% 8.74MB/s ±13% ~ (p=0.890 n=9+10)
WriteSlice1000Int32s 352MB/s ± 6% 391MB/s ± 4% +11.03% (p=0.000 n=10+10)
PutUint16 3.95MB/s ±13% 4.00MB/s ± 0% ~ (p=0.312 n=10+7)
PutUint32 7.62MB/s ±17% 8.21MB/s ±11% ~ (p=0.086 n=10+10)
PutUint64 14.6MB/s ± 0% 17.1MB/s ± 6% +17.28% (p=0.000 n=7+10)
LittleEndianPutUint16 4.00MB/s ± 0% 4.23MB/s ±18% ~ (p=0.176 n=7+10)
LittleEndianPutUint32 8.89MB/s ± 0% 7.64MB/s ±20% -14.05% (p=0.001 n=8+10)
LittleEndianPutUint64 14.6MB/s ± 0% 16.6MB/s ±12% +13.86% (p=0.000 n=8+10)
PutUvarint32 5.86MB/s ±14% 6.44MB/s ± 5% +9.84% (p=0.006 n=10+9)
PutUvarint64 10.9MB/s ± 8% 11.3MB/s ± 9% ~ (p=0.373 n=10+9)
[Geo mean] 14.2MB/s 14.8MB/s +3.93%
go1 results on Amberwing:
name old time/op new time/op delta
RegexpMatchEasy0_32 254ns ± 0% 254ns ± 0% ~ (all equal)
RegexpMatchEasy0_1K 547ns ± 0% 547ns ± 0% ~ (all equal)
RegexpMatchEasy1_32 252ns ± 0% 253ns ± 1% ~ (p=0.294 n=8+10)
RegexpMatchEasy1_1K 782ns ± 0% 783ns ± 1% ~ (p=0.529 n=8+9)
RegexpMatchMedium_32 316ns ± 0% 316ns ± 0% ~ (all equal)
RegexpMatchMedium_1K 51.5µs ± 0% 51.5µs ± 0% ~ (p=0.645 n=10+9)
RegexpMatchHard_32 2.75µs ± 0% 2.75µs ± 0% ~ (all equal)
RegexpMatchHard_1K 78.7µs ± 0% 78.7µs ± 0% ~ (p=0.754 n=10+10)
FmtFprintfEmpty 57.0ns ± 0% 57.0ns ± 0% ~ (all equal)
FmtFprintfString 111ns ± 0% 111ns ± 0% ~ (all equal)
FmtFprintfInt 114ns ± 0% 114ns ± 1% ~ (p=0.065 n=9+10)
FmtFprintfIntInt 182ns ± 0% 178ns ± 0% -2.20% (p=0.000 n=10+10)
FmtFprintfPrefixedInt 225ns ± 0% 227ns ± 0% +0.89% (p=0.000 n=10+10)
FmtFprintfFloat 307ns ± 0% 307ns ± 0% ~ (p=1.000 n=9+9)
FmtManyArgs 697ns ± 0% 701ns ± 2% ~ (p=0.108 n=9+10)
Gzip 436ms ± 0% 437ms ± 0% +0.23% (p=0.000 n=10+8)
HTTPClientServer 88.8µs ± 2% 89.6µs ± 1% +0.98% (p=0.019 n=10+10)
JSONEncode 20.1ms ± 1% 20.2ms ± 1% +0.48% (p=0.007 n=10+10)
JSONDecode 94.7ms ± 1% 94.1ms ± 0% -0.62% (p=0.000 n=10+9)
GobDecode 12.6ms ± 2% 12.6ms ± 1% ~ (p=0.360 n=10+8)
GobEncode 12.0ms ± 1% 11.9ms ± 1% -1.34% (p=0.000 n=10+10)
Mandelbrot200 5.05ms ± 0% 5.05ms ± 0% +0.12% (p=0.000 n=10+10)
TimeParse 448ns ± 0% 448ns ± 0% ~ (p=0.529 n=8+9)
TimeFormat 501ns ± 1% 501ns ± 1% ~ (p=1.000 n=10+9)
Template 90.6ms ± 0% 89.1ms ± 0% -1.67% (p=0.000 n=9+9)
GoParse 6.01ms ± 0% 5.96ms ± 0% -0.83% (p=0.000 n=10+9)
BinaryTree17 11.7s ± 0% 11.7s ± 0% ~ (p=0.481 n=10+10)
Revcomp 675ms ± 0% 675ms ± 0% ~ (p=0.436 n=9+9)
Fannkuch11 3.26s ± 0% 3.27s ± 1% +0.57% (p=0.000 n=10+10)
[Geo mean] 67.4µs 67.3µs -0.10%
name old speed new speed delta
RegexpMatchEasy0_32 126MB/s ± 0% 126MB/s ± 0% ~ (p=0.353 n=10+7)
RegexpMatchEasy0_1K 1.87GB/s ± 0% 1.87GB/s ± 0% ~ (p=0.275 n=8+10)
RegexpMatchEasy1_32 127MB/s ± 0% 126MB/s ± 1% ~ (p=0.110 n=8+10)
RegexpMatchEasy1_1K 1.31GB/s ± 0% 1.31GB/s ± 1% ~ (p=0.079 n=8+10)
RegexpMatchMedium_32 3.16MB/s ± 0% 3.16MB/s ± 0% ~ (all equal)
RegexpMatchMedium_1K 19.9MB/s ± 0% 19.9MB/s ± 0% ~ (p=0.889 n=10+9)
RegexpMatchHard_32 11.7MB/s ± 0% 11.7MB/s ± 0% ~ (all equal)
RegexpMatchHard_1K 13.0MB/s ± 0% 13.0MB/s ± 0% ~ (p=1.000 n=10+10)
Gzip 44.5MB/s ± 0% 44.4MB/s ± 0% -0.22% (p=0.000 n=10+8)
JSONEncode 96.6MB/s ± 1% 96.1MB/s ± 1% -0.48% (p=0.007 n=10+10)
JSONDecode 20.5MB/s ± 1% 20.6MB/s ± 0% +0.63% (p=0.000 n=10+9)
GobDecode 61.0MB/s ± 2% 61.1MB/s ± 1% ~ (p=0.372 n=10+8)
GobEncode 63.8MB/s ± 1% 64.7MB/s ± 1% +1.36% (p=0.000 n=10+10)
Template 21.4MB/s ± 0% 21.8MB/s ± 0% +1.69% (p=0.000 n=9+9)
GoParse 9.63MB/s ± 0% 9.71MB/s ± 0% +0.84% (p=0.000 n=9+8)
Revcomp 377MB/s ± 0% 376MB/s ± 0% ~ (p=0.399 n=9+9)
[Geo mean] 56.2MB/s 56.3MB/s +0.20%
Change-Id: Ic915373f5ef512f9fbc45745860e5db7f6de6286
Reviewed-on: https://go-review.googlesource.com/97755
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-01 20:29:22 +00:00
Ilya Tocar
93665c0d81
crypto: remove hand encoded amd64 instructions
...
Replace BYTE.. encodings with asm. This is possible because asm now
implements more instructions and no longer applies the
MOV $0, reg -> XOR reg, reg transformation.
Change-Id: I011749ab6b3f64403ab6e746f3760c5841548b57
Reviewed-on: https://go-review.googlesource.com/97936
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-01 19:20:53 +00:00
Pascal S. de Kloe
5d11838654
encoding/json: read ahead after value consumption
...
Eliminates the need for an extra scanner, read undo and some other tricks.
name old time/op new time/op delta
CodeEncoder-12 1.92ms ± 0% 1.91ms ± 1% -0.65% (p=0.000 n=17+20)
CodeMarshal-12 2.13ms ± 2% 2.12ms ± 1% -0.49% (p=0.038 n=18+17)
CodeDecoder-12 8.55ms ± 2% 8.49ms ± 1% ~ (p=0.119 n=20+18)
UnicodeDecoder-12 411ns ± 0% 422ns ± 0% +2.77% (p=0.000 n=19+15)
DecoderStream-12 320ns ± 1% 307ns ± 1% -3.80% (p=0.000 n=18+20)
CodeUnmarshal-12 9.65ms ± 3% 9.58ms ± 3% ~ (p=0.157 n=20+20)
CodeUnmarshalReuse-12 8.54ms ± 3% 8.56ms ± 2% ~ (p=0.602 n=20+20)
UnmarshalString-12 110ns ± 1% 87ns ± 2% -21.53% (p=0.000 n=16+20)
UnmarshalFloat64-12 101ns ± 1% 77ns ± 2% -23.08% (p=0.000 n=19+20)
UnmarshalInt64-12 94.5ns ± 2% 68.4ns ± 1% -27.60% (p=0.000 n=20+20)
Issue10335-12 128ns ± 1% 100ns ± 1% -21.89% (p=0.000 n=19+18)
Unmapped-12 427ns ± 3% 247ns ± 4% -42.17% (p=0.000 n=20+20)
NumberIsValid-12 23.0ns ± 0% 21.7ns ± 0% -5.73% (p=0.000 n=20+20)
NumberIsValidRegexp-12 641ns ± 0% 642ns ± 0% +0.15% (p=0.003 n=19+19)
EncoderEncode-12 56.9ns ± 0% 55.0ns ± 1% -3.32% (p=0.012 n=2+17)
name old speed new speed delta
CodeEncoder-12 1.01GB/s ± 1% 1.02GB/s ± 1% +0.71% (p=0.000 n=18+20)
CodeMarshal-12 913MB/s ± 2% 917MB/s ± 1% +0.49% (p=0.038 n=18+17)
CodeDecoder-12 227MB/s ± 2% 229MB/s ± 1% ~ (p=0.110 n=20+18)
UnicodeDecoder-12 34.1MB/s ± 0% 33.1MB/s ± 0% -2.73% (p=0.000 n=19+19)
CodeUnmarshal-12 201MB/s ± 3% 203MB/s ± 3% ~ (p=0.151 n=20+20)
name old alloc/op new alloc/op delta
Issue10335-12 320B ± 0% 184B ± 0% -42.50% (p=0.000 n=20+20)
Unmapped-12 568B ± 0% 216B ± 0% -61.97% (p=0.000 n=20+20)
EncoderEncode-12 0.00B 0.00B ~ (all equal)
name old allocs/op new allocs/op delta
Issue10335-12 4.00 ± 0% 3.00 ± 0% -25.00% (p=0.000 n=20+20)
Unmapped-12 18.0 ± 0% 4.0 ± 0% -77.78% (p=0.000 n=20+20)
EncoderEncode-12 0.00 0.00 ~ (all equal)
Fixes #17914
Updates #20693
Updates #10335
Change-Id: I0459a52febb8b79c9a2991e69ed2614cf8740429
Reviewed-on: https://go-review.googlesource.com/47152
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-01 19:03:04 +00:00
Ilya Tocar
c15984c6c6
math: remove unused variable
...
useSSE41 was used inside the asm implementation of floor to select between the base and SSE4 code paths.
We intrinsified floor and left the asm functions as a backup for non-SSE4 systems.
This made the variable unused, so remove it.
Change-Id: Ia2633de7c7cb1ef1d5b15a2366b523e481b722d9
Reviewed-on: https://go-review.googlesource.com/97935
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-01 18:51:44 +00:00
Hana Kim
e75f805e6f
runtime/trace: skip TestUserTaskSpan upon timestamp error
...
Change-Id: I030baaa0a0abf1e43449faaf676d389a28a868a3
Reviewed-on: https://go-review.googlesource.com/97857
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Peter Weinberger <pjw@google.com>
2018-03-01 18:38:49 +00:00
Giovanni Bajo
f16cc298d3
test: implement negative rules in asmcheck
...
Change-Id: I2b507e35cc314100eaf2ec2d1e5107cc2fc9e7cf
Reviewed-on: https://go-review.googlesource.com/97818
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-01 18:15:24 +00:00
Giovanni Bajo
0bcf8bcd99
test: in asmcheck, regexp must match from beginning of line
...
This avoids simple bugs like "ADD" matching "FADD". Obviously
"ADD" will still match "ADDQ" so some care is still required
in this regard, but at least a first class of possible errors
is taken care of.
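A minimal sketch of the difference between matching anywhere and matching from the beginning of the line. The helper names and the sample assembly lines are hypothetical, and the actual harness may anchor its patterns differently than the plain `^` used here.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchAnywhere reports whether pattern matches anywhere in line.
func matchAnywhere(pattern, line string) bool {
	return regexp.MustCompile(pattern).MatchString(line)
}

// matchFromStart reports whether pattern matches starting at the first
// character of line. The non-capturing group keeps an alternation such as
// "ADD|SUB" anchored as a whole.
func matchFromStart(pattern, line string) bool {
	return regexp.MustCompile(`^(?:` + pattern + `)`).MatchString(line)
}

func main() {
	// Hypothetical instruction lines, used only for illustration.
	for _, line := range []string{"FADD\tF0, F1", "ADDQ\t$1, AX"} {
		fmt.Printf("%q: anywhere=%v fromStart=%v\n",
			line, matchAnywhere("ADD", line), matchFromStart("ADD", line))
	}
}
```

With anchoring, "ADD" no longer matches the FADD line but still matches the ADDQ line, which is the remaining caveat the message mentions.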
Change-Id: I7deb04c31de30bedac9c026d9889ace4a1d2adcb
Reviewed-on: https://go-review.googlesource.com/97817
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-01 18:14:54 +00:00
Giovanni Bajo
879a1ff1e4
test: improve asmcheck syntax
...
asmcheck comments now support a compact form of specifying
multiple checks for each platform, using the following syntax:
amd64:"SHL\t[$]4","SHR\t[$]4"
Negative checks are also parsed using the following syntax:
amd64:-"ROR"
though they are still not working.
Moreover, out-of-line comments have been implemented. This
allows asmchecks to be specified on comment-only lines; they will
be matched against the first subsequent non-comment, non-empty line.
// amd64:"XOR"
// arm:"EOR"
x ^= 1
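For context, a sketch of how the out-of-line form might sit inside a complete file. The function name is hypothetical, and the file uses package main only so that it builds on its own; the check comments are interpreted by the test harness, not by the compiler, and this sketch does not verify which instructions are actually emitted.

```go
// asmcheck

package main

import "fmt"

// flipLowBit toggles the lowest bit of x. The two comment-only lines below
// attach their checks to the next non-comment, non-empty line.
func flipLowBit(x int) int {
	// amd64:"XOR"
	// arm:"EOR"
	x ^= 1
	return x
}

func main() {
	fmt.Println(flipLowBit(2), flipLowBit(3))
}
```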
Change-Id: I110c7462fc6a5c70fd4af0d42f516016ae7f2760
Reviewed-on: https://go-review.googlesource.com/97816
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-01 18:10:48 +00:00
Josh Bleecher Snyder
9372e3f5ef
runtime: don't allocate to build strings of length 1
...
Use staticbytes instead.
Instrumenting make.bash shows approx 0.5%
of all slicebytetostrings have a buffer of length 1.
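A user-level analogy of the idea, not the runtime code itself: keep one preexisting string per byte value and hand it out for length-1 inputs instead of building a new string. The table and function names here are hypothetical.

```go
package main

import "fmt"

// oneByteStrings holds one prebuilt string for every possible byte value.
var oneByteStrings [256]string

func init() {
	for i := range oneByteStrings {
		oneByteStrings[i] = string([]byte{byte(i)})
	}
}

// bytesToString converts b to a string. For a single byte it returns the
// prebuilt table entry, so no new string data has to be created.
func bytesToString(b []byte) string {
	if len(b) == 1 {
		return oneByteStrings[b[0]]
	}
	return string(b)
}

func main() {
	fmt.Println(bytesToString([]byte("x")), bytesToString([]byte("xy")))
}
```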
name old time/op new time/op delta
SliceByteToString/1-8 14.1ns ± 1% 4.1ns ± 1% -71.13% (p=0.000 n=17+20)
SliceByteToString/2-8 15.5ns ± 2% 15.5ns ± 1% ~ (p=0.061 n=20+18)
SliceByteToString/4-8 14.9ns ± 1% 15.0ns ± 2% +1.25% (p=0.000 n=20+20)
SliceByteToString/8-8 17.1ns ± 1% 17.5ns ± 1% +2.16% (p=0.000 n=19+19)
SliceByteToString/16-8 23.6ns ± 1% 23.9ns ± 1% +1.41% (p=0.000 n=20+18)
SliceByteToString/32-8 26.0ns ± 1% 25.8ns ± 0% -1.05% (p=0.000 n=19+16)
SliceByteToString/64-8 30.0ns ± 0% 30.2ns ± 0% +0.56% (p=0.000 n=16+18)
SliceByteToString/128-8 38.9ns ± 0% 39.0ns ± 0% +0.23% (p=0.019 n=19+15)
Fixes #24172
Change-Id: I3dfa14eefbf9fb4387114e20c9cb40e186abe962
Reviewed-on: https://go-review.googlesource.com/97717
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-01 17:38:06 +00:00
Josh Bleecher Snyder
aa9c1a8f80
runtime: fix amd64p32 indexbytes in presence of overflow
...
When the slice/string length is very large,
probably artificially large as in CL 97523,
adding BX (length) to R11 (pointer) overflows.
As a result, checking DI < R11 yields the wrong result.
Since they will be equal when the loop is done,
just check DI != R11 instead.
Yes, the pointer itself could overflow, but if that happens,
something else has gone pretty wrong; not our concern here.
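The failure mode can be modelled with 32-bit unsigned arithmetic. This is a simplified sketch, not the assembly: the addresses, the length, and the helper names are made up for illustration.

```go
package main

import "fmt"

// endPointer models adding the length to the start pointer in 32 bits.
// The sum wraps around when the length is artificially large.
func endPointer(start, length uint32) uint32 {
	return start + length
}

// foundLess models the original check: a hit counts only if pos < end.
func foundLess(pos, end uint32) bool { return pos < end }

// foundNotEqual models the fixed check: a hit counts unless pos == end,
// which is where the scan stops when nothing was found.
func foundNotEqual(pos, end uint32) bool { return pos != end }

func main() {
	start := uint32(0x1000)
	end := endPointer(start, 0xFFFFFFFF) // wraps to an address below start
	pos := start + 5                     // the byte was found here
	fmt.Printf("end=%#x less=%v notEqual=%v\n",
		end, foundLess(pos, end), foundNotEqual(pos, end))
}
```

Because the wrapped end pointer lands below the start, the ordered comparison rejects a genuine hit, while the inequality comparison does not depend on ordering at all.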
Fixes #24187
Change-Id: I2f60fc6ccae739345d01bc80528560726ad4f8c6
Reviewed-on: https://go-review.googlesource.com/97802
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-01 16:53:33 +00:00
Chad Rosier
77ba071ec6
cmd/compile/internal/ssa: combine consecutive LittleEndian stores on arm64
...
This optimization mirrors that which is already implemented for AMD64. The
optimization specifically targets the binary.LittleEndian.PutUint* functions.
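As an illustration, this is the shape of code such an optimization looks for: consecutive single-byte stores of successive 8-bit slices of one value, which can be merged into one wider store. The function below mirrors what binary.LittleEndian.PutUint32 does; the names putUint32 and sameAsStdlib are hypothetical, and the sketch does not check what the compiler emits.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// putUint32 writes v into b in little-endian order, one byte at a time.
func putUint32(b []byte, v uint32) {
	_ = b[3] // single bounds check up front for the four stores below
	b[0] = byte(v)
	b[1] = byte(v >> 8)
	b[2] = byte(v >> 16)
	b[3] = byte(v >> 24)
}

// sameAsStdlib reports whether putUint32 agrees with encoding/binary.
func sameAsStdlib(v uint32) bool {
	var got, want [4]byte
	putUint32(got[:], v)
	binary.LittleEndian.PutUint32(want[:], v)
	return got == want
}

func main() {
	var buf [4]byte
	putUint32(buf[:], 0x04030201)
	fmt.Println(buf, sameAsStdlib(0xdeadbeef))
}
```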
encoding/binary results on Amberwing:
name old time/op new time/op delta
ReadSlice1000Int32s 9.67µs ± 1% 9.64µs ± 1% ~ (p=0.185 n=9+9)
ReadStruct 5.24µs ± 2% 5.36µs ± 2% +2.24% (p=0.002 n=10+8)
ReadInts 8.69µs ± 5% 8.88µs ± 5% ~ (p=0.083 n=10+10)
WriteInts 3.90µs ±10% 3.71µs ± 9% ~ (p=0.077 n=10+10)
WriteSlice1000Int32s 10.9µs ± 1% 10.9µs ± 1% ~ (p=0.701 n=9+9)
PutUint16 572ns ±14% 505ns ±11% -11.75% (p=0.006 n=9+10)
PutUint32 550ns ±18% 540ns ±11% ~ (p=0.692 n=10+10)
PutUint64 565ns ±15% 540ns ±17% ~ (p=0.248 n=10+10)
LittleEndianPutUint16 540ns ±11% 500ns ±10% ~ (p=0.094 n=10+10)
LittleEndianPutUint32 520ns ±15% 480ns ±15% ~ (p=0.087 n=10+10)
LittleEndianPutUint64 505ns ±29% 470ns ±17% ~ (p=0.208 n=10+10)
PutUvarint32 700ns ±21% 635ns ±10% -9.29% (p=0.028 n=10+10)
PutUvarint64 740ns ± 8% 740ns ± 8% ~ (p=0.713 n=10+10)
[Geo mean] 1.53µs 1.47µs -3.93%
name old speed new speed delta
ReadSlice1000Int32s 414MB/s ± 1% 415MB/s ± 1% ~ (p=0.185 n=9+9)
ReadStruct 14.3MB/s ± 2% 14.0MB/s ± 2% -2.21% (p=0.000 n=10+8)
ReadInts 3.45MB/s ± 4% 3.38MB/s ± 6% ~ (p=0.085 n=10+10)
WriteInts 7.71MB/s ± 9% 8.09MB/s ± 8% +4.93% (p=0.048 n=10+10)
WriteSlice1000Int32s 367MB/s ± 1% 366MB/s ± 1% ~ (p=0.701 n=9+9)
PutUint16 3.51MB/s ±14% 3.99MB/s ±11% +13.47% (p=0.009 n=9+10)
PutUint32 7.35MB/s ±21% 7.44MB/s ±10% ~ (p=0.692 n=10+10)
PutUint64 14.3MB/s ±14% 15.0MB/s ±19% ~ (p=0.248 n=10+10)
LittleEndianPutUint16 3.72MB/s ±11% 4.03MB/s ±10% ~ (p=0.094 n=10+10)
LittleEndianPutUint32 7.75MB/s ±15% 8.39MB/s ±13% ~ (p=0.087 n=10+10)
LittleEndianPutUint64 16.1MB/s ±23% 17.2MB/s ±16% ~ (p=0.208 n=10+10)
PutUvarint32 5.76MB/s ±18% 6.32MB/s ±10% +9.72% (p=0.028 n=10+10)
PutUvarint64 10.8MB/s ± 8% 10.8MB/s ± 8% ~ (p=0.713 n=10+10)
[Geo mean] 13.7MB/s 14.3MB/s +4.02%
go1 results on Amberwing:
name old time/op new time/op delta
RegexpMatchEasy0_32 249ns ± 0% 249ns ± 0% ~ (p=0.087 n=10+10)
RegexpMatchEasy0_1K 584ns ± 0% 584ns ± 0% ~ (all equal)
RegexpMatchEasy1_32 246ns ± 0% 246ns ± 0% ~ (p=1.000 n=10+10)
RegexpMatchEasy1_1K 806ns ± 0% 806ns ± 0% ~ (p=0.706 n=10+9)
RegexpMatchMedium_32 314ns ± 0% 314ns ± 0% ~ (all equal)
RegexpMatchMedium_1K 52.1µs ± 0% 52.1µs ± 0% ~ (p=0.245 n=10+8)
RegexpMatchHard_32 2.75µs ± 1% 2.75µs ± 1% ~ (p=0.690 n=10+10)
RegexpMatchHard_1K 78.9µs ± 0% 78.9µs ± 1% ~ (p=0.295 n=9+9)
FmtFprintfEmpty 58.5ns ± 0% 58.5ns ± 0% ~ (all equal)
FmtFprintfString 112ns ± 0% 112ns ± 0% ~ (all equal)
FmtFprintfInt 117ns ± 0% 116ns ± 0% -0.85% (p=0.000 n=10+10)
FmtFprintfIntInt 181ns ± 0% 181ns ± 0% ~ (all equal)
FmtFprintfPrefixedInt 222ns ± 0% 224ns ± 0% +0.90% (p=0.000 n=9+10)
FmtFprintfFloat 318ns ± 1% 322ns ± 0% ~ (p=0.059 n=10+8)
FmtManyArgs 736ns ± 1% 735ns ± 0% ~ (p=0.206 n=9+9)
Gzip 437ms ± 0% 436ms ± 0% -0.25% (p=0.000 n=10+10)
HTTPClientServer 89.8µs ± 1% 90.2µs ± 2% ~ (p=0.393 n=10+10)
JSONEncode 20.1ms ± 1% 20.2ms ± 1% ~ (p=0.065 n=9+10)
JSONDecode 94.2ms ± 1% 93.9ms ± 1% -0.42% (p=0.043 n=10+10)
GobDecode 12.7ms ± 1% 12.8ms ± 2% +0.94% (p=0.019 n=10+10)
GobEncode 12.1ms ± 0% 12.1ms ± 0% ~ (p=0.052 n=10+10)
Mandelbrot200 5.06ms ± 0% 5.05ms ± 0% -0.04% (p=0.000 n=9+10)
TimeParse 450ns ± 3% 446ns ± 0% ~ (p=0.238 n=10+9)
TimeFormat 485ns ± 1% 483ns ± 1% ~ (p=0.073 n=10+10)
Template 90.4ms ± 0% 90.7ms ± 0% +0.29% (p=0.000 n=8+10)
GoParse 6.01ms ± 0% 6.03ms ± 0% +0.35% (p=0.000 n=10+10)
BinaryTree17 11.7s ± 0% 11.7s ± 0% ~ (p=0.481 n=10+10)
Revcomp 669ms ± 0% 669ms ± 0% ~ (p=0.315 n=10+10)
Fannkuch11 3.40s ± 0% 3.37s ± 0% -0.92% (p=0.000 n=10+10)
[Geo mean] 67.9µs 67.9µs +0.02%
name old speed new speed delta
RegexpMatchEasy0_32 128MB/s ± 0% 128MB/s ± 0% -0.08% (p=0.003 n=8+10)
RegexpMatchEasy0_1K 1.75GB/s ± 0% 1.75GB/s ± 0% ~ (p=0.642 n=8+10)
RegexpMatchEasy1_32 130MB/s ± 0% 130MB/s ± 0% ~ (p=0.690 n=10+9)
RegexpMatchEasy1_1K 1.27GB/s ± 0% 1.27GB/s ± 0% ~ (p=0.661 n=10+9)
RegexpMatchMedium_32 3.18MB/s ± 0% 3.18MB/s ± 0% ~ (all equal)
RegexpMatchMedium_1K 19.7MB/s ± 0% 19.6MB/s ± 0% ~ (p=0.190 n=10+9)
RegexpMatchHard_32 11.6MB/s ± 0% 11.6MB/s ± 1% ~ (p=0.669 n=10+10)
RegexpMatchHard_1K 13.0MB/s ± 0% 13.0MB/s ± 0% ~ (p=0.718 n=9+9)
Gzip 44.4MB/s ± 0% 44.5MB/s ± 0% +0.24% (p=0.000 n=10+10)
JSONEncode 96.5MB/s ± 1% 96.1MB/s ± 1% ~ (p=0.065 n=9+10)
JSONDecode 20.6MB/s ± 1% 20.7MB/s ± 1% +0.42% (p=0.041 n=10+10)
GobDecode 60.6MB/s ± 1% 60.0MB/s ± 2% -0.92% (p=0.016 n=10+10)
GobEncode 63.4MB/s ± 0% 63.6MB/s ± 0% ~ (p=0.055 n=10+10)
Template 21.5MB/s ± 0% 21.4MB/s ± 0% -0.30% (p=0.000 n=9+10)
GoParse 9.64MB/s ± 0% 9.61MB/s ± 0% -0.36% (p=0.000 n=10+10)
Revcomp 380MB/s ± 0% 380MB/s ± 0% ~ (p=0.323 n=10+10)
[Geo mean] 56.0MB/s 55.9MB/s -0.07%
Change-Id: I79a4978d42d01a5f72ed5ceec07f5e78ac6b3859
Reviewed-on: https://go-review.googlesource.com/97175
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-01 16:40:19 +00:00
Wei Xiao
562346b7d0
bytes: add asm version of Index for short strings on arm64
...
Currently we have a special case for 1-byte strings;
this extends it to strings shorter than 9 bytes on arm64.
Benchmark results:
name old time/op new time/op delta
IndexByte/10-32 18.6ns ± 0% 18.1ns ± 0% -2.69% (p=0.008 n=5+5)
IndexByte/32-32 16.8ns ± 1% 16.9ns ± 1% ~ (p=0.762 n=5+5)
IndexByte/4K-32 464ns ± 0% 464ns ± 0% ~ (all equal)
IndexByte/4M-32 528µs ± 1% 506µs ± 1% -4.17% (p=0.008 n=5+5)
IndexByte/64M-32 18.7ms ± 0% 18.7ms ± 1% ~ (p=0.730 n=4+5)
IndexBytePortable/10-32 33.8ns ± 0% 34.9ns ± 3% ~ (p=0.167 n=5+5)
IndexBytePortable/32-32 65.3ns ± 0% 66.1ns ± 2% ~ (p=0.444 n=5+5)
IndexBytePortable/4K-32 5.88µs ± 0% 5.88µs ± 0% ~ (p=0.325 n=5+5)
IndexBytePortable/4M-32 6.03ms ± 0% 6.03ms ± 0% ~ (p=1.000 n=5+5)
IndexBytePortable/64M-32 98.8ms ± 0% 98.9ms ± 0% +0.10% (p=0.008 n=5+5)
IndexRune/10-32 57.7ns ± 0% 49.2ns ± 0% -14.73% (p=0.000 n=5+4)
IndexRune/32-32 57.7ns ± 0% 58.6ns ± 0% +1.56% (p=0.008 n=5+5)
IndexRune/4K-32 511ns ± 0% 513ns ± 0% +0.39% (p=0.008 n=5+5)
IndexRune/4M-32 527µs ± 1% 527µs ± 1% ~ (p=0.690 n=5+5)
IndexRune/64M-32 18.7ms ± 0% 18.7ms ± 1% ~ (p=0.190 n=4+5)
IndexRuneASCII/10-32 23.8ns ± 0% 23.8ns ± 0% ~ (all equal)
IndexRuneASCII/32-32 24.3ns ± 0% 24.3ns ± 0% ~ (all equal)
IndexRuneASCII/4K-32 468ns ± 0% 468ns ± 0% ~ (all equal)
IndexRuneASCII/4M-32 521µs ± 1% 531µs ± 2% +1.91% (p=0.016 n=5+5)
IndexRuneASCII/64M-32 18.6ms ± 1% 18.5ms ± 0% ~ (p=0.730 n=5+4)
Index/10-32 89.1ns ±13% 25.2ns ± 0% -71.72% (p=0.008 n=5+5)
Index/32-32 225ns ± 2% 226ns ± 3% ~ (p=0.683 n=5+5)
Index/4K-32 11.9µs ± 0% 11.8µs ± 0% -0.22% (p=0.008 n=5+5)
Index/4M-32 12.1ms ± 0% 12.1ms ± 0% ~ (p=0.548 n=5+5)
Index/64M-32 197ms ± 0% 197ms ± 0% ~ (p=0.690 n=5+5)
IndexEasy/10-32 46.2ns ± 0% 22.1ns ± 8% -52.16% (p=0.008 n=5+5)
IndexEasy/32-32 46.2ns ± 0% 47.2ns ± 0% +2.16% (p=0.008 n=5+5)
IndexEasy/4K-32 499ns ± 0% 502ns ± 0% +0.44% (p=0.008 n=5+5)
IndexEasy/4M-32 529µs ± 2% 529µs ± 1% ~ (p=0.841 n=5+5)
IndexEasy/64M-32 18.6ms ± 1% 18.7ms ± 1% ~ (p=0.222 n=5+5)
IndexAnyASCII/1:1-32 15.7ns ± 0% 15.7ns ± 0% ~ (all equal)
IndexAnyASCII/1:2-32 17.2ns ± 0% 17.2ns ± 0% ~ (all equal)
IndexAnyASCII/1:4-32 20.0ns ± 0% 20.0ns ± 0% ~ (all equal)
IndexAnyASCII/1:8-32 34.8ns ± 0% 34.8ns ± 0% ~ (all equal)
IndexAnyASCII/1:16-32 48.1ns ± 0% 48.1ns ± 0% ~ (all equal)
IndexAnyASCII/16:1-32 97.9ns ± 1% 97.7ns ± 0% ~ (p=0.857 n=5+5)
IndexAnyASCII/16:2-32 102ns ± 0% 102ns ± 0% ~ (all equal)
IndexAnyASCII/16:4-32 116ns ± 1% 116ns ± 1% ~ (p=1.000 n=5+5)
IndexAnyASCII/16:8-32 141ns ± 1% 141ns ± 0% ~ (p=0.571 n=5+4)
IndexAnyASCII/16:16-32 178ns ± 0% 178ns ± 0% ~ (all equal)
IndexAnyASCII/256:1-32 1.09µs ± 0% 1.09µs ± 0% ~ (all equal)
IndexAnyASCII/256:2-32 1.09µs ± 0% 1.10µs ± 0% +0.27% (p=0.008 n=5+5)
IndexAnyASCII/256:4-32 1.11µs ± 0% 1.11µs ± 0% ~ (p=0.397 n=5+5)
IndexAnyASCII/256:8-32 1.10µs ± 0% 1.10µs ± 0% ~ (p=0.444 n=5+5)
IndexAnyASCII/256:16-32 1.14µs ± 0% 1.14µs ± 0% ~ (all equal)
IndexAnyASCII/4096:1-32 16.5µs ± 0% 16.5µs ± 0% ~ (p=1.000 n=5+5)
IndexAnyASCII/4096:2-32 17.0µs ± 0% 17.0µs ± 0% ~ (p=0.159 n=5+4)
IndexAnyASCII/4096:4-32 17.1µs ± 0% 17.1µs ± 0% ~ (p=0.921 n=4+5)
IndexAnyASCII/4096:8-32 16.5µs ± 0% 16.5µs ± 0% ~ (p=0.460 n=5+5)
IndexAnyASCII/4096:16-32 16.5µs ± 0% 16.5µs ± 0% ~ (p=0.794 n=5+4)
IndexPeriodic/IndexPeriodic2-32 189µs ± 0% 189µs ± 0% ~ (p=0.841 n=5+5)
IndexPeriodic/IndexPeriodic4-32 189µs ± 0% 189µs ± 0% -0.03% (p=0.016 n=5+4)
IndexPeriodic/IndexPeriodic8-32 189µs ± 0% 189µs ± 0% ~ (p=0.651 n=5+5)
IndexPeriodic/IndexPeriodic16-32 175µs ± 9% 174µs ± 7% ~ (p=1.000 n=5+5)
IndexPeriodic/IndexPeriodic32-32 75.1µs ± 0% 75.1µs ± 0% ~ (p=0.690 n=5+5)
IndexPeriodic/IndexPeriodic64-32 42.6µs ± 0% 44.7µs ± 0% +4.98% (p=0.008 n=5+5)
name old speed new speed delta
IndexByte/10-32 538MB/s ± 0% 552MB/s ± 0% +2.65% (p=0.008 n=5+5)
IndexByte/32-32 1.90GB/s ± 1% 1.90GB/s ± 1% ~ (p=0.548 n=5+5)
IndexByte/4K-32 8.82GB/s ± 0% 8.81GB/s ± 0% ~ (p=0.548 n=5+5)
IndexByte/4M-32 7.95GB/s ± 1% 8.29GB/s ± 1% +4.35% (p=0.008 n=5+5)
IndexByte/64M-32 3.58GB/s ± 0% 3.60GB/s ± 1% ~ (p=0.730 n=4+5)
IndexBytePortable/10-32 296MB/s ± 0% 286MB/s ± 3% ~ (p=0.381 n=4+5)
IndexBytePortable/32-32 490MB/s ± 0% 485MB/s ± 2% ~ (p=0.286 n=5+5)
IndexBytePortable/4K-32 697MB/s ± 0% 697MB/s ± 0% ~ (p=0.413 n=5+5)
IndexBytePortable/4M-32 696MB/s ± 0% 695MB/s ± 0% ~ (p=0.897 n=5+5)
IndexBytePortable/64M-32 679MB/s ± 0% 678MB/s ± 0% -0.10% (p=0.008 n=5+5)
IndexRune/10-32 173MB/s ± 0% 203MB/s ± 0% +17.24% (p=0.016 n=5+4)
IndexRune/32-32 555MB/s ± 0% 546MB/s ± 0% -1.62% (p=0.008 n=5+5)
IndexRune/4K-32 8.01GB/s ± 0% 7.98GB/s ± 0% -0.38% (p=0.008 n=5+5)
IndexRune/4M-32 7.97GB/s ± 1% 7.95GB/s ± 1% ~ (p=0.690 n=5+5)
IndexRune/64M-32 3.59GB/s ± 0% 3.58GB/s ± 1% ~ (p=0.190 n=4+5)
IndexRuneASCII/10-32 420MB/s ± 0% 420MB/s ± 0% ~ (p=0.190 n=5+4)
IndexRuneASCII/32-32 1.32GB/s ± 0% 1.32GB/s ± 0% ~ (p=0.333 n=5+5)
IndexRuneASCII/4K-32 8.75GB/s ± 0% 8.75GB/s ± 0% ~ (p=0.690 n=5+5)
IndexRuneASCII/4M-32 8.04GB/s ± 1% 7.89GB/s ± 2% -1.87% (p=0.016 n=5+5)
IndexRuneASCII/64M-32 3.61GB/s ± 1% 3.62GB/s ± 0% ~ (p=0.730 n=5+4)
Index/10-32 113MB/s ±14% 397MB/s ± 0% +249.76% (p=0.008 n=5+5)
Index/32-32 142MB/s ± 2% 141MB/s ± 3% ~ (p=0.794 n=5+5)
Index/4K-32 345MB/s ± 0% 346MB/s ± 0% +0.22% (p=0.008 n=5+5)
Index/4M-32 345MB/s ± 0% 345MB/s ± 0% ~ (p=0.619 n=5+5)
Index/64M-32 341MB/s ± 0% 341MB/s ± 0% ~ (p=0.595 n=5+5)
IndexEasy/10-32 216MB/s ± 0% 453MB/s ± 8% +109.60% (p=0.008 n=5+5)
IndexEasy/32-32 692MB/s ± 0% 678MB/s ± 0% -2.01% (p=0.008 n=5+5)
IndexEasy/4K-32 8.19GB/s ± 0% 8.16GB/s ± 0% -0.45% (p=0.008 n=5+5)
IndexEasy/4M-32 7.93GB/s ± 2% 7.93GB/s ± 1% ~ (p=0.841 n=5+5)
IndexEasy/64M-32 3.60GB/s ± 1% 3.59GB/s ± 1% ~ (p=0.222 n=5+5)
Change-Id: I4ca69378a2df6f9ba748c6a2706953ee1bd07343
Reviewed-on: https://go-review.googlesource.com/96555
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-01 15:24:33 +00:00
Marcel van Lohuizen
4c1aff87f1
testing: gracefully handle subtest failing parent’s T
...
Don’t panic if a subtest inadvertently calls FailNow
on a parent’s T. Instead, report the offending subtest
while still reporting the error with the ancestor test and
keep exiting goroutines.
Note that this implementation has a race if parallel
subtests are failing the parent concurrently.
This is fine:
Calling FailNow on a parent is considered an error
in principle, at the moment, and is reported if it is
detected. Having the race allows the race detector
to detect the error as well.
Fixes #22882
Change-Id: Ifa6d5e55bb88f6bcbb562fc8c99f1f77e320015a
Reviewed-on: https://go-review.googlesource.com/97635
Run-TryBot: Marcel van Lohuizen <mpvl@golang.org>
Reviewed-by: Kunpei Sakai <namusyaka@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-01 10:17:22 +00:00
Giovanni Bajo
c9438cb198
test: add support for code generation tests (asmcheck)
...
The top-level test harness is modified to support a new kind
of test: "asmcheck". This is meant to replace asm_test.go
as an easier and more readable way to test code generation.
I've added a couple of codegen tests to get initial feedback
on the syntax. I've created them under a common "codegen"
subdirectory, so that it's easier to run them all with
"go run run.go -v codegen".
The asmcheck syntax allows inserting line comments that
can specify a regular expression to match in the assembly code,
for multiple architectures (the testsuite will automatically
build each testfile multiple times, one per mentioned architecture).
Negative matches are unsupported for now, so this cannot fully
replace asm_test yet.
Change-Id: Ifdbba389f01d55e63e73c99e5f5449e642101d55
Reviewed-on: https://go-review.googlesource.com/97355
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
2018-03-01 07:59:54 +00:00
Tobias Klauser
c7c01efd96
runtime: clean up libc_* definitions on Solaris
...
All functions defined in syscall2_solaris.go have the respective libc_*
var in syscall_solaris.go, except for libc_close. Move it from
os3_solaris.go.
Remove unused libc_fstat.
Order go:cgo_import_dynamic and go:linkname lists in
syscall2_solaris.go alphabetically.
Change-Id: I9f12fa473cf1ae351448ac45597c82a67d799c31
Reviewed-on: https://go-review.googlesource.com/97736
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-01 07:31:53 +00:00
Joe Tsai
4338518da8
encoding/json: avoid assuming side-effect free reflect.Value.Addr().Elem()
...
Consider the following:
type child struct{ Field string }
type parent struct{ child }
p := new(parent)
v := reflect.ValueOf(p).Elem().Field(0)
v.Field(0).SetString("hello") // v.Field = "hello"
v = v.Addr().Elem() // v = *(&v)
v.Field(0).SetString("goodbye") // v.Field = "goodbye"
It would appear that v.Addr().Elem() should have the same value, and
that it would be safe to set "goodbye".
However, after CL 66331, any interspersed calls between Field calls
cause the RO flag to be set.
Thus, setting to "goodbye" actually causes a panic.
That CL affects decodeState.indirect which assumes that back-to-back
Value.Addr().Elem() is side-effect free. We fix that logic to keep
track of the Addr() and Elem() calls and set v back to the original
after a full round-trip has occurred.
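The snippet above can be wrapped into a small program that reports what happens instead of crashing. This is a sketch: the helper name setAfterRoundTrip is hypothetical, and whether the second SetString panics depends on the Go release used to build it, so the code records either outcome rather than assuming one.

```go
package main

import (
	"fmt"
	"reflect"
)

type child struct{ Field string }
type parent struct{ child }

// setAfterRoundTrip sets the field once directly and once after an
// Addr().Elem() round trip. It returns the final field value and whether
// the second SetString panicked.
func setAfterRoundTrip() (got string, panicked bool) {
	p := new(parent)
	v := reflect.ValueOf(p).Elem().Field(0)
	v.Field(0).SetString("hello")

	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
		got = p.Field // read the promoted field after the attempt
	}()

	v = v.Addr().Elem() // v = *(&v)
	v.Field(0).SetString("goodbye")
	return
}

func main() {
	got, panicked := setAfterRoundTrip()
	fmt.Printf("field=%q panicked=%v\n", got, panicked)
}
```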
Fixes #24152
Updates #24153
Change-Id: Ie50f8fe963f00cef8515d89d1d5cbc43b76d9f9c
Reviewed-on: https://go-review.googlesource.com/97796
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-01 00:16:20 +00:00
erifan01
8c3c8332cd
cmd/asm: enable several arm64 load & store instructions
...
Instructions LDARB, LDARH, LDAXPW, LDAXP, STLRB, STLRH, STLXP, STLXPW, STXP,
STXPW have been added before, but they are not enabled. This CL enables them.
Change the form of LDXP and LDXPW to the form of LDP, and fix a bug in STLXP.
Change-Id: I5d2b51494b92451bf6b072c65cfdd8acf07e9b54
Reviewed-on: https://go-review.googlesource.com/96215
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-02-28 23:46:21 +00:00
Ben Shi
1057624985
cmd/compile: optimize ARM64 code with EON/ORN
...
EON and ORN are efficient ARM64 instructions. EON combines (x ^ ^y)
into a single operation, and ORN does the same for (x | ^y).
This CL implements that optimization. And here are benchmark results
with RaspberryPi3/ArchLinux.
1. A specific test gets about 13% improvement.
EONORN 181µs ± 0% 157µs ± 0% -13.26% (p=0.000 n=26+23)
(https://github.com/benshi001/ugo1/blob/master/eonorn_test.go )
2. There is little change in the go1 benchmark, excluding noise.
name old time/op new time/op delta
BinaryTree17-4 44.1s ± 2% 44.0s ± 2% ~ (p=0.513 n=30+30)
Fannkuch11-4 32.9s ± 3% 32.8s ± 3% -0.12% (p=0.024 n=30+30)
FmtFprintfEmpty-4 561ns ± 9% 558ns ± 9% ~ (p=0.654 n=30+30)
FmtFprintfString-4 1.09µs ± 4% 1.09µs ± 3% ~ (p=0.158 n=30+30)
FmtFprintfInt-4 1.12µs ± 0% 1.12µs ± 0% ~ (p=0.917 n=23+28)
FmtFprintfIntInt-4 1.73µs ± 0% 1.76µs ± 4% ~ (p=0.665 n=23+30)
FmtFprintfPrefixedInt-4 2.15µs ± 1% 2.15µs ± 0% ~ (p=0.389 n=27+26)
FmtFprintfFloat-4 3.18µs ± 4% 3.13µs ± 0% -1.50% (p=0.003 n=30+23)
FmtManyArgs-4 7.32µs ± 4% 7.21µs ± 0% ~ (p=0.220 n=30+25)
GobDecode-4 99.1ms ± 9% 97.0ms ± 0% -2.07% (p=0.000 n=30+23)
GobEncode-4 83.3ms ± 3% 82.4ms ± 4% ~ (p=0.321 n=30+30)
Gzip-4 4.39s ± 4% 4.32s ± 2% -1.42% (p=0.017 n=30+23)
Gunzip-4 440ms ± 0% 447ms ± 4% +1.54% (p=0.006 n=24+30)
HTTPClientServer-4 547µs ± 1% 537µs ± 1% -1.91% (p=0.000 n=30+30)
JSONEncode-4 211ms ± 0% 211ms ± 0% +0.04% (p=0.000 n=23+24)
JSONDecode-4 847ms ± 0% 847ms ± 0% ~ (p=0.158 n=25+25)
Mandelbrot200-4 46.5ms ± 0% 46.5ms ± 0% -0.04% (p=0.000 n=25+24)
GoParse-4 43.4ms ± 0% 43.4ms ± 0% ~ (p=0.494 n=24+25)
RegexpMatchEasy0_32-4 1.03µs ± 0% 1.03µs ± 0% ~ (all equal)
RegexpMatchEasy0_1K-4 4.02µs ± 3% 3.98µs ± 0% -0.95% (p=0.003 n=30+24)
RegexpMatchEasy1_32-4 1.01µs ± 3% 1.01µs ± 2% ~ (p=0.629 n=30+30)
RegexpMatchEasy1_1K-4 6.39µs ± 0% 6.39µs ± 0% ~ (p=0.564 n=24+23)
RegexpMatchMedium_32-4 1.80µs ± 3% 1.78µs ± 0% ~ (p=0.155 n=30+24)
RegexpMatchMedium_1K-4 555µs ± 0% 563µs ± 3% +1.55% (p=0.004 n=27+30)
RegexpMatchHard_32-4 31.0µs ± 4% 30.5µs ± 1% -1.58% (p=0.000 n=30+23)
RegexpMatchHard_1K-4 947µs ± 4% 931µs ± 0% -1.66% (p=0.009 n=30+24)
Revcomp-4 7.71s ± 4% 7.71s ± 4% ~ (p=0.196 n=29+30)
Template-4 877ms ± 0% 878ms ± 0% +0.16% (p=0.018 n=23+27)
TimeParse-4 4.75µs ± 1% 4.74µs ± 0% ~ (p=0.895 n=24+23)
TimeFormat-4 4.83µs ± 4% 4.83µs ± 4% ~ (p=0.767 n=30+30)
[Geo mean] 709µs 707µs -0.35%
name old speed new speed delta
GobDecode-4 7.75MB/s ± 8% 7.91MB/s ± 0% +2.03% (p=0.001 n=30+23)
GobEncode-4 9.22MB/s ± 3% 9.32MB/s ± 4% ~ (p=0.389 n=30+30)
Gzip-4 4.43MB/s ± 4% 4.43MB/s ± 4% ~ (p=0.888 n=30+30)
Gunzip-4 44.1MB/s ± 0% 43.4MB/s ± 4% -1.46% (p=0.009 n=24+30)
JSONEncode-4 9.18MB/s ± 0% 9.18MB/s ± 0% ~ (p=0.308 n=16+24)
JSONDecode-4 2.29MB/s ± 0% 2.29MB/s ± 0% ~ (all equal)
GoParse-4 1.33MB/s ± 0% 1.33MB/s ± 0% ~ (all equal)
RegexpMatchEasy0_32-4 30.9MB/s ± 0% 30.9MB/s ± 0% ~ (p=1.000 n=23+24)
RegexpMatchEasy0_1K-4 255MB/s ± 3% 257MB/s ± 0% +0.92% (p=0.004 n=30+24)
RegexpMatchEasy1_32-4 31.7MB/s ± 3% 31.6MB/s ± 2% ~ (p=0.603 n=30+30)
RegexpMatchEasy1_1K-4 160MB/s ± 0% 160MB/s ± 0% ~ (p=0.435 n=24+23)
RegexpMatchMedium_32-4 554kB/s ± 3% 560kB/s ± 0% +1.08% (p=0.004 n=30+24)
RegexpMatchMedium_1K-4 1.85MB/s ± 0% 1.82MB/s ± 3% -1.48% (p=0.001 n=27+30)
RegexpMatchHard_32-4 1.03MB/s ± 4% 1.05MB/s ± 1% +1.51% (p=0.027 n=30+23)
RegexpMatchHard_1K-4 1.08MB/s ± 4% 1.10MB/s ± 0% +1.69% (p=0.002 n=30+25)
Revcomp-4 33.0MB/s ± 4% 33.0MB/s ± 4% ~ (p=0.272 n=29+30)
Template-4 2.21MB/s ± 0% 2.21MB/s ± 0% ~ (all equal)
[Geo mean] 7.75MB/s 7.77MB/s +0.29%
3. There is little regression in the compilecmp benchmark.
name old time/op new time/op delta
Template 2.28s ± 3% 2.28s ± 4% ~ (p=0.739 n=10+10)
Unicode 1.34s ± 4% 1.32s ± 3% ~ (p=0.113 n=10+9)
GoTypes 8.10s ± 3% 8.18s ± 3% ~ (p=0.393 n=10+10)
Compiler 39.0s ± 3% 39.2s ± 3% ~ (p=0.393 n=10+10)
SSA 114s ± 3% 115s ± 2% ~ (p=0.631 n=10+10)
Flate 1.41s ± 2% 1.42s ± 3% ~ (p=0.353 n=10+10)
GoParser 1.81s ± 1% 1.83s ± 2% ~ (p=0.211 n=10+9)
Reflect 5.06s ± 2% 5.06s ± 2% ~ (p=0.912 n=10+10)
Tar 2.19s ± 3% 2.20s ± 3% ~ (p=0.247 n=10+10)
XML 2.65s ± 2% 2.67s ± 5% ~ (p=0.796 n=10+10)
[Geo mean] 4.92s 4.93s +0.27%
name old user-time/op new user-time/op delta
Template 2.81s ± 2% 2.81s ± 3% ~ (p=0.971 n=10+10)
Unicode 1.70s ± 3% 1.67s ± 5% ~ (p=0.315 n=10+10)
GoTypes 9.71s ± 1% 9.78s ± 1% +0.71% (p=0.023 n=10+10)
Compiler 47.3s ± 1% 47.1s ± 3% ~ (p=0.579 n=10+10)
SSA 143s ± 2% 143s ± 2% ~ (p=0.280 n=10+10)
Flate 1.70s ± 3% 1.71s ± 3% ~ (p=0.481 n=10+10)
GoParser 2.21s ± 3% 2.21s ± 1% ~ (p=0.549 n=10+9)
Reflect 5.89s ± 1% 5.87s ± 2% ~ (p=0.739 n=10+10)
Tar 2.66s ± 2% 2.63s ± 2% ~ (p=0.105 n=10+10)
XML 3.16s ± 3% 3.18s ± 2% ~ (p=0.143 n=10+10)
[Geo mean] 5.97s 5.97s -0.06%
name old text-bytes new text-bytes delta
HelloSize 637kB ± 0% 637kB ± 0% ~ (all equal)
name old data-bytes new data-bytes delta
HelloSize 9.46kB ± 0% 9.46kB ± 0% ~ (all equal)
name old bss-bytes new bss-bytes delta
HelloSize 125kB ± 0% 125kB ± 0% ~ (all equal)
name old exe-bytes new exe-bytes delta
HelloSize 1.24MB ± 0% 1.24MB ± 0% ~ (all equal)
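For reference, the two source patterns named at the top of this message, (x ^ ^y) and (x | ^y), written as Go functions. The function names are hypothetical, and the sketch only shows the arithmetic; it does not verify which instructions the compiler selects.

```go
package main

import "fmt"

// eon computes x XOR (NOT y), the pattern the message associates with EON.
func eon(x, y uint64) uint64 { return x ^ ^y }

// orn computes x OR (NOT y), the pattern the message associates with ORN.
func orn(x, y uint64) uint64 { return x | ^y }

func main() {
	x, y := uint64(0xC), uint64(0xA)
	fmt.Printf("eon=%#x orn=%#x\n", eon(x, y), orn(x, y))
}
```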
Change-Id: Ie27357d65c5ce9d07afdffebe1e2daadcaa3369f
Reviewed-on: https://go-review.googlesource.com/97036
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-28 23:42:40 +00:00
Balaram Makam
094258408d
cmd/compile: improve fractional word zeroing
...
This change improves fractional word zeroing by
using overlapping MOVDs for the fractions.
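The overlapping-store idea can be sketched in Go: a region whose size is not a multiple of 8 bytes is cleared with two 8-byte stores, the second one aligned to the end of the region so that it overlaps the first. The function names and the 8 to 16 byte range are assumptions for illustration; the message does not state which sizes the compiler handles this way.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// zeroOverlapping clears b, whose length must be between 8 and 16 bytes,
// with two 8-byte stores. The second store ends exactly at the end of b
// and overlaps the first whenever len(b) < 16.
func zeroOverlapping(b []byte) {
	n := len(b)
	if n < 8 || n > 16 {
		panic("zeroOverlapping: length must be in [8, 16]")
	}
	binary.LittleEndian.PutUint64(b[:8], 0)
	binary.LittleEndian.PutUint64(b[n-8:], 0)
}

// check zeroes an n-byte window inside a 0xFF-filled buffer and reports
// whether the window is all zero and the guard bytes are untouched.
func check(n int) bool {
	buf := make([]byte, n+2)
	for i := range buf {
		buf[i] = 0xFF
	}
	zeroOverlapping(buf[1 : 1+n])
	if buf[0] != 0xFF || buf[n+1] != 0xFF {
		return false
	}
	for _, c := range buf[1 : 1+n] {
		if c != 0 {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(check(13))
}
```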
Performance of go1 benchmarks on Amberwing was all noise:
name old time/op new time/op delta
RegexpMatchEasy0_32 247ns ± 0% 246ns ± 0% -0.40% (p=0.008 n=5+5)
RegexpMatchEasy0_1K 581ns ± 0% 579ns ± 0% -0.34% (p=0.000 n=5+4)
RegexpMatchEasy1_32 244ns ± 0% 242ns ± 0% ~ (p=0.079 n=4+5)
RegexpMatchEasy1_1K 804ns ± 0% 805ns ± 0% ~ (p=0.238 n=5+4)
RegexpMatchMedium_32 313ns ± 0% 311ns ± 0% -0.64% (p=0.008 n=5+5)
RegexpMatchMedium_1K 52.2µs ± 0% 51.9µs ± 0% -0.52% (p=0.016 n=5+4)
RegexpMatchHard_32 2.75µs ± 0% 2.74µs ± 0% ~ (p=0.603 n=5+5)
RegexpMatchHard_1K 78.8µs ± 0% 78.9µs ± 0% +0.05% (p=0.008 n=5+5)
FmtFprintfEmpty 58.6ns ± 0% 58.6ns ± 0% ~ (p=0.159 n=5+5)
FmtFprintfString 118ns ± 0% 119ns ± 0% +0.85% (p=0.008 n=5+5)
FmtFprintfInt 119ns ± 0% 123ns ± 0% +3.36% (p=0.016 n=5+4)
FmtFprintfIntInt 192ns ± 0% 200ns ± 0% +4.17% (p=0.008 n=5+5)
FmtFprintfPrefixedInt 224ns ± 0% 209ns ± 0% -6.70% (p=0.008 n=5+5)
FmtFprintfFloat 335ns ± 0% 335ns ± 0% ~ (all equal)
FmtManyArgs 775ns ± 0% 811ns ± 1% +4.67% (p=0.016 n=4+5)
Gzip 437ms ± 0% 438ms ± 0% +0.19% (p=0.008 n=5+5)
HTTPClientServer 88.7µs ± 1% 90.3µs ± 1% +1.75% (p=0.016 n=5+5)
JSONEncode 20.1ms ± 1% 20.1ms ± 0% ~ (p=1.000 n=5+5)
JSONDecode 94.7ms ± 1% 94.8ms ± 1% ~ (p=0.548 n=5+5)
GobDecode 12.8ms ± 1% 12.8ms ± 1% ~ (p=0.548 n=5+5)
GobEncode 12.1ms ± 0% 12.1ms ± 0% ~ (p=0.151 n=5+5)
Mandelbrot200 5.37ms ± 0% 5.37ms ± 0% -0.03% (p=0.008 n=5+5)
TimeParse 450ns ± 0% 451ns ± 1% ~ (p=0.635 n=4+5)
TimeFormat 485ns ± 0% 484ns ± 0% ~ (p=0.508 n=5+5)
Template 90.4ms ± 0% 90.2ms ± 0% -0.24% (p=0.016 n=5+5)
GoParse 5.98ms ± 0% 5.98ms ± 0% ~ (p=1.000 n=5+5)
BinaryTree17 11.8s ± 0% 11.8s ± 0% ~ (p=0.841 n=5+5)
Revcomp 669ms ± 0% 669ms ± 0% ~ (p=0.310 n=5+5)
Fannkuch11 3.28s ± 0% 3.34s ± 0% +1.64% (p=0.008 n=5+5)
name old speed new speed delta
RegexpMatchEasy0_32 129MB/s ± 0% 130MB/s ± 0% +0.30% (p=0.016 n=4+5)
RegexpMatchEasy0_1K 1.76GB/s ± 0% 1.77GB/s ± 0% +0.27% (p=0.016 n=5+4)
RegexpMatchEasy1_32 131MB/s ± 0% 132MB/s ± 0% +0.71% (p=0.016 n=4+5)
RegexpMatchEasy1_1K 1.27GB/s ± 0% 1.27GB/s ± 0% -0.17% (p=0.016 n=5+4)
RegexpMatchMedium_32 3.19MB/s ± 0% 3.21MB/s ± 0% +0.63% (p=0.008 n=5+5)
RegexpMatchMedium_1K 19.6MB/s ± 0% 19.7MB/s ± 0% +0.52% (p=0.016 n=5+4)
RegexpMatchHard_32 11.7MB/s ± 0% 11.7MB/s ± 0% ~ (p=0.643 n=5+5)
RegexpMatchHard_1K 13.0MB/s ± 0% 13.0MB/s ± 0% ~ (p=0.079 n=4+5)
Gzip 44.4MB/s ± 0% 44.3MB/s ± 0% -0.19% (p=0.008 n=5+5)
JSONEncode 96.3MB/s ± 1% 96.4MB/s ± 0% ~ (p=1.000 n=5+5)
JSONDecode 20.5MB/s ± 1% 20.5MB/s ± 1% ~ (p=0.460 n=5+5)
GobDecode 60.1MB/s ± 1% 59.9MB/s ± 1% ~ (p=0.548 n=5+5)
GobEncode 63.5MB/s ± 0% 63.7MB/s ± 0% ~ (p=0.135 n=5+5)
Template 21.5MB/s ± 0% 21.5MB/s ± 0% +0.24% (p=0.016 n=5+5)
GoParse 9.68MB/s ± 0% 9.69MB/s ± 0% ~ (p=0.786 n=5+5)
Revcomp 380MB/s ± 0% 380MB/s ± 0% ~ (p=0.310 n=5+5)
Change-Id: I596eee6421cdbad1a0189cdb9fe0628bba534eaf
Reviewed-on: https://go-review.googlesource.com/96775
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-28 23:28:39 +00:00
Hana Kim
413d8a833d
cmd/trace: skip tests if parsing fails with timestamp error
...
runtime/trace test already skips tests in case of the timestamp
error.
Moreover, relax the TestAnalyzeAnnotationGC test condition to
deal with the inaccuracy caused by the use of cputicks in tracing.
Fixes #24081
Updates #16755
Change-Id: I708ecc6da202eaec07e431085a75d3dbfbf4cc06
Reviewed-on: https://go-review.googlesource.com/97757
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-28 22:09:34 +00:00
Matthew Dempsky
b3f00c6985
cmd/compile: fix unexpected type alias crash
...
OCOMPLIT stores the pre-typechecked type in n.Right, and then moves it
to n.Type. However, it wasn't clearing n.Right, so n.Right continued
to point to the OTYPE node. (Exception: slice literals reused n.Right
to store the array length.)
When exporting inline function bodies, we don't expect to need to save
any type aliases. Doing so wouldn't be wrong per se, but it's
completely unnecessary and would just bloat the export data.
However, reexportdep (whose role is to identify types needed by inline
function bodies) uses a generic tree traversal mechanism, which visits
n.Right even for O{ARRAY,MAP,STRUCT}LIT nodes. This means it finds the
OTYPE node and mistakenly concludes that the type alias needs to be
exported.
The straightforward fix is to just clear n.Right when typechecking
composite literals.
Fixes #24173 .
Change-Id: Ia2d556bfdd806c83695b08e18b6cd71eff0772fc
Reviewed-on: https://go-review.googlesource.com/97719
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-02-28 20:18:37 +00:00
Daniel Martí
1e308fbc1a
cmd/compile: improved error message when calling a shadowed builtin
...
Otherwise, the error can be confusing if one forgets or doesn't know
that the builtin is being shadowed, which is not common practice.
Fixes #22822 .
Change-Id: I735393b5ce28cb83815a1c3f7cd2e7bb5080a32d
Reviewed-on: https://go-review.googlesource.com/97455
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-02-28 19:39:52 +00:00
Adam Langley
4b1d704d14
crypto/x509: parse invalid DNS names and email addresses.
...
Go 1.10 requires that SANs in certificates are valid. However, a
non-trivial number of (generally non-WebPKI) certificates have invalid
strings in dnsName fields and some have even put those dnsName SANs in
CA certificates.
This change defers validity checking until name constraints are checked.
Fixes #23995, #23711.
Change-Id: I2e0ebb0898c047874a3547226b71e3029333b7f1
Reviewed-on: https://go-review.googlesource.com/96378
Run-TryBot: Adam Langley <agl@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-02-28 19:14:11 +00:00
Robert Griesemer
c1359db9cc
go/types: fix empty interface optimization (minor performance bug)
...
The tests that check for empty interfaces, so that they can be
fast-tracked in the code, didn't actually test the right field, so the
fast-track code never executed. It does now.
Change-Id: I58b2951efb3fb40b3366874c79fd653591ae0e99
Reviewed-on: https://go-review.googlesource.com/97519
Reviewed-by: Alan Donovan <adonovan@google.com>
2018-02-28 18:22:21 +00:00
Robert Griesemer
e2b5e6038b
go/types: fix incorrect context when type-checking interfaces
...
Regression introduced by https://go-review.googlesource.com/c/go/+/79575,
which was meant to be more conservative but ended up destroying an
important context.
Fixes #24140.
Change-Id: Id428dbb295ce9f11ab7cd54ec5ab51ef4291ac3f
Reviewed-on: https://go-review.googlesource.com/97535
Reviewed-by: Alan Donovan <adonovan@google.com>
2018-02-28 18:22:16 +00:00
Josh Bleecher Snyder
91a05b92be
cmd/compile: prevent memmove in copy when dst == src
...
This causes a nominal increase in binary size.
name old object-bytes new object-bytes delta
Template 399kB ± 0% 399kB ± 0% ~ (all equal)
Unicode 207kB ± 0% 207kB ± 0% ~ (all equal)
GoTypes 1.23MB ± 0% 1.23MB ± 0% ~ (all equal)
Compiler 4.35MB ± 0% 4.35MB ± 0% +0.01% (p=0.008 n=5+5)
SSA 9.77MB ± 0% 9.77MB ± 0% +0.00% (p=0.008 n=5+5)
Flate 236kB ± 0% 236kB ± 0% +0.04% (p=0.008 n=5+5)
GoParser 298kB ± 0% 298kB ± 0% ~ (all equal)
Reflect 1.03MB ± 0% 1.03MB ± 0% +0.01% (p=0.008 n=5+5)
Tar 333kB ± 0% 334kB ± 0% +0.22% (p=0.008 n=5+5)
XML 414kB ± 0% 414kB ± 0% +0.02% (p=0.008 n=5+5)
[Geo mean] 730kB 731kB +0.03%
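A minimal sketch of the idea in the commit title, not the compiler's actual generated code: before moving memory, compare the start of the destination with the start of the source and skip the move when they are the same. The names copyInts and moves are hypothetical, and the built-in copy stands in for memmove.

```go
package main

import "fmt"

// moves counts how often the stand-in for memmove actually runs.
// It exists only so the effect of the guard is observable.
var moves int

// copyInts is a hypothetical stand-in for the built-in copy on []int.
// Like copy, it returns the number of elements copied.
func copyInts(dst, src []int) int {
	n := len(dst)
	if len(src) < n {
		n = len(src)
	}
	if n == 0 {
		return 0
	}
	// Guard: if both slices start at the same element, the data is
	// already in place and the move can be skipped.
	if &dst[0] != &src[0] {
		moves++
		copy(dst[:n], src[:n]) // stand-in for memmove
	}
	return n
}

func main() {
	a := []int{1, 2, 3, 4}
	b := make([]int, 4)

	n := copyInts(b, a) // distinct backing arrays: the move runs
	fmt.Println(n, moves)

	n = copyInts(a, a[:2]) // same starting element: the move is skipped
	fmt.Println(n, moves)
}
```

The extra pointer comparison is the kind of added instruction that would account for a small growth in object size such as the one reported in the table above.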
Change-Id: I381809fd9cfbfd6db44bd342b06285e62a3a21f1
Reviewed-on: https://go-review.googlesource.com/94596
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-02-28 17:37:22 +00:00
Richard Miller
a379b7d9ac
syscall: reduce redundant getwd tracking in Plan 9
...
In Plan 9, each M is implemented as a separate OS process with
its own working directory. To keep the wd consistent across
goroutines (or rescheduling of the same goroutine), CL 6350
introduced a Fixwd procedure which checks using getwd and calls
chdir if necessary before any syscall operating on a pathname.
This wd checking will not be necessary if the pathname is absolute
(starts with '/' or '#'). Getwd is a fairly expensive operation
in Plan 9 (implemented by opening "." and calling Fd2path on the
file descriptor). Eliminating the redundant getwd calls can
significantly reduce overhead for common operations like
"dist test --list" which perform many syscalls on absolute pathnames.
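The check described above can be sketched as follows. This is an illustration only, not the syscall package's code: isRooted, beforePathSyscall, and wdChecks are hypothetical names, and the counter merely stands in for the expensive getwd/chdir work done by Fixwd.

```go
package main

import "fmt"

// wdChecks counts how often the (hypothetical) expensive working
// directory check would run.
var wdChecks int

// isRooted reports whether a Plan 9 pathname is absolute, that is,
// whether it starts with '/' or '#'.
func isRooted(path string) bool {
	return len(path) > 0 && (path[0] == '/' || path[0] == '#')
}

// beforePathSyscall is a hypothetical hook run before any syscall that
// takes a pathname. Only relative pathnames depend on the working
// directory, so only they trigger the check.
func beforePathSyscall(path string) {
	if isRooted(path) {
		return
	}
	wdChecks++ // stand-in for the getwd comparison and possible chdir
}

func main() {
	for _, p := range []string{"/usr/glenda", "#c/cons", "lib/profile", "."} {
		beforePathSyscall(p)
		fmt.Printf("%-12s rooted=%v\n", p, isRooted(p))
	}
	fmt.Println("wd checks:", wdChecks)
}
```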
Updates #9428.
Change-Id: I13fd9380779de27b0ac2f2b488229778d6839255
Reviewed-on: https://go-review.googlesource.com/97675
Reviewed-by: David du Colombier <0intro@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: David du Colombier <0intro@gmail.com>
2018-02-28 16:26:49 +00:00
Richard Miller
c2cdfbd1a7
runtime: don't try to shrink address space with brk in Plan 9
...
Plan 9 won't let brk shrink the data segment if it's shared with
other processes (which it is in the Go runtime). So we keep track
of the notional end of the segment as it moves up and down, and
call brk only when it grows.
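A small model of that bookkeeping, under stated assumptions: the segment type and its fields are hypothetical, and brkCalls stands in for the real brk system call. The notional end may move in either direction, while the real end of the segment only ever moves up.

```go
package main

import "fmt"

// segment models a data segment that the OS will not shrink.
type segment struct {
	notional uintptr // end of the segment as the allocator sees it
	actual   uintptr // highest end ever requested from the OS
	brkCalls int     // how many times the stand-in for brk ran
}

// setEnd moves the notional end to addr. The stand-in for brk runs only
// when addr lies beyond what the OS has already granted.
func (s *segment) setEnd(addr uintptr) {
	if addr > s.actual {
		s.brkCalls++ // stand-in for the brk system call
		s.actual = addr
	}
	s.notional = addr
}

func main() {
	s := &segment{notional: 0x1000, actual: 0x1000}
	s.setEnd(0x3000) // grow: brk is called
	s.setEnd(0x2000) // shrink: bookkeeping only
	s.setEnd(0x2800) // regrow within the granted range: no brk
	s.setEnd(0x4000) // grow past the high-water mark: brk is called
	fmt.Printf("notional=%#x actual=%#x brk calls=%d\n",
		s.notional, s.actual, s.brkCalls)
}
```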
Corrects CL 94776.
Updates #23860.
Fixes #24013.
Change-Id: I754232decab81dfd71d690f77ee6097a17d9be11
Reviewed-on: https://go-review.googlesource.com/97595
Reviewed-by: David du Colombier <0intro@gmail.com>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: David du Colombier <0intro@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-28 15:57:10 +00:00
Rob Pike
1be58dcda8
doc/faq: add a Q&A about virus scanners
...
Fixes #23759.
Change-Id: I0407ebfea507991fc205f7b04bc5798808a5c5f6
Reviewed-on: https://go-review.googlesource.com/97496
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: David Symonds <dsymonds@golang.org>
2018-02-28 05:55:31 +00:00
Robert Griesemer
0c884d0810
cmd/compile, cmd/compile/internal/syntax: print relative column info
...
This change enables printing of relative column information if a
prior line directive specified a valid column. If there was no
line directive, or the line directive didn't specify a column
(or the -C flag is specified), no column information is shown in
file positions.
Implementation: Column values (and line values, for that matter)
that are zero are interpreted as "unknown". A line directive that
doesn't specify a column records that as a zero column in the
respective PosBase data structure. When computing relative columns,
a relative value is zero if the base's column value is zero.
When formatting a position, a zero column value is not printed.
To make this work without special cases, the PosBase for a file
is given a concrete (non-0:0) position 1:1 with the PosBase's
line and column also being 1:1. In other words, at the position
1:1 of a file, its relative positions start at 1:1, as one would
expect.
In the package syntax, this requires self-recursive PosBases for
file bases, matching what cmd/internal/src.PosBase was already
doing. In src.PosBase, file and inlining bases also need to be
based at 1:1 to indicate "known" positions.
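The rules above can be sketched with a simplified position base. This is an assumption-laden illustration, not the code in the syntax or src packages: posBase, fileBase, relLine, relCol, and format are hypothetical names, and the arithmetic is one plausible reading of the description (zero means unknown, a zero base column yields a zero relative column, and a zero column is not printed).

```go
package main

import "fmt"

// posBase is a hypothetical, simplified position base.
type posBase struct {
	filename string
	atLine   uint // absolute line where the base takes effect
	atCol    uint // absolute column where the base takes effect
	line     uint // line assigned to that position
	col      uint // column assigned to that position; 0 means unknown
}

// fileBase returns the base for a plain file: position 1:1 maps to 1:1.
func fileBase(name string) posBase {
	return posBase{filename: name, atLine: 1, atCol: 1, line: 1, col: 1}
}

// relLine returns the line relative to the base.
func relLine(b posBase, line uint) uint {
	return b.line + (line - b.atLine)
}

// relCol returns the column relative to the base, or 0 if unknown.
func relCol(b posBase, line, col uint) uint {
	if b.col == 0 {
		return 0 // base column unknown, so the relative column is too
	}
	if line == b.atLine {
		return b.col + (col - b.atCol) // same line as the base: offset it
	}
	return col
}

// format prints file:line:col, omitting a zero (unknown) column.
func format(b posBase, line, col uint) string {
	l, c := relLine(b, line), relCol(b, line, col)
	if c == 0 {
		return fmt.Sprintf("%s:%d", b.filename, l)
	}
	return fmt.Sprintf("%s:%d:%d", b.filename, l, c)
}

func main() {
	withCol := posBase{filename: "gen.go", atLine: 11, atCol: 1, line: 100, col: 5}
	noCol := posBase{filename: "gen.go", atLine: 11, atCol: 1, line: 100}

	fmt.Println(format(fileBase("f.go"), 3, 7)) // plain file position
	fmt.Println(format(withCol, 11, 4))         // directive with a column
	fmt.Println(format(noCol, 11, 4))           // directive without a column
}
```

Basing files at 1:1 is what lets the same arithmetic serve both plain files and line directives without a special case.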
This change completes the cmd/compile part of the issue below.
Fixes #22662.
Change-Id: I6c3d2dee26709581fba0d0261b1d12e93f1cba1a
Reviewed-on: https://go-review.googlesource.com/97375
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-02-28 03:51:23 +00:00
Hana Kim
b5bd5bfbc7
cmd/trace: fix overlappingDuration
...
Update #24081
Change-Id: Ieccfb03c51e86f35d4629a42959c80570bd93c33
Reviewed-on: https://go-review.googlesource.com/97555
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-28 02:42:15 +00:00
Heschi Kreinick
f8973fcafb
cmd/link: revert CL 89535: "fix up location lists for dsymutil"
...
This reverts commit 230b0bad1f.
Reason for revert: breaking the build.
Fixes #24165
Change-Id: I9d8dda59f97a47e5c436f1c061b34ced82bde8ec
Reviewed-on: https://go-review.googlesource.com/97575
Run-TryBot: Heschi Kreinick <heschi@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-28 01:53:43 +00:00
Kunpei Sakai
21343e07d6
cmd/compile: remove duplicates by using finishcompare
...
Updates #23834
Change-Id: If05001f9fd6b97d72069f440102eec6e371908dd
Reviewed-on: https://go-review.googlesource.com/97016
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-02-28 00:50:06 +00:00
Michael Fraenkel
a375a6b363
cmd/compile: convert untyped bool during walkCases
...
Updates #23834.
Change-Id: I1789525a992d37aae9e9b69c1e9d91437d3d0d3b
Reviewed-on: https://go-review.googlesource.com/97001
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-02-27 23:26:36 +00:00