And delete them from asm_test.
Change-Id: I34fcf85ae8ce09cd146fe4ce6a0ae7616bd97e2d
Reviewed-on: https://go-review.googlesource.com/102296
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
As pointed out by Josh Bleecher Snyder in CL 99780.
The check is for GOARM > 6, so suggest to recompile with either GOARM=5
or GOARM=6.
Change-Id: I6a97e87bdc17aa3932f5c8cb598bba85c3cf4be9
Reviewed-on: https://go-review.googlesource.com/101936
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Sometimes, we can end up calling update with a self-relation
about a variable (x REL x). In this case, there is no need
to record anything: the relation is unsatisfiable if and only
if it doesn't contain eq.
This also helps avoiding infinite loop in next CL that will
introduce transitive closure of relations.
Passes toolstash -cmp.
Change-Id: Ic408452ec1c13653f22ada35466ec98bc14aaa8e
Reviewed-on: https://go-review.googlesource.com/100276
Reviewed-by: Austin Clements <austin@google.com>
When an unsatisfiable relation is recorded in the facts table,
there is no need to compute further relations or updates
additional data structures.
Since we're about to transitively propagate relations, make
sure to fail as fast as possible to avoid doing useless work
in dead branches.
Passes toolstash -cmp.
Change-Id: I23eed376d62776824c33088163c7ac9620abce85
Reviewed-on: https://go-review.googlesource.com/100275
Reviewed-by: Austin Clements <austin@google.com>
The ssa.Func has Type field that is described as
function signature type.
It never gets any value and remains nil.
This leads to "<T>" signature printed representation.
Given this function declaration:
func foo(x int, f func() string) (int, error)
GOSSAFUNC printed it as below:
compiling foo
foo <T>
After this change:
compiling foo
foo func(int, func() string) (int, error)
Change-Id: Iec5eec8aac5c76ff184659e30f41b2f5fe86d329
Reviewed-on: https://go-review.googlesource.com/102375
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
By always writing out pack files, the object file format can be
simplified somewhat. In particular, the export data format will no
longer require escaping, because the pack file provides appropriate
framing.
This CL does not affect build systems that use -pack, which includes
all major Go build systems (cmd/go, gb, bazel).
Also, existing package import logic already distinguishes pack/object
files based on file contents rather than file extension.
The only exception is cmd/pack, which specially handled object files
created by cmd/compile when used with the 'c' mode. This mode is
extended to now recognize the pack files produced by cmd/compile and
handle them as before.
Passes toolstash-check.
Updates #21705.
Updates #24512.
Change-Id: Idf131013bfebd73a5cde7e087eb19964503a9422
Reviewed-on: https://go-review.googlesource.com/102236
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The __.PKGDEF file is a compiler object file only intended for other
compilers. Also, for build systems that use -linkobj, all of the
information it contains is present within the linker object files
already, so look for it there instead.
This requires a little bit of code reorganization. Significantly,
previously when loading an archive file, the __.PKGDEF file was
authoritative on whether the package was "main" and/or "safe". Now
that we're using the Go object files instead, there's the issue that
there can be multiple Go object files in an archive (because when
using assembly, each assembly file becomes its own additional object
file).
The solution taken here is to check if any object file within the
package declares itself as "main" and/or "safe".
Updates #24512.
Change-Id: I70243a293bdf34b8555c0bf1833f8933b2809449
Reviewed-on: https://go-review.googlesource.com/102281
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This is currently always the case because loadobjfile complains if
it's not, but that will be changed soon.
Updates #24512.
Change-Id: I262daca765932a0f4cea3fcc1cc80ca90de07a59
Reviewed-on: https://go-review.googlesource.com/102280
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
AT_HWCAP is not available on FreeBSD-11.1-RELEASE or earlier and the wrong const was used.
Use the correct value, and initialize hwcap with ^uint32(0) inorder not to fail the VFP tests.
Fixes#24507.
Change-Id: I5c3eed57bb53bf992b7de0eec88ea959806306b9
Reviewed-on: https://go-review.googlesource.com/102355
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The compiler can't currently figure out that it can eliminate both c.s
loads (using store to load forwarding) in the second line of the
following code:
...
c.s[i], c.s[j] = c.s[j], c.s[i]
x := c.s[j] + c.s[i]
...
The compiler eliminates the second load of c.s[j] (using the original
value of c.s[i]), however the load of c.s[i] remains because the compiler
doesn't know that c.s[i] and c.s[j] either overlap completely or not at
all.
Introducing temporaries to make this explicit improves the performance
of the generic code slightly, the goal being to remove the assembly in
this package in the future. This change also hoists a bounds check out
of the main loop which gives a slight performance boost and also makes
the behaviour identical to the assembly implementation when len(dst) <
len(src).
name old speed new speed delta
RC4_128-4 491MB/s ± 3% 596MB/s ± 5% +21.51% (p=0.000 n=9+9)
RC4_1K-4 504MB/s ± 2% 616MB/s ± 1% +22.33% (p=0.000 n=10+10)
RC4_8K-4 509MB/s ± 1% 630MB/s ± 2% +23.85% (p=0.000 n=8+9)
Change-Id: I27adc775713b2e74a1a94e0c1de0909fb4379463
Reviewed-on: https://go-review.googlesource.com/102335
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This reverts commit c6e69ec7f9.
Reason for revert: Broke builders. #24508
Change-Id: I66abff0dd14ec6e1f8d8d982ccfb0438633b639d
Reviewed-on: https://go-review.googlesource.com/102316
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
I don't know if I got lost in the old PKCS documents, or whether this is
a case where reality diverges from the spec, but OpenSSL clearly stuffs
PKIX Extension objects in CSR attributues directly[1].
In either case, doing what OpenSSL does seems valid here and allows the
critical flag in extensions to be serialised.
Fixes#13739.
[1] e3713c365c/crypto/x509/x509_req.c (L173)
Change-Id: Ic1e73ba9bd383a357a2aa8fc4f6bd76811bbefcc
Reviewed-on: https://go-review.googlesource.com/70851
Run-TryBot: Adam Langley <agl@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
This change implement keying material export as described in:
https://tools.ietf.org/html/rfc5705
I verified the implementation against openssl s_client and openssl
s_server.
Change-Id: I4dcdd2fb929c63ab4e92054616beab6dae7b1c55
Signed-off-by: Mike Danese <mikedanese@google.com>
Reviewed-on: https://go-review.googlesource.com/85115
Run-TryBot: Adam Langley <agl@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Adam Langley <agl@golang.org>
Slightly simplifies the code. Made sure to exclude the cases that would
change behavior, such as when the iterated value is a string, when the
index is modified within the body, or when the slice is modified.
Also checked that all the elements are of pointer type, to avoid the
corner case where non-pointer types could be copied by mistake.
Change-Id: Iea64feb2a9a6a4c94ada9ff3ace40ee173505849
Reviewed-on: https://go-review.googlesource.com/100557
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This experiment has gone stale. It causes a type-checking failure
because the condition of the OIF produced by range loop lowering has
type "untyped bool". Fix this by typechecking the whole OIF statement,
not just its condition.
This doesn't quite fix the whole experiment, but it gets further.
Something about preemption point insertion is causing failures like
"internal compiler error: likeliness prediction 1 for block b10 with 1
successors" in cmd/compile/internal/gc.
Change-Id: I7d80d618d7c91c338bf5f2a8dc174d582a479df3
Reviewed-on: https://go-review.googlesource.com/102157
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Move currently uses mov instructions directly up to 31 bytes and then
switches to duffcopy. Moving 31 bytes is 4 instructions corresponding to
two loads and two stores, (or 6 if !useSSE) depending on the usage,
duffcopy is five (one or two mov, two or three lea, one call).
This adds direct mov instructions for Move's of size 32, 48, and 64 with
sse and for only size 32 without.
With useSSE:
- 32 is 4 instructions (byte +/- comparison below)
- 33 thru 48 is 6
- 49 thru 64 is 8
Without:
- 32 is 8
Note that the only platform with useSSE set to false is plan 9. I have
built three projects based off tip and tip with this patch and the
project's byte size is equal to or less than they were prior.
The basis of this change is that copying data with instructions directly
is nearly free, whereas calling into duffcopy adds a bit of overhead.
This is most noticeable in range statements where elements are 32+
bytes. For code with the following pattern:
func Benchmark32Range(b *testing.B) {
var f s32
for _, count := range []int{10, 100, 1000, 10000} {
name := strconv.Itoa(count)
b.Run(name, func(b *testing.B) {
base := make([]s32, count)
for i := 0; i < b.N; i++ {
for _, v := range base {
f = v
}
}
})
}
_ = f
}
These are the resulting benchmarks:
Benchmark16Range/10-4 19.1 19.1 +0.00%
Benchmark16Range/100-4 169 170 +0.59%
Benchmark16Range/1000-4 1684 1691 +0.42%
Benchmark16Range/10000-4 18147 18124 -0.13%
Benchmark31Range/10-4 141 142 +0.71%
Benchmark31Range/100-4 1407 1410 +0.21%
Benchmark31Range/1000-4 14070 14074 +0.03%
Benchmark31Range/10000-4 141781 141759 -0.02%
Benchmark32Range/10-4 71.4 32.2 -54.90%
Benchmark32Range/100-4 695 326 -53.09%
Benchmark32Range/1000-4 7166 3313 -53.77%
Benchmark32Range/10000-4 72571 35425 -51.19%
Benchmark64Range/10-4 87.8 64.9 -26.08%
Benchmark64Range/100-4 868 629 -27.53%
Benchmark64Range/1000-4 9355 6907 -26.17%
Benchmark64Range/10000-4 94463 70385 -25.49%
Benchmark79Range/10-4 177 152 -14.12%
Benchmark79Range/100-4 1769 1531 -13.45%
Benchmark79Range/1000-4 17893 15532 -13.20%
Benchmark79Range/10000-4 178947 155551 -13.07%
Benchmark80Range/10-4 99.6 99.7 +0.10%
Benchmark80Range/100-4 987 985 -0.20%
Benchmark80Range/1000-4 10573 10560 -0.12%
Benchmark80Range/10000-4 106792 106639 -0.14%
For runtime's BenchCopyFat* benchmarks:
CopyFat8-4 0.40ns ± 0% 0.40ns ± 0% ~ (all equal)
CopyFat12-4 0.40ns ± 0% 0.80ns ± 0% +100.00% (p=0.000 n=9+9)
CopyFat16-4 0.40ns ± 0% 0.80ns ± 0% +100.00% (p=0.000 n=10+8)
CopyFat24-4 0.80ns ± 0% 0.40ns ± 0% -50.00% (p=0.001 n=8+9)
CopyFat32-4 2.01ns ± 0% 0.40ns ± 0% -80.10% (p=0.000 n=8+8)
CopyFat64-4 2.87ns ± 0% 0.40ns ± 0% -86.07% (p=0.000 n=8+10)
CopyFat128-4 4.82ns ± 0% 4.82ns ± 0% ~ (p=1.000 n=8+8)
CopyFat256-4 8.83ns ± 0% 8.83ns ± 0% ~ (p=1.000 n=8+8)
CopyFat512-4 16.9ns ± 0% 16.9ns ± 0% ~ (all equal)
CopyFat520-4 14.6ns ± 0% 14.6ns ± 1% ~ (p=0.529 n=8+9)
CopyFat1024-4 32.9ns ± 0% 33.0ns ± 0% +0.20% (p=0.041 n=8+9)
Function calls are not benefitted as much due how they are compiled, but
other benchmarks I ran show that calling function with 64 byte elements
is marginally improved.
The main downside with this change is that it may increase binary sizes
depending on the size of the copy, but this change also decreases
binaries for moves of 48 bytes or less.
For the following code:
package main
type size [32]byte
//go:noinline
func use(t size) {
}
//go:noinline
func get() size {
var z size
return z
}
func main() {
var a size
use(a)
}
Changing size around gives the following assembly leading up to the call
(the initialization and actual call are removed):
tip func call with 32B arg: 27B
48 89 e7 mov %rsp,%rdi
48 8d 74 24 20 lea 0x20(%rsp),%rsi
48 89 6c 24 f0 mov %rbp,-0x10(%rsp)
48 8d 6c 24 f0 lea -0x10(%rsp),%rbp
e8 53 ab ff ff callq 448964 <runtime.duffcopy+0x364>
48 8b 6d 00 mov 0x0(%rbp),%rbp
modified: 19B (-8B)
0f 10 44 24 20 movups 0x20(%rsp),%xmm0
0f 11 04 24 movups %xmm0,(%rsp)
0f 10 44 24 30 movups 0x30(%rsp),%xmm0
0f 11 44 24 10 movups %xmm0,0x10(%rsp)
-
tip with 47B arg: 29B
48 8d 7c 24 0f lea 0xf(%rsp),%rdi
48 8d 74 24 40 lea 0x40(%rsp),%rsi
48 89 6c 24 f0 mov %rbp,-0x10(%rsp)
48 8d 6c 24 f0 lea -0x10(%rsp),%rbp
e8 43 ab ff ff callq 448964 <runtime.duffcopy+0x364>
48 8b 6d 00 mov 0x0(%rbp),%rbp
modified: 20B (-9B)
0f 10 44 24 40 movups 0x40(%rsp),%xmm0
0f 11 44 24 0f movups %xmm0,0xf(%rsp)
0f 10 44 24 50 movups 0x50(%rsp),%xmm0
0f 11 44 24 1f movups %xmm0,0x1f(%rsp)
-
tip with 64B arg: 27B
48 89 e7 mov %rsp,%rdi
48 8d 74 24 40 lea 0x40(%rsp),%rsi
48 89 6c 24 f0 mov %rbp,-0x10(%rsp)
48 8d 6c 24 f0 lea -0x10(%rsp),%rbp
e8 1f ab ff ff callq 448948 <runtime.duffcopy+0x348>
48 8b 6d 00 mov 0x0(%rbp),%rbp
modified: 39B [+12B]
0f 10 44 24 40 movups 0x40(%rsp),%xmm0
0f 11 04 24 movups %xmm0,(%rsp)
0f 10 44 24 50 movups 0x50(%rsp),%xmm0
0f 11 44 24 10 movups %xmm0,0x10(%rsp)
0f 10 44 24 60 movups 0x60(%rsp),%xmm0
0f 11 44 24 20 movups %xmm0,0x20(%rsp)
0f 10 44 24 70 movups 0x70(%rsp),%xmm0
0f 11 44 24 30 movups %xmm0,0x30(%rsp)
-
tip with 79B arg: 29B
48 8d 7c 24 0f lea 0xf(%rsp),%rdi
48 8d 74 24 60 lea 0x60(%rsp),%rsi
48 89 6c 24 f0 mov %rbp,-0x10(%rsp)
48 8d 6c 24 f0 lea -0x10(%rsp),%rbp
e8 09 ab ff ff callq 448948 <runtime.duffcopy+0x348>
48 8b 6d 00 mov 0x0(%rbp),%rbp
modified: 46B [+17B]
0f 10 44 24 60 movups 0x60(%rsp),%xmm0
0f 11 44 24 0f movups %xmm0,0xf(%rsp)
0f 10 44 24 70 movups 0x70(%rsp),%xmm0
0f 11 44 24 1f movups %xmm0,0x1f(%rsp)
0f 10 84 24 80 00 00 movups 0x80(%rsp),%xmm0
00
0f 11 44 24 2f movups %xmm0,0x2f(%rsp)
0f 10 84 24 90 00 00 movups 0x90(%rsp),%xmm0
00
0f 11 44 24 3f movups %xmm0,0x3f(%rsp)
So, at best we save 9B, at worst we gain 17. I do not think that copying
around 65+B sized types is common enough to bloat program sizes. Using
bincmp on the go binary itself shows a zero byte difference; there are
gains and losses all over. One of the largest gains in binary size comes
from cmd/go/internal/cache.(*Cache).Get, which passes around a 64 byte
sized type -- this is one of the cases I would expect to be benefitted
by this change.
I think that this marginal improvement in struct copying for 64 byte
structs is worth it: most data structs / work items I use in my programs
are small, but few are smaller than 32 bytes: with one slice, the budget
is up. The 32 rule alone would allow another 16 bytes, the 48 and 64
rules allow another 32 and 48.
Change-Id: I19a8f9190d5d41825091f17f268f4763bfc12a62
Reviewed-on: https://go-review.googlesource.com/100718
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
And remove them from asm_test.
Change-Id: I1ca29b40546d6de06f20bfd550ed8ff87f495454
Reviewed-on: https://go-review.googlesource.com/102115
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
I've reorganized the guide and rewritten large sections.
The structure is now more clear and logical, and can
be understood and navigated using the summary displayed at
the top of the page (before, the summary was confusing because
the guide contained H1s that were being ignored by the summary).
Both the initial onboarding process and the Gerrit
change submission process have been reworked to
include a concise checklist of steps that can be
read and understood in a few seconds, for people
that don't want or need to bother with details.
More in-depth descriptions have been moved into
separate sections, one per each checklist step.
This is by far the biggest improvement, as the previous
approach of having to read several pages just to understand
the requires steps was very scaring for beginners, in
addition of being harder to navigate.
GitHub pull requests have been integrated as a different
way to submit a change, suggested for first time contributors.
The review process has been described in more details,
documenting the workflow and the used conventions.
Most miscellanea have been moved into an "advanced
topics" chapter.
Paragraphs describing how to use git have been removed
to simplify reading. This guide should focus on Go contribution,
and not help users getting familiar with git, for which many
guides are available.
Change-Id: I6f4b76583c9878b230ba1d0225745a1708fad2e8
Reviewed-on: https://go-review.googlesource.com/93495
Reviewed-by: Rob Pike <r@golang.org>
Fixes#14327
Much of the code is based on the linux/amd64 code that implements these
build modes, and code is shared where possible.
Change-Id: Ia510f2023768c0edbc863aebc585929ec593b332
Reviewed-on: https://go-review.googlesource.com/93875
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
For each replacement, test case is added to new 386enc.s file
with exception of EMMS, SYSENTER, MFENCE and LFENCE as they
are already covered in amd64enc.s (same on amd64 and 386).
The replacement became less obvious after go vet suggested changes
Before:
BYTE $0x0f; BYTE $0x7f; BYTE $0x44; BYTE $0x24; BYTE $0x08
Changed to MOVQ (this form is being tested):
MOVQ M0, 8(SP)
Refactored to FP-relative access (go vet advice):
MOVQ M0, val+4(FP)
Change-Id: I56b87cf3371b6ad81ad0cd9db2033aee407b5818
Reviewed-on: https://go-review.googlesource.com/101475
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Decode AT_PAGESZ to determine physPageSize on freebsd/{386,amd64,arm}
and AT_HWCAP for hwcap and hardDiv on freebsd/arm. Also use hwcap to
perform the FP checks in checkgoarm akin to the linux/arm
implementation.
Change-Id: I532810a1581efe66277e4305cb234acdc79ee91e
Reviewed-on: https://go-review.googlesource.com/99780
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Otherwise, a populated GOPATH might result in failures such as:
$ go test
[...] no buildable Go source files in [...]/gopherjs/compiler/natives/src/crypto/rand
exit status 1
Move the initialization of the dirs walker out of the init func, so that
we can control its behavior in the tests.
Updates #24464.
Change-Id: I4b26a7d3d6809bdd8e9b6b0556d566e7855f80fe
Reviewed-on: https://go-review.googlesource.com/101836
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Unused variables in closures are currently not diagnosed by the
compiler (this is Issue #3059), while go/types catches them.
One unused variable in the cmd/trace tests is causing the go/types
test that typechecks the whole standard library to fail:
FAIL: TestStdlib (8.05s)
stdlib_test.go:223: cmd/trace/annotations_test.go:241:6: gcTime
declared but not used
FAIL
Remove it.
Updates #24464
Change-Id: I0f1b9db6ae1f0130616ee649bdbfdc91e38d2184
Reviewed-on: https://go-review.googlesource.com/101815
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
The old code was a blend of (copied) code that existed before go/build,
and incorrect adjustments made when go/build was introduced. This change
leaves package path determination entirely to go/build and in the process
fixes issues with relative import paths.
Fixes#23092Fixes#24392
Change-Id: I9e900538b365398751bace56964495c5440ac4ae
Reviewed-on: https://go-review.googlesource.com/83415
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
VerifyHostname is called by tls.Conn during Handshake and does not need to be called explicitly.
Change-Id: I22b7fa137e76bb4be3d0018813a571acfb882219
Reviewed-on: https://go-review.googlesource.com/98618
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
It serialises optional parameters as empty rather than NULL. It's
probably technically correct, although ASN.1 has a long history of doing
this different ways.
But OpenSSL is likely common enough that we want to support this
encoding.
Fixes#23847
Change-Id: I81c60f0996edfecf59467dfdf75b0cf8ba7b1efb
Reviewed-on: https://go-review.googlesource.com/96417
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently we don't lift spill out of loop if loop contains call.
However often we have code like this:
for .. {
if hard_case {
call()
}
// simple case, without call
}
So instead of checking for any call, check for unavoidable call.
For #22698 cases I see:
mime/quotedprintable/Writer-6 10.9µs ± 4% 9.2µs ± 3% -15.02% (p=0.000 n=8+8)
And:
compress/flate/Encode/Twain/Huffman/1e4-6 99.4µs ± 6% 90.9µs ± 0% -8.57% (p=0.000 n=8+8)
compress/flate/Encode/Twain/Huffman/1e5-6 760µs ± 1% 725µs ± 1% -4.56% (p=0.000 n=8+8)
compress/flate/Encode/Twain/Huffman/1e6-6 7.55ms ± 0% 7.24ms ± 0% -4.07% (p=0.000 n=8+7)
There are no significant changes on go1 benchmarks.
But for cases with runtime arch checks, where we call generic version on old hardware,
there are respectable performance gains:
math/RoundToEven-6 1.43ns ± 0% 1.25ns ± 0% -12.59% (p=0.001 n=7+7)
math/bits/OnesCount64-6 1.60ns ± 1% 1.42ns ± 1% -11.32% (p=0.000 n=8+8)
Also on some runtime benchmarks loops have less loads and higher performance:
runtime/RuneIterate/range1/ASCII-6 15.6ns ± 1% 13.9ns ± 1% -10.74% (p=0.000 n=7+8)
runtime/ArrayEqual-6 3.22ns ± 0% 2.86ns ± 2% -11.06% (p=0.000 n=7+8)
Fixes#22698
Updates #22234
Change-Id: I0ae2f19787d07a9026f064366dedbe601bf7257a
Reviewed-on: https://go-review.googlesource.com/84055
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
And delete them from asm_test.
Change-Id: I64c512bfef3b3da6db5c5d29277675dade28b8ab
Reviewed-on: https://go-review.googlesource.com/101595
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Fix a bug in the code that generates the pre-inlined variable
declaration table used as raw material for emitting DWARF inline
routine records. The fix for issue 23704 altered the recipe for
assigning file/line/col to variables in one part of the compiler, but
didn't update a similar recipe in the code for variable tracking.
Added a new test that should catch problems of a similar nature.
Fixes#24460.
Change-Id: I255c036637f4151aa579c0e21d123fd413724d61
Reviewed-on: https://go-review.googlesource.com/101676
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
The atomic add instructions modify the condition code and so need to
be marked as clobbering flags.
Fixes#24449.
Change-Id: Ic69c8d775fbdbfb2a56c5e0cfca7a49c0d7f6897
Reviewed-on: https://go-review.googlesource.com/101455
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This change fixes index error when encoding VMOV instruction which pattern
is vmov Vn.<T>[index], Vd.<T>[index]
Change-Id: I949166e6dfd63fb0a9365f183b6c50d452614f9d
Reviewed-on: https://go-review.googlesource.com/101335
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This change fixes index error when encoding VMOV instruction which pattern is
VMOV Rn, V.<T>[index]. For example VMOV R1, V1.S[1] is assembled as VMOV R1, V1.S[0]
Fixes#24400
Change-Id: I82b5edc8af4e06862bc4692b119697c6bb7dc3fb
Reviewed-on: https://go-review.googlesource.com/101297
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This reverts commit bfa8b6f8ff.
Reason for revert: This depends on another CL which is not yet submitted.
Change-Id: I50e7594f1473c911a2079fe910849a6694ac6c07
Reviewed-on: https://go-review.googlesource.com/101496
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Current ARM64 assembler has no check for the invalid value of both
shift amount and post-index immediate offset of LD1/ST1. This patch
adds the check.
This patch also fixes the printing error of register number equals
to 31, which should be printed as ZR instead of R31. Test cases
are also added.
Change-Id: I476235f3ab3a3fc91fe89c5a3149a4d4529c05c7
Reviewed-on: https://go-review.googlesource.com/100255
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
This reverts commit 4b06d9d727.
Reason for revert: It's a reference to a legendary article
from the Journal of Irreproducible Results.
Updates golang/go#24451
Change-Id: I0288177f4e286bd6ace5774f2e5e0acb02370305
Reviewed-on: https://go-review.googlesource.com/101495
Reviewed-by: Andrew Bonventre <andybons@golang.org>
One could expect calls like
z.mant.shl(z.mant, shiftAmount)
(or higher-level-functions calls that use lhs/rhs) to be almost free
when shiftAmount = 0; and expect calls like
z.mant.shl(x.mant, 0)
to have the same cost of a x.mant -> z.mant copy. Neither of this
things are currently true.
For an 800 words nat, the first kind of calls cost ~800ns for rigth
shifts and ~3.5µs for left shift; while the second kind of calls are
doing more work than necessary by calling shlVU/shrVU.
This change makes the first kind of calls ({Shl,Shr}Same) almost free,
and the second kind of calls ({Shl,Shr}) about 30% faster.
name old time/op new time/op delta
ZeroShifts/Shl-4 3.64µs ± 3% 2.49µs ± 1% -31.55% (p=0.000 n=10+10)
ZeroShifts/ShlSame-4 3.65µs ± 1% 0.01µs ± 1% -99.85% (p=0.000 n=9+9)
ZeroShifts/Shr-4 3.65µs ± 1% 2.49µs ± 1% -31.91% (p=0.000 n=10+10)
ZeroShifts/ShrSame-4 825ns ± 0% 6ns ± 1% -99.33% (p=0.000 n=9+10)
During go test math/big, the shl zeroshift fastpath is triggered 1380
times; while the shr fastpath is triggered 153334 times(!).
Change-Id: I5f92b304a40638bd8453a86c87c58e54b337bcdf
Reviewed-on: https://go-review.googlesource.com/87660
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Per #11257 all examples should be in external test files.
Additionally, doing so makes this example playable.
Updates #24352. (Albeit tangentially).
Change-Id: I77ab4655107f61db2e9d21a608b73ace3a230fb2
Reviewed-on: https://go-review.googlesource.com/101285
Reviewed-by: Robert Griesemer <gri@golang.org>