This commit improves the processing of the final few bytes in
castagnoliSSE42: instead of processing one byte at a time, we use all
versions of the CRC32 instruction to process 4 bytes, then 2, then 1.
The difference is only noticeable for small "odd" sized buffers.
We do the similar improvement for processing the first few bytes in
the case of unaligned buffer.
Fixing the test which was not actually verifying the results for
misaligned buffers (WriteString was creating an internal copy which
was aligned).
Adding benchmarks for length 15 (aligned and misaligned), results
below.
name old time/op new time/op delta
CastagnoliCrc15B-4 25.1ns ± 0% 22.1ns ± 1% -12.14%
CastagnoliCrc15BMisaligned-4 25.2ns ± 0% 22.9ns ± 1% -9.03%
CastagnoliCrc40B-4 23.1ns ± 0% 23.4ns ± 0% +1.08%
CastagnoliCrc1KB-4 127ns ± 0% 128ns ± 0% +1.18%
CastagnoliCrc4KB-4 462ns ± 0% 464ns ± 0% ~
CastagnoliCrc32KB-4 3.58µs ± 0% 3.60µs ± 0% +0.58%
name old speed new speed delta
CastagnoliCrc15B-4 597MB/s ± 0% 679MB/s ± 1% +13.77%
CastagnoliCrc15BMisaligned-4 596MB/s ± 0% 655MB/s ± 1% +9.94%
CastagnoliCrc40B-4 1.73GB/s ± 0% 1.71GB/s ± 0% -1.14%
CastagnoliCrc1KB-4 8.01GB/s ± 0% 7.93GB/s ± 1% -1.06%
CastagnoliCrc4KB-4 8.86GB/s ± 0% 8.83GB/s ± 0% ~
CastagnoliCrc32KB-4 9.14GB/s ± 0% 9.09GB/s ± 0% -0.58%
Change-Id: I499e37af2241d28e3e5d522bbab836c1a718430a
Reviewed-on: https://go-review.googlesource.com/24470
Reviewed-by: Keith Randall <khr@golang.org>
In CSE if a tuple generator is CSE'd to a different block, its
selectors are copied to the same block. In this case, also CES
the copied selectors.
Test copied from Keith's CL 27202.
Fixes#16741.
Change-Id: I2fc8b9513d430f10d6104275cfff5fb75d3ef3d9
Reviewed-on: https://go-review.googlesource.com/27236
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
If we cannot infer the asm arch from the filename
or the build tags, assume that it is the
current build arch. Assembly files with no
restrictions ought to be usable on all arches.
Updates #11041
Change-Id: I0ae807dbbd5fb67ca21d0157fe180237a074113a
Reviewed-on: https://go-review.googlesource.com/27151
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
The unsafe.Pointer check allows adding to
and subtracting from uintptrs in order to do
arithmetic.
Some code needs to round uintptrs.
Allow &^ for that purpose.
Updates #11041
Change-Id: Ib90dd2954bb6c78427058271e13f2ce4c4af38fb
Reviewed-on: https://go-review.googlesource.com/27156
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Moves the state.ServerName assignment to outside the if
statement that checks for handshakeComplete.
Fixes#15571
Change-Id: I6c4131ddb16389aed1c410a975f9aa3b52816965
Reviewed-on: https://go-review.googlesource.com/22862
Run-TryBot: Adam Langley <agl@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Adam Langley <agl@golang.org>
- Use machine instructions for uint64<->float conversions
- Do not enforce alignment on Zero/Move
ARM64 supports unaligned load/stores, but only aligned offset
or small offset can be encoded into instructions.
- Do combined loads
Change-Id: Iffca7dd0f13070b17b784861ce5a30af584680eb
Reviewed-on: https://go-review.googlesource.com/27086
Reviewed-by: David Chase <drchase@google.com>
.local addresses are used by things like Kubernetes and Weave DNS; Go
should not avoid resolving them.
This is a partial revert of https://golang.org/cl/21328 which was too
strict of an interpretation of RFC 6762.
Fixes#16739
Change-Id: I349415b4eab5d61240dd18217bd95dc7d2105cd5
Reviewed-on: https://go-review.googlesource.com/27250
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The Width{int,ptr,reg} assignments are no longer necessary since
golang.org/cl/21623. The other arch's betypeinit functions were
cleaned up, but apparently this one was missed.
Change-Id: I1c7f074d7864a561659c1f98aef604f57f285fd0
Reviewed-on: https://go-review.googlesource.com/27272
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
BIND libresolv allows values from 0 to 15.
For invalid values and negative numbers, 0 is used.
For numbers greater than 15, 15 is used.
Fixes#15419
Change-Id: I1009bc119c3e87919bcb55a80a35532e9fc3ba52
Reviewed-on: https://go-review.googlesource.com/24901
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The Source provided by math/rand relies on an array of cooked
pseudo-random 63bit integers for seeding. The origin of these
numbers is undocumented.
Add a standalone program in math/rand folder that generates
the 63bit integer array as well as a 64bit version supporting
extension of the Source to 64bit pseudo-random number
generation while maintaining the current sequence in the
lower 63bit.
The code is largely based on the initial implementation of the
random number generator in the go repository by Ken Thompson
(revision 399).
Change-Id: Ib4192aea8127595027116a0f5a7be53f11dc110b
Reviewed-on: https://go-review.googlesource.com/22230
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
sysUnused (e.g., madvise MADV_FREE) is only sensible to call on
physical page boundaries, so scavengelist rounds in the bounds of the
region being released to the nearest physical page boundaries.
However, if the region is smaller than a physical page and neither the
start nor end fall on a boundary, then rounding the start up to a page
boundary and the end down to a page boundary will result in end < start.
Currently, we only give up on the region if start == end, so if we
encounter end < start, we'll call madvise with a negative length and
the madvise will fail.
Issue #16644 gives a concrete example of this:
start = 0x1285ac000
end = 0x1285ae000 (1 8K page)
This leads to the rounded values
start = 0x1285b0000
end = 0x1285a0000
which leads to len = -65536.
Fix this by giving up on the region if end <= start, not just if
end == start.
Fixes#16644.
Change-Id: I8300db492dbadc82ac1ad878318b36bcb7c39524
Reviewed-on: https://go-review.googlesource.com/27230
Reviewed-by: Keith Randall <khr@golang.org>
When T is a scalar, there are no runtime calls
required, which makes this a clear win.
encoding/binary:
WriteInts-8 958ns ± 3% 864ns ± 2% -9.80% (p=0.000 n=15+15)
This also considerably shrinks a core fmt
routine:
Before: "".(*pp).printArg t=1 size=3952 args=0x20 locals=0xf0
After: "".(*pp).printArg t=1 size=2624 args=0x20 locals=0x98
Unfortunately, I find it very hard to get stable
numbers out of the fmt benchmarks due to thermal scaling.
Change-Id: I1278006b030253bf8e48dc7631d18985cdaa143d
Reviewed-on: https://go-review.googlesource.com/26659
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
This removes some scaffolding introduced pre-1.7, introduced to
fix an export format bug, and to minimize conflicts with older
formats. The currently deployed and recognized format is "v1",
so don't worry about other versions. This is a step towards a
better scheme for internal export format versioning.
For #16244.
Change-Id: Ic7cf99dd2a24ad5484cc54aed44fa09332c2cf72
Reviewed-on: https://go-review.googlesource.com/27205
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Robert Griesemer <gri@golang.org>
Removes the encoding of this bit which was ignored but left behind
for 1.7 to minimize pre-1.7 export format changes. See the issue
for more details.
Fixes#15772.
Change-Id: I46cd7a66ad4c6003b78c64295cf3bda503ebf2dd
Reviewed-on: https://go-review.googlesource.com/27201
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
s390x took up the last available chunk of int16 opcodes.
There are RISC-V and sparc64 ports in progress out of tree,
and there will likely be other architectures.
Reduce the opcode space to allow more architectures to
fit without increasing to int32.
This is the smallest power of two that accomodates all
existing architectures. All else being equal, smaller is
better--smaller numbers are easier to generate immediates
for and easier on the eyes when debugging.
Change-Id: I4d0824b28913892fbd0579d3f90bea34e44c8946
Reviewed-on: https://go-review.googlesource.com/24223
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
We should check whether there is a concurrent writer at the
start of every mapiternext, not just in mapaccessK (which is
only called during certain map growth situations).
Tests turned off by default because they are inherently flaky.
Fixes#16278
Change-Id: I8b72cab1b8c59d1923bec6fa3eabc932e4e91542
Reviewed-on: https://go-review.googlesource.com/24749
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Only need to zero-extend to 32 bits and we get the top
32 bits zeroed for free.
Only the WQ change actually generates different code.
The assembler did this optimization for us in the other two cases.
But we might as well do it during SSA so -S output more closely
matches the actual generated instructions.
Change-Id: I3e4ac50dc4da124014d4e31c86e9fc539d94f7fd
Reviewed-on: https://go-review.googlesource.com/23711
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This CL adds a safety mechanism
for changing the number of opcodes
available per architecture.
A subsequent CL will actually make the change.
Change-Id: I6332ed5514f2f153c54d11b7da0cc8a6be1c8066
Reviewed-on: https://go-review.googlesource.com/24222
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Also includes a minor golint cleanup in the tests.
Change-Id: I8c0fc81479e635e7cca18d5c48c28b654afa59d8
Reviewed-on: https://go-review.googlesource.com/25380
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This only triggers a few places in the stdlib,
but it helps a lot when it does.
Before:
runtime.mapassign1 t=1 size=2400 args=0x20 locals=0xe0
After:
runtime.mapassign1 t=1 size=2352 args=0x20 locals=0xd8
name old time/op new time/op delta
MapPop100-8 19.8µs ±11% 18.4µs ± 9% -7.16% (p=0.000 n=20+19)
MapPop1000-8 367µs ±17% 335µs ±11% -8.63% (p=0.000 n=19+19)
MapPop10000-8 7.29ms ±15% 6.86ms ±12% -5.84% (p=0.020 n=20+20)
Change-Id: I9faf32f95a6ba6a6d5d0818eab32cc271e01d10a
Reviewed-on: https://go-review.googlesource.com/26666
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
The frontend rewriting lowers them to runtime calls on 386. It
matches explicitly uint32, but missed uint.
Fixes#16738.
Change-Id: Iece7a45edf74615baca052a53273c208f057636d
Reviewed-on: https://go-review.googlesource.com/27085
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Might as well tell the runtime how large the map is going to be.
This avoids grow work and allocations while the map is being built.
Will wait for 1.8.
Fixes#15880Fixes#16279
Change-Id: I377e3e5ec1e2e76ea2a50cc00810adda20ad0e79
Reviewed-on: https://go-review.googlesource.com/23558
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This CL teaches SSA to recognize code of the form
// b is a boolean value, i is an int of some flavor
if b {
i = 1
} else {
i = 0
}
and use b's underlying 0/1 representation for i
instead of generating jumps.
Unfortunately, it does not work on the obvious code:
func bool2int(b bool) int {
if b {
return 1
}
return 0
}
This is left for future work.
Note that the existing phiopt optimizations also don't work for:
func neg(b bool) bool {
if b {
return false
}
return true
}
In the meantime, runtime authors and the like can use:
func bool2int(b bool) int {
var i int
if b {
i = 1
} else {
i = 0
}
return i
}
This compiles to:
"".bool2int t=1 size=16 args=0x10 locals=0x0
0x0000 00000 (x.go:25) TEXT "".bool2int(SB), $0-16
0x0000 00000 (x.go:25) FUNCDATA $0, gclocals·23e8278e2b69a3a75fa59b23c49ed6ad(SB)
0x0000 00000 (x.go:25) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (x.go:32) MOVBLZX "".b+8(FP), AX
0x0005 00005 (x.go:32) MOVBQZX AL, AX
0x0008 00008 (x.go:32) MOVQ AX, "".~r1+16(FP)
0x000d 00013 (x.go:32) RET
The extraneous MOVBQZX is #15300.
This optimization also helps range and slice.
The compiler must protect against pointers pointing
to the end of a slice/string. It does this by increasing
a pointer by either 0 or 1 * elemsize, based on a condition.
This CL optimizes away a jump in that code.
This CL triggers 382 times while compiling the standard library.
Updating code to utilize this optimization is left for future CLs.
Updates #6011
Change-Id: Ia7c1185f8aa223c543f91a3cd6d4a2a09c691c70
Reviewed-on: https://go-review.googlesource.com/22711
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Don't fold constant factors into a multiply
beyond the capacity of a MULQ instruction (32 bits).
Fixes#16733
Change-Id: Idc213c6cb06f7c94008a8cf9e60a9e77d085fd89
Reviewed-on: https://go-review.googlesource.com/27160
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
We already inlined
_, ok = e.(T)
_, ok = i.(E)
_, ok = e.(E)
The only ok-only variants not inlined are now
_, ok = i.(I)
_, ok = e.(I)
These call getitab, so are non-trivial.
Change-Id: Ie45fd8933ee179a679b92ce925079b94cff0ee12
Reviewed-on: https://go-review.googlesource.com/26658
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Now that assembler opcodes have their own type, they can have a true
stringer, rather than explicit calls to Aconv, which makes for nicer
format strings.
Change-Id: Ic77f5f8ac38b4e519dcaa08c93e7b732226f7bfe
Reviewed-on: https://go-review.googlesource.com/25045
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Give *recordingConn the correct WriteTo signature
to be an io.WriterTo. This makes vet happy.
It also means that it'll report errors,
which were previously being ignored.
Updates #11041
Change-Id: I13f171407d63f4b62427679bff362eb74faddca5
Reviewed-on: https://go-review.googlesource.com/27121
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
container/list/list_test.go:274: self-assignment of e1 to e1
container/list/list_test.go:274: self-assignment of e4 to e4
container/list/list_test.go:282: self-assignment of e1 to e1
container/list/list_test.go:286: self-assignment of e1 to e1
container/list/list_test.go:286: self-assignment of e4 to e4
Updates #11041
Change-Id: Ibd90cf6a924e93497908f437b814c3fc82937f4a
Reviewed-on: https://go-review.googlesource.com/27114
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>