Unless you go back and read the hash package documentation, it's
not clear that all the hash packages implement marshaling and
unmarshaling. Document the behaviour specifically in each package
that implements it as it this is hidden behaviour and easy to miss.
Change-Id: Id9d3508909362f1a3e53872d0319298359e50a94
Reviewed-on: https://go-review.googlesource.com/77251
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
change hash/crc32 package to use cpu package instead of using
runtime internal variables to check crc32 instruction
Change-Id: I8f88d2351bde8ed4e256f9adf822a08b9a00f532
Reviewed-on: https://go-review.googlesource.com/76490
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
The cryptographic checksums operate in blocks of 64 or 128 bytes,
which means that the last 128 bytes or so of the input may be encoded
in its original (plaintext) form as part of the state.
Document this so users do not falsely assume that the encoded state
carries no reversible information about the input.
Change-Id: I823dbb87867bf0a77aa20f6ed7a615dbedab3715
Reviewed-on: https://go-review.googlesource.com/77372
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Example usage of functionality implemented in CL 66710.
Change-Id: I87d6e4d2fb7a60e4ba1e6ef02715480eb7e8f8bd
Reviewed-on: https://go-review.googlesource.com/76011
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
To improve readability when exported fields are removed,
forbid the printer from emitting an empty line before the first comment
in a const, var, or type block.
Also, when printing the "Has filtered or unexported fields." message,
add an empty line before it to separate the message from the struct
or interfact contents.
Before the change:
<<<
type NamedArg struct {
// Name is the name of the parameter placeholder.
//
// If empty, the ordinal position in the argument list will be
// used.
//
// Name must omit any symbol prefix.
Name string
// Value is the value of the parameter.
// It may be assigned the same value types as the query
// arguments.
Value interface{}
// contains filtered or unexported fields
}
>>>
After the change:
<<<
type NamedArg struct {
// Name is the name of the parameter placeholder.
//
// If empty, the ordinal position in the argument list will be
// used.
//
// Name must omit any symbol prefix.
Name string
// Value is the value of the parameter.
// It may be assigned the same value types as the query
// arguments.
Value interface{}
// contains filtered or unexported fields
}
>>>
Fixes#18264
Change-Id: I9fe17ca39cf92fcdfea55064bd2eaa784ce48c88
Reviewed-on: https://go-review.googlesource.com/71990
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
The marshal method allows the hash's internal state to be serialized and
unmarshaled at a later time, without having the re-write the entire stream
of data that was already written to the hash.
Fixes#20573
Change-Id: I40bbb84702ac4b7c5662f99bf943cdf4081203e5
Reviewed-on: https://go-review.googlesource.com/66710
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Implements detection of x86 cpu features that
are used in the go standard library.
Changes all standard library packages to use the new cpu package
instead of using runtime internal variables to check x86 cpu features.
Updates: #15403
Change-Id: I2999a10cb4d9ec4863ffbed72f4e021a1dbc4bb9
Reviewed-on: https://go-review.googlesource.com/41476
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Major reorganization of the crc32 code:
- The arch-specific files now implement a well-defined interface
(documented in crc32.go). They no longer have the responsibility of
initializing and falling back to a non-accelerated implementation;
instead, that happens in the higher level code.
- The non-accelerated algorithms are moved to a separate file with no
dependencies on other code.
- The "cutoff" optimization for slicing-by-8 is moved inside the
algorithm itself (as opposed to every callsite).
Tests are significantly improved:
- direct tests for the non-accelerated algorithms.
- "cross-check" tests for arch-specific implementations (all archs).
- tests for misaligned buffers for both IEEE and Castagnoli.
Fixes#16909.
Change-Id: I9b6dd83b7a57cd615eae901c0a6d61c6b8091c74
Reviewed-on: https://go-review.googlesource.com/27935
Reviewed-by: Keith Randall <khr@golang.org>
When SSE is available, we don't need the Table. However, it is
returned as a handle by MakeTable. Fix this to always generate
the table.
Further cleanup is discussed in #16909.
Change-Id: Ic05400d68c6b5d25073ebd962000451746137afc
Reviewed-on: https://go-review.googlesource.com/27934
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This reverts commit 54d7de7dd6.
It was breaking non-amd64 builds.
Change-Id: I22650e922498eeeba3d4fa08bb4ea40a210c8f97
Reviewed-on: https://go-review.googlesource.com/27925
Reviewed-by: Keith Randall <khr@golang.org>
The code wasn't checking to see if the data was still >= 64 bytes
long after aligning it.
Aligning the data is an optimization and we don't actually need
to do it. In fact for smaller sizes it slows things down due to
the overhead of calling the generic function. Therefore for now
I have simply removed the alignment stage. I have also added a
check into the assembly to deliberately trigger a segmentation
fault if the data is too short.
Fixes#16779.
Change-Id: Ic01636d775efc5ec97689f050991cee04ce8fe73
Reviewed-on: https://go-review.googlesource.com/27409
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This commit improves the processing of the final few bytes in
castagnoliSSE42: instead of processing one byte at a time, we use all
versions of the CRC32 instruction to process 4 bytes, then 2, then 1.
The difference is only noticeable for small "odd" sized buffers.
We do the similar improvement for processing the first few bytes in
the case of unaligned buffer.
Fixing the test which was not actually verifying the results for
misaligned buffers (WriteString was creating an internal copy which
was aligned).
Adding benchmarks for length 15 (aligned and misaligned), results
below.
name old time/op new time/op delta
CastagnoliCrc15B-4 25.1ns ± 0% 22.1ns ± 1% -12.14%
CastagnoliCrc15BMisaligned-4 25.2ns ± 0% 22.9ns ± 1% -9.03%
CastagnoliCrc40B-4 23.1ns ± 0% 23.4ns ± 0% +1.08%
CastagnoliCrc1KB-4 127ns ± 0% 128ns ± 0% +1.18%
CastagnoliCrc4KB-4 462ns ± 0% 464ns ± 0% ~
CastagnoliCrc32KB-4 3.58µs ± 0% 3.60µs ± 0% +0.58%
name old speed new speed delta
CastagnoliCrc15B-4 597MB/s ± 0% 679MB/s ± 1% +13.77%
CastagnoliCrc15BMisaligned-4 596MB/s ± 0% 655MB/s ± 1% +9.94%
CastagnoliCrc40B-4 1.73GB/s ± 0% 1.71GB/s ± 0% -1.14%
CastagnoliCrc1KB-4 8.01GB/s ± 0% 7.93GB/s ± 1% -1.06%
CastagnoliCrc4KB-4 8.86GB/s ± 0% 8.83GB/s ± 0% ~
CastagnoliCrc32KB-4 9.14GB/s ± 0% 9.09GB/s ± 0% -0.58%
Change-Id: I499e37af2241d28e3e5d522bbab836c1a718430a
Reviewed-on: https://go-review.googlesource.com/24470
Reviewed-by: Keith Randall <khr@golang.org>
The input buffer is aligned to a doubleword boundary to
improve performance of the vector instructions. The pure
Go implementation is used to align the input data, and is
also used when the vector instructions are not available
or the data length is less than 64 bytes.
Change-Id: Ie259a5f2f1562bcc17961c99e5776c99091d6bed
Reviewed-on: https://go-review.googlesource.com/22201
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
It seems cleaner and more consistent with other files to list the
architectures that have assembly implementations rather than to
list those that do not.
This means we don't have to add s390x and future platforms to this
list.
Change-Id: I2ad3f66b76eb1711333c910236ca7f5151b698e5
Reviewed-on: https://go-review.googlesource.com/21770
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently we test crc64 only with ISO polynomial.
Change-Id: Ibc5e202db3b960369cbbb18e31eb0fea07b54dba
Reviewed-on: https://go-review.googlesource.com/21309
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This adds "slicing by 8" optimization to Castagnoli tables which will
speed up CRC32 calculation on systems without asssembler,
which are all but AMD64.
In my tests, it is faster to use "slicing by 8" for sizes all down to
16 bytes, so the switchover point has been adjusted.
There are no benchmarks for small sizes, so I have added one for 40 bytes,
as well as one for bigger sizes (32KB).
Castagnoli, No assembler, 40 Byte payload: (before, after)
BenchmarkCastagnoli40B-4 10000000 161 ns/op 246.94 MB/s
BenchmarkCastagnoli40B-4 20000000 100 ns/op 398.01 MB/s
Castagnoli, No assembler, 32KB payload: (before, after)
BenchmarkCastagnoli32KB-4 10000 115426 ns/op 283.89 MB/s
BenchmarkCastagnoli32KB-4 30000 45171 ns/op 725.41 MB/s
IEEE, No assembler, 1KB payload: (before, after)
BenchmarkCrc1KB-4 500000 3604 ns/op 284.10 MB/s
BenchmarkCrc1KB-4 1000000 1463 ns/op 699.79 MB/s
Compared:
benchmark old ns/op new ns/op delta
BenchmarkCastagnoli40B-4 161 100 -37.89%
BenchmarkCastagnoli32KB-4 115426 45171 -60.87%
BenchmarkCrc1KB-4 3604 1463 -59.41%
benchmark old MB/s new MB/s speedup
BenchmarkCastagnoli40B-4 246.94 398.01 1.61x
BenchmarkCastagnoli32KB-4 283.89 725.41 2.56x
BenchmarkCrc1KB-4 284.10 699.79 2.46x
Change-Id: I303e4ec84e8d4dafd057d64c0e43deb2b498e968
Reviewed-on: https://go-review.googlesource.com/19335
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Add several instructions that were used via BYTE and use them.
Instructions added: PEXTRB, PEXTRD, PEXTRQ, PINSRB, XGETBV, POPCNT.
Change-Id: I5a80cd390dc01f3555dbbe856a475f74b5e6df65
Reviewed-on: https://go-review.googlesource.com/18593
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
CRC-32 computation is stateless and the p slice does not get stored
anywhere. Thus, we mark the assembly functions as noescape so that
it doesn't believe that p leaks in:
func Update(crc uint32, tab *Table, p []byte) uint32
Before:
./crc32.go:153: leaking param: p
After:
./crc32.go:153: Update p does not escape
Change-Id: I52ba35b6cc544fff724327140e0c27898431d1dc
Reviewed-on: https://go-review.googlesource.com/17069
Reviewed-by: Russ Cox <rsc@golang.org>
iEEETable violates the Go naming conventions and is inconsistent
with the rest of the package. Use ieeeTable instead.
Change-Id: I04b201aa39759d159de2b0295f43da80488c2263
Reviewed-on: https://go-review.googlesource.com/17068
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Change-Id: I77c6768fff6f0163b36800307c4d573bb6521fe5
Reviewed-on: https://go-review.googlesource.com/14454
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
IEEE is the most commonly used CRC-32 polynomial, used by zip, gzip and others.
Based on http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf
benchmark old ns/op new ns/op delta
BenchmarkIEEECrc1KB-8 3193 352 -88.98%
BenchmarkIEEECrc4KB-8 5025 1307 -73.99%
BenchmarkCastagnoliCrc1KB-8 126 126 +0.00%
benchmark old MB/s new MB/s speedup
BenchmarkIEEECrc1KB-8 320.68 2901.92 9.05x
BenchmarkIEEECrc4KB-8 815.08 3131.80 3.84x
BenchmarkCastagnoliCrc1KB-8 8100.80 8109.78 1.00x
Change-Id: I99c9a48365f631827f516e44f97e86155f03cb90
Reviewed-on: https://go-review.googlesource.com/14080
Reviewed-by: Keith Randall <khr@golang.org>
Explicitly say that *Table returned by MakeTable may not be
modified. Otherwise, this leads to very subtle bugs that may
or may not manifest themselves.
Same comment was made on package crc64, to keep the future
open to the caching tables that crc32 effectively does.
Fixes: #12487.
Change-Id: I2881bebb8b16f6f8564412172774c79c2593c6c1
Reviewed-on: https://go-review.googlesource.com/14258
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The URL is shown on go docs and is an eye-sore.
For go1.6.
Change-Id: I8b8ea3751200d06ed36acfe22f47ebb38107f8db
Reviewed-on: https://go-review.googlesource.com/13282
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The Slicing-By-8 [1] algorithm has much performance improvements than
current approach. This patch only uses it for IEEE, which is the most
common case in practice.
There is the benchmark on Mac OS X 10.9:
benchmark old MB/s new MB/s speedup
BenchmarkIEEECrc1KB 349.40 353.03 1.01x
BenchmarkIEEECrc4KB 351.55 934.35 2.66x
BenchmarkCastagnoliCrc1KB 7037.58 7392.63 1.05x
This algorithm need 8K lookup table, so it's enabled only for block
larger than 4K.
We can see about 2.6x improvement for IEEE.
Change-Id: I7f786d20f0949245e4aa101d7921669f496ed0f7
Reviewed-on: https://go-review.googlesource.com/1863
Reviewed-by: Russ Cox <rsc@golang.org>
Explicitly specify that we represent polynomial in reversed notation
Fixes#8229
Change-Id: Idf094c01fd82f133cd0c1b50fa967d12c577bdb5
Reviewed-on: https://go-review.googlesource.com/9237
Reviewed-by: David Chase <drchase@google.com>
This also removes pkg/runtime/traceback_lr.c, which was ported
to Go in an earlier commit and then moved to
runtime/traceback.go.
Reviewer: rsc@golang.org
rsc: LGTM