1
0
mirror of https://github.com/golang/go synced 2024-11-14 07:00:21 -07:00
Commit Graph

68 Commits

Author SHA1 Message Date
Alberto Donizetti
35b80eac7d hash/maphash: remove duplicate from Hash documentation
Fixes #44255

Change-Id: I14d2edbee0a0c39e04111414a57d70ee2fdfb6af
Reviewed-on: https://go-review.googlesource.com/c/go/+/291631
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2021-02-24 08:36:15 +00:00
Russ Cox
d4b2638234 all: go fmt std cmd (but revert vendor)
Make all our package sources use Go 1.17 gofmt format
(adding //go:build lines).

Part of //go:build change (#41184).
See https://golang.org/design/draft-gobuild

Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/294430
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-02-20 03:54:50 +00:00
Russ Cox
4f1b0a44cb all: update to use os.ReadFile, os.WriteFile, os.CreateTemp, os.MkdirTemp
As part of #42026, these helpers from io/ioutil were moved to os.
(ioutil.TempFile and TempDir became os.CreateTemp and MkdirTemp.)

Update the Go tree to use the preferred names.

As usual, code compiled with the Go 1.4 bootstrap toolchain
and code vendored from other sources is excluded.

ReadDir changes are in a separate CL, because they are not a
simple search and replace.

For #42026.

Change-Id: If318df0216d57e95ea0c4093b89f65e5b0ababb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/266365
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-12-09 19:12:23 +00:00
Russ Cox
5ef78c4d84 hash/crc32: fix race between lazy Castagnoli init and Update/Write
The switch on tab is checking tab == castagnoliTable,
but castagnoliTable can change value during a concurrent
call to MakeTable.

Fixes #41911.

Change-Id: I6124dcdbf33e17fe302baa3e1aa03202dec61b4c
Reviewed-on: https://go-review.googlesource.com/c/go/+/261639
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-10-13 00:31:47 +00:00
Russ Cox
9e2acf94fe hash/maphash: adjust package comment
Add note about using per-use seeds.

Delete "collision-resistant but" in:
> The hash functions are collision-resistant but not cryptographically secure.

"Collision-resistant" has a precise cryptographic meaning that is
incompatible with "not cryptographically secure".
All that is really meant by it here here is "it's a good hash function",
which should be established already.

Also delete:
> The hash value of a given byte sequence is consistent within a
> single process, but will be different in different processes.

This was added for its final clause in response to #37040,
but "The hash value of a given byte sequence" is by design not a
concept in this package. Only "... of a given seed and byte sequence".
And seeds cannot be shared between processes, so again by design
you can't even set up the appropriate first half of the sentence
to say the second half.

Change-Id: I2c02bee0e804ef3b120cb4752bf89e60f3f5ff5d
Reviewed-on: https://go-review.googlesource.com/c/go/+/255968
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-10-12 16:30:45 +00:00
Meng Zhuo
454300a617 hash/maphash: adding benchmarks for maphash
goos: linux
goarch: arm64
pkg: hash/maphash
BenchmarkHash8Bytes
BenchmarkHash8Bytes     22568919                46.0 ns/op       173.80 MB/s
BenchmarkHash320Bytes
BenchmarkHash320Bytes    5243858               230 ns/op        1393.30 MB/s
BenchmarkHash1K
BenchmarkHash1K          1755870               660 ns/op        1550.60 MB/s
BenchmarkHash8K
BenchmarkHash8K           225688              5313 ns/op        1541.90 MB/s
PASS
ok      hash/maphash    6.465s

Change-Id: I5a909042a542135ebc47d639fea02dc46c900c1c
Reviewed-on: https://go-review.googlesource.com/c/go/+/249079
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-08-21 05:30:14 +00:00
Ruixin(Peter) Bao
9a3f22be7a hash/crc32: simplify hasVX checking on s390x
Originally, we use an assembly function that returns a boolean result to
tell whether the machine has vector facility or not. It is now no longer
needed when we can directly use cpu.S390X.HasVX variable. This CL
also removes the last occurence of hasVectorFacility function on s390x.

Change-Id: Id20cb746c21eacac5e13344b362e2d87adfe4317
Reviewed-on: https://go-review.googlesource.com/c/go/+/230337
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-27 21:18:58 +00:00
Keith Randall
bef0b4ea8f hash/maphash: add more tests for seed generation
Test all the paths by which a Hash picks its seed.
Make sure they all behave identically to a preset seed.

Change-Id: I2f7950857697f2f07226b96655574c36931b2aae
Reviewed-on: https://go-review.googlesource.com/c/go/+/220686
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Vladimir Evgrafov <evgrafov.vladimir@gmail.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
2020-03-02 17:15:20 +00:00
Alberto Donizetti
ec4c9db210 hash/maphash: add package-level example
Change-Id: I05c7ca644410822a527e94a7a8b883a0f8b0a4ad
Reviewed-on: https://go-review.googlesource.com/c/go/+/220420
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
2020-02-24 01:59:55 +00:00
vovapi
638df87fa4 hash/maphash: don't discard data on random seed init
Hash initializes seed on the first usage of seed or state with initSeed.
initSeed uses SetSeed which discards accumulated data.
This causes hash to return different sums for the same data in the first use
and after reset.
This CL fixes this issue by separating the seed set from data discard.

Fixes #37315

Change-Id: Ic7020702c2ce822eb700af462e37efab12f72054
GitHub-Last-Rev: 48b2f963e8
GitHub-Pull-Request: golang/go#37328
Reviewed-on: https://go-review.googlesource.com/c/go/+/220259
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-22 15:25:30 +00:00
Keith Randall
a0c9fb6bd3 hash/maphash: mention the results are 64-bit integers
Change-Id: I0d2ba52d79c34d77d475ec8d673286d0e56b826b
Reviewed-on: https://go-review.googlesource.com/c/go/+/219340
Reviewed-by: Alan Donovan <adonovan@google.com>
2020-02-13 18:32:48 +00:00
Ian Lance Taylor
1c241d2879 hash/maphash: mention that hash values do not persist in package docs
Updates #36878
Fixes #37040

Change-Id: Ib0bd21481e5d9c3b3966c116966ecfe071243a24
Reviewed-on: https://go-review.googlesource.com/c/go/+/218297
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
2020-02-11 05:55:11 +00:00
Keith Randall
6ba0be1639 hash/maphash: mark call into runtime hash function as not escaping
This allows maphash.Hash to be allocated on the stack for typical uses.

Fixes #35636

Change-Id: I8366507d26ea717f47a9fb46d3bd69ba799845ac
Reviewed-on: https://go-review.googlesource.com/c/go/+/207444
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-11-16 20:31:45 +00:00
Russ Cox
5a7c571ea1 hash/maphash: revise API to be more idiomatic
This CL makes these changes to the hash/maphash API to make it fit a bit
more into the standard library:

 - Move some of the package doc onto type Hash, so that `go doc maphash.Hash` shows it.

 - Instead of having identical AddBytes and Write methods,
   standardize on Write, the usual name for this function.
   Similarly, AddString -> WriteString, AddByte -> WriteByte.

 - Instead of having identical Hash and Sum64 methods,
   standardize on Sum64 (for hash.Hash64). Dropping the "Hash" method
   also helps because Hash is usually reserved to mean the state of a
   hash function (hash.Hash etc), not the hash value itself.

 - Make an uninitialized hash.Hash auto-seed with a random seed.
   It is critical that users not use the same seed for all hash functions
   in their program, at least not accidentally. So the Hash implementation
   must either panic if uninitialized or initialize itself.
   Initializing itself is less work for users and can be done lazily.

 - Now that the zero hash.Hash is useful, drop maphash.New in favor of
   new(maphash.Hash) or simply declaring a maphash.Hash.

 - Add a [0]func()-typed field to the Hash so that Hashes cannot be compared.
   (I considered doing the same for Seed but comparing seeds seems OK.)

 - Drop the integer argument from MakeSeed, to match the original design
   in golang.org/issue/28322. There is no point to giving users control
   over the specific seed bits, since we want the interpretation of those
   bits to be different in every different process. The only thing users
   need is to be able to create a new random seed at each call.
   (Fixes a TODO in MakeSeed's public doc comment.)

This API is new in Go 1.14, so these changes do not violate the compatibility promise.

Fixes #35060.
Fixes #35348.

Change-Id: Ie6fecc441f3f5ef66388c6ead92e875c0871f805
Reviewed-on: https://go-review.googlesource.com/c/go/+/205069
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2019-11-04 21:30:29 +00:00
Keith Randall
35cfe059a1 hash/maphash: move bytes/hash to hash/maphash
Fixes #34778

Change-Id: If8225a7c41cb2af3f67157fb9670eef86272e85e
Reviewed-on: https://go-review.googlesource.com/c/go/+/204997
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-11-02 18:30:37 +00:00
Brad Fitzpatrick
03ef105dae all: remove nacl (part 3, more amd64p32)
Part 1: CL 199499 (GOOS nacl)
Part 2: CL 200077 (amd64p32 files, toolchain)
Part 3: stuff that arguably should've been part of Part 2, but I forgot
        one of my grep patterns when splitting the original CL up into
        two parts.

This one might also have interesting stuff to resurrect for any future
x32 ABI support.

Updates #30439

Change-Id: I2b4143374a253a003666f3c69e776b7e456bdb9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/200318
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-10-10 22:38:38 +00:00
Brad Fitzpatrick
07b4abd62e all: remove the nacl port (part 2, amd64p32 + toolchain)
This is part two if the nacl removal. Part 1 was CL 199499.

This CL removes amd64p32 support, which might be useful in the future
if we implement the x32 ABI. It also removes the nacl bits in the
toolchain, and some remaining nacl bits.

Updates #30439

Change-Id: I2475d5bb066d1b474e00e40d95b520e7c2e286e1
Reviewed-on: https://go-review.googlesource.com/c/go/+/200077
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-10-09 22:34:34 +00:00
Brian Kessler
1ccb66d1ef hash/fnv: use bits.Mul64 for 128-bit hash
Replace the 128-bit multiplication in 4 parts with bits.Mul64
and two single-width multiplications.  This simplifies the code
and increases throughput by ~50% on amd64.

name         old time/op   new time/op   delta
Fnv128KB-4    9.64µs ± 0%   6.09µs ± 0%  -36.89%  (p=0.016 n=4+5)
Fnv128aKB-4   9.11µs ± 0%   6.17µs ± 5%  -32.32%  (p=0.008 n=5+5)

name         old speed     new speed     delta
Fnv128KB-4   106MB/s ± 0%  168MB/s ± 0%  +58.44%  (p=0.016 n=4+5)
Fnv128aKB-4  112MB/s ± 0%  166MB/s ± 5%  +47.85%  (p=0.008 n=5+5)

Change-Id: Id752f2a20ea3de23a41e08db89eecf2bb60b7e6d
Reviewed-on: https://go-review.googlesource.com/c/133936
Run-TryBot: Matt Layher <mdlayher@gmail.com>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matt Layher <mdlayher@gmail.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-12-10 22:50:48 +00:00
Tobias Klauser
41d6315e34 hash/crc64: use t.Fatalf in TestGolden
Use t.Fatalf instead of t.Errorf followed by t.FailNow.

Change-Id: Ie31f8006e7d9daca7f59bf6f0d5ae688222be486
Reviewed-on: https://go-review.googlesource.com/c/144111
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-25 06:32:12 +00:00
Cholerae Hu
2a5df06765 hash/crc64: lazily initialize slice8Tables
Saves 36KB of memory in stdlib packages.

Updates #26775

Change-Id: I0f9d7b17d9768f6fb980d5fbba7c45920215a5fc
Reviewed-on: https://go-review.googlesource.com/127735
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-21 04:50:21 +00:00
Martin Möhrmann
cd0e79d9f1 all: use internal/cpu feature variables directly
Avoid using package specific variables when there is a one to one
correspondance to cpu feature support exported by internal/cpu.

This makes it clearer which cpu feature is referenced.
Another advantage is that internal/cpu variables are padded to avoid
false sharing and memory and cache usage is shared by multiple packages.

Change-Id: If18fb448a95207cfa6a3376f3b2ddc4b230dd138
Reviewed-on: https://go-review.googlesource.com/126596
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-20 14:47:07 +00:00
Tim Cooper
161874da2a all: update comment URLs from HTTP to HTTPS, where possible
Each URL was manually verified to ensure it did not serve up incorrect
content.

Change-Id: I4dc846227af95a73ee9a3074d0c379ff0fa955df
Reviewed-on: https://go-review.googlesource.com/115798
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
2018-06-01 21:52:00 +00:00
Ilya Tocar
93665c0d81 crypto: remove hand encoded amd64 instructions
Replace BYTE.. encodings with asm. This is possible due to asm
implementing more instructions and removal of
MOV $0, reg -> XOR reg, reg transformation from asm.

Change-Id: I011749ab6b3f64403ab6e746f3760c5841548b57
Reviewed-on: https://go-review.googlesource.com/97936
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-01 19:20:53 +00:00
Russ Cox
1d547e4a68 hash: add MarshalBinary/UnmarshalBinary round trip + golden test for all implementations
There are some basic tests in the packages implementing the hashes,
but this one is meant to be comprehensive for the standard library
as a whole.

Most importantly, it locks in the current representations and makes
sure that they do not change from release to release (and also, as a
result, that future releases can parse the representations generated
by older releases).

The crypto/* MarshalBinary implementations are being changed
in this CL to write only d.x[:d.nx] to the encoding, with zeros for
the remainder of the slice d.x[d.nx:]. The old encoding wrote the
whole d.x, but that exposed an internal detail: whether d.x is
cleared after a full buffer is accumulated, and also whether d.x was
used at all for previous blocks (consider 1-byte writes vs 1024-byte writes).
The new encoding writes only what the decoder needs to know,
nothing more.

In fact the old encodings were arguably also a security hole,
because they exposed data written even before the most recent
call to the Reset method, data that clearly has no impact on the
current hash and clearly should not be exposed. The leakage
is clearly visible in the old crypto/sha1 golden test tables also
being modified in this CL.

Change-Id: I4e9193a3ec5f91d27ce7d0aa24c19b3923741416
Reviewed-on: https://go-review.googlesource.com/82136
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2017-12-06 07:45:46 +00:00
Joe Tsai
b53088a634 Revert "go/printer: forbid empty line before first comment in block"
This reverts commit 08f19bbde1.

Reason for revert:
The changed transformation takes effect on a larger set
of code snippets than expected.

For example, this:
    func foo() {

        // Comment
        bar()

    }
becomes:
    func foo() {
        // Comment
        bar()

    }

This is an unintended consequence.

Change-Id: Ifca88d6267dab8a8170791f7205124712bf8ace8
Reviewed-on: https://go-review.googlesource.com/81335
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Joe Tsai <joetsai@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-12-01 01:12:26 +00:00
Roger Peppe
bd926e1c65 crypto, hash: document marshal/unmarshal implementation
Unless you go back and read the hash package documentation, it's
not clear that all the hash packages implement marshaling and
unmarshaling. Document the behaviour specifically in each package
that implements it as it this is hidden behaviour and easy to miss.

Change-Id: Id9d3508909362f1a3e53872d0319298359e50a94
Reviewed-on: https://go-review.googlesource.com/77251
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2017-11-15 00:06:24 +00:00
Fangming.Fang
66bfbd9ad7 internal/cpu: detect cpu features in internal/cpu package
change hash/crc32 package to use cpu package instead of using
runtime internal variables to check crc32 instruction

Change-Id: I8f88d2351bde8ed4e256f9adf822a08b9a00f532
Reviewed-on: https://go-review.googlesource.com/76490
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
2017-11-14 19:07:15 +00:00
Joe Tsai
4aea3e7135 hash: document that the encoded state may contain input in plaintext
The cryptographic checksums operate in blocks of 64 or 128 bytes,
which means that the last 128 bytes or so of the input may be encoded
in its original (plaintext) form as part of the state.
Document this so users do not falsely assume that the encoded state
carries no reversible information about the input.

Change-Id: I823dbb87867bf0a77aa20f6ed7a615dbedab3715
Reviewed-on: https://go-review.googlesource.com/77372
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-11-13 22:14:58 +00:00
Tim Cooper
0ee4527ac7 hash: add marshaling, unmarshaling example
Example usage of functionality implemented in CL 66710.

Change-Id: I87d6e4d2fb7a60e4ba1e6ef02715480eb7e8f8bd
Reviewed-on: https://go-review.googlesource.com/76011
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-11-04 03:47:34 +00:00
Joe Tsai
08f19bbde1 go/printer: forbid empty line before first comment in block
To improve readability when exported fields are removed,
forbid the printer from emitting an empty line before the first comment
in a const, var, or type block.
Also, when printing the "Has filtered or unexported fields." message,
add an empty line before it to separate the message from the struct
or interfact contents.

Before the change:
<<<
type NamedArg struct {

        // Name is the name of the parameter placeholder.
        //
        // If empty, the ordinal position in the argument list will be
        // used.
        //
        // Name must omit any symbol prefix.
        Name string

        // Value is the value of the parameter.
        // It may be assigned the same value types as the query
        // arguments.
        Value interface{}
        // contains filtered or unexported fields
}
>>>

After the change:
<<<
type NamedArg struct {
        // Name is the name of the parameter placeholder.
        //
        // If empty, the ordinal position in the argument list will be
        // used.
        //
        // Name must omit any symbol prefix.
        Name string

        // Value is the value of the parameter.
        // It may be assigned the same value types as the query
        // arguments.
        Value interface{}

        // contains filtered or unexported fields
}
>>>

Fixes #18264

Change-Id: I9fe17ca39cf92fcdfea55064bd2eaa784ce48c88
Reviewed-on: https://go-review.googlesource.com/71990
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-11-02 18:17:22 +00:00
Tim Cooper
731b632172 crypto, hash: implement BinaryMarshaler, BinaryUnmarshaler in hash implementations
The marshal method allows the hash's internal state to be serialized and
unmarshaled at a later time, without having the re-write the entire stream
of data that was already written to the hash.

Fixes #20573

Change-Id: I40bbb84702ac4b7c5662f99bf943cdf4081203e5
Reviewed-on: https://go-review.googlesource.com/66710
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-11-01 21:04:12 +00:00
Mikio Hara
7b659eb155 all: gofmt
Change-Id: I2d0439a9f068e726173afafe2ef1f5d62b7feb4d
Reviewed-on: https://go-review.googlesource.com/46190
Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-06-21 03:14:30 +00:00
Martin Möhrmann
69972aea74 internal/cpu: new package to detect cpu features
Implements detection of x86 cpu features that
are used in the go standard library.

Changes all standard library packages to use the new cpu package
instead of using runtime internal variables to check x86 cpu features.

Updates: #15403

Change-Id: I2999a10cb4d9ec4863ffbed72f4e021a1dbc4bb9
Reviewed-on: https://go-review.googlesource.com/41476
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-05-10 17:02:21 +00:00
Wei Xiao
ab636b899c hash/crc32: optimize arm64 crc32 implementation
ARMv8 defines crc32 instruction.

Comparing to the original crc32 calculation, this patch makes use of
crc32 instructions to do crc32 calculation instead of the multiple
lookup table algorithms.

ARMv8 provides IEEE and Castagnoli polynomials for crc32 calculation
so that the perfomance of these two types of crc32 get significant
improved.

name                                        old time/op   new time/op    delta
CRC32/poly=IEEE/size=15/align=0-32            117ns ± 0%      38ns ± 0%   -67.44%
CRC32/poly=IEEE/size=15/align=1-32            117ns ± 0%      38ns ± 0%   -67.52%
CRC32/poly=IEEE/size=40/align=0-32            129ns ± 0%      41ns ± 0%   -68.37%
CRC32/poly=IEEE/size=40/align=1-32            129ns ± 0%      41ns ± 0%   -68.29%
CRC32/poly=IEEE/size=512/align=0-32           828ns ± 0%     246ns ± 0%   -70.29%
CRC32/poly=IEEE/size=512/align=1-32           828ns ± 0%     132ns ± 0%   -84.06%
CRC32/poly=IEEE/size=1kB/align=0-32          1.58µs ± 0%    0.46µs ± 0%   -70.98%
CRC32/poly=IEEE/size=1kB/align=1-32          1.58µs ± 0%    0.46µs ± 0%   -70.92%
CRC32/poly=IEEE/size=4kB/align=0-32          6.06µs ± 0%    1.74µs ± 0%   -71.27%
CRC32/poly=IEEE/size=4kB/align=1-32          6.10µs ± 0%    1.74µs ± 0%   -71.44%
CRC32/poly=IEEE/size=32kB/align=0-32         48.3µs ± 0%    13.7µs ± 0%   -71.61%
CRC32/poly=IEEE/size=32kB/align=1-32         48.3µs ± 0%    13.7µs ± 0%   -71.60%
CRC32/poly=Castagnoli/size=15/align=0-32      116ns ± 0%      38ns ± 0%   -67.07%
CRC32/poly=Castagnoli/size=15/align=1-32      116ns ± 0%      38ns ± 0%   -66.90%
CRC32/poly=Castagnoli/size=40/align=0-32      127ns ± 0%      40ns ± 0%   -68.11%
CRC32/poly=Castagnoli/size=40/align=1-32      127ns ± 0%      40ns ± 0%   -68.11%
CRC32/poly=Castagnoli/size=512/align=0-32     828ns ± 0%     132ns ± 0%   -84.06%
CRC32/poly=Castagnoli/size=512/align=1-32     827ns ± 0%     132ns ± 0%   -84.04%
CRC32/poly=Castagnoli/size=1kB/align=0-32    1.59µs ± 0%    0.22µs ± 0%   -85.89%
CRC32/poly=Castagnoli/size=1kB/align=1-32    1.58µs ± 0%    0.22µs ± 0%   -85.79%
CRC32/poly=Castagnoli/size=4kB/align=0-32    6.14µs ± 0%    0.77µs ± 0%   -87.40%
CRC32/poly=Castagnoli/size=4kB/align=1-32    6.06µs ± 0%    0.77µs ± 0%   -87.25%
CRC32/poly=Castagnoli/size=32kB/align=0-32   48.3µs ± 0%     5.9µs ± 0%   -87.71%
CRC32/poly=Castagnoli/size=32kB/align=1-32   48.4µs ± 0%     6.0µs ± 0%   -87.69%
CRC32/poly=Koopman/size=15/align=0-32         104ns ± 0%     104ns ± 0%    +0.00%
CRC32/poly=Koopman/size=15/align=1-32         104ns ± 0%     104ns ± 0%    +0.00%
CRC32/poly=Koopman/size=40/align=0-32         235ns ± 0%     235ns ± 0%    +0.00%
CRC32/poly=Koopman/size=40/align=1-32         235ns ± 0%     235ns ± 0%    +0.00%
CRC32/poly=Koopman/size=512/align=0-32       2.71µs ± 0%    2.71µs ± 0%    -0.07%
CRC32/poly=Koopman/size=512/align=1-32       2.71µs ± 0%    2.71µs ± 0%    -0.04%
CRC32/poly=Koopman/size=1kB/align=0-32       5.40µs ± 0%    5.39µs ± 0%    -0.06%
CRC32/poly=Koopman/size=1kB/align=1-32       5.40µs ± 0%    5.40µs ± 0%    +0.02%
CRC32/poly=Koopman/size=4kB/align=0-32       21.5µs ± 0%    21.5µs ± 0%    -0.16%
CRC32/poly=Koopman/size=4kB/align=1-32       21.5µs ± 0%    21.5µs ± 0%    -0.05%
CRC32/poly=Koopman/size=32kB/align=0-32       172µs ± 0%     172µs ± 0%    -0.07%
CRC32/poly=Koopman/size=32kB/align=1-32       172µs ± 0%     172µs ± 0%    -0.01%

name                                        old speed     new speed      delta
CRC32/poly=IEEE/size=15/align=0-32          128MB/s ± 0%   394MB/s ± 0%  +207.95%
CRC32/poly=IEEE/size=15/align=1-32          128MB/s ± 0%   394MB/s ± 0%  +208.09%
CRC32/poly=IEEE/size=40/align=0-32          310MB/s ± 0%   979MB/s ± 0%  +216.07%
CRC32/poly=IEEE/size=40/align=1-32          310MB/s ± 0%   979MB/s ± 0%  +216.16%
CRC32/poly=IEEE/size=512/align=0-32         618MB/s ± 0%  2074MB/s ± 0%  +235.72%
CRC32/poly=IEEE/size=512/align=1-32         618MB/s ± 0%  3852MB/s ± 0%  +523.55%
CRC32/poly=IEEE/size=1kB/align=0-32         646MB/s ± 0%  2225MB/s ± 0%  +244.57%
CRC32/poly=IEEE/size=1kB/align=1-32         647MB/s ± 0%  2225MB/s ± 0%  +243.87%
CRC32/poly=IEEE/size=4kB/align=0-32         676MB/s ± 0%  2352MB/s ± 0%  +248.02%
CRC32/poly=IEEE/size=4kB/align=1-32         672MB/s ± 0%  2352MB/s ± 0%  +250.15%
CRC32/poly=IEEE/size=32kB/align=0-32        678MB/s ± 0%  2387MB/s ± 0%  +252.17%
CRC32/poly=IEEE/size=32kB/align=1-32        678MB/s ± 0%  2388MB/s ± 0%  +252.11%
CRC32/poly=Castagnoli/size=15/align=0-32    129MB/s ± 0%   393MB/s ± 0%  +205.51%
CRC32/poly=Castagnoli/size=15/align=1-32    129MB/s ± 0%   390MB/s ± 0%  +203.41%
CRC32/poly=Castagnoli/size=40/align=0-32    314MB/s ± 0%   988MB/s ± 0%  +215.04%
CRC32/poly=Castagnoli/size=40/align=1-32    314MB/s ± 0%   987MB/s ± 0%  +214.68%
CRC32/poly=Castagnoli/size=512/align=0-32   618MB/s ± 0%  3860MB/s ± 0%  +524.32%
CRC32/poly=Castagnoli/size=512/align=1-32   619MB/s ± 0%  3859MB/s ± 0%  +523.66%
CRC32/poly=Castagnoli/size=1kB/align=0-32   645MB/s ± 0%  4568MB/s ± 0%  +608.56%
CRC32/poly=Castagnoli/size=1kB/align=1-32   650MB/s ± 0%  4567MB/s ± 0%  +602.94%
CRC32/poly=Castagnoli/size=4kB/align=0-32   667MB/s ± 0%  5297MB/s ± 0%  +693.81%
CRC32/poly=Castagnoli/size=4kB/align=1-32   676MB/s ± 0%  5297MB/s ± 0%  +684.00%
CRC32/poly=Castagnoli/size=32kB/align=0-32  678MB/s ± 0%  5519MB/s ± 0%  +713.83%
CRC32/poly=Castagnoli/size=32kB/align=1-32  677MB/s ± 0%  5497MB/s ± 0%  +712.04%
CRC32/poly=Koopman/size=15/align=0-32       143MB/s ± 0%   144MB/s ± 0%    +0.27%
CRC32/poly=Koopman/size=15/align=1-32       143MB/s ± 0%   144MB/s ± 0%    +0.33%
CRC32/poly=Koopman/size=40/align=0-32       169MB/s ± 0%   170MB/s ± 0%    +0.12%
CRC32/poly=Koopman/size=40/align=1-32       170MB/s ± 0%   170MB/s ± 0%    +0.08%
CRC32/poly=Koopman/size=512/align=0-32      189MB/s ± 0%   189MB/s ± 0%    +0.07%
CRC32/poly=Koopman/size=512/align=1-32      189MB/s ± 0%   189MB/s ± 0%    +0.04%
CRC32/poly=Koopman/size=1kB/align=0-32      190MB/s ± 0%   190MB/s ± 0%    +0.05%
CRC32/poly=Koopman/size=1kB/align=1-32      190MB/s ± 0%   190MB/s ± 0%    -0.01%
CRC32/poly=Koopman/size=4kB/align=0-32      190MB/s ± 0%   190MB/s ± 0%    +0.15%
CRC32/poly=Koopman/size=4kB/align=1-32      190MB/s ± 0%   191MB/s ± 0%    +0.05%
CRC32/poly=Koopman/size=32kB/align=0-32     191MB/s ± 0%   191MB/s ± 0%    +0.06%
CRC32/poly=Koopman/size=32kB/align=1-32     191MB/s ± 0%   191MB/s ± 0%    +0.02%

Also fix a bug of arm64 assembler

The optimization is mainly contributed by Fangming.Fang <fangming.fang@arm.com>

Change-Id: I900678c2e445d7e8ad9e2a9ab3305d649230905f
Reviewed-on: https://go-review.googlesource.com/40074
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-13 12:44:10 +00:00
Lucas Clemente
e05de6a5be hash/fnv: add 128-bit FNV hash support
The 128bit FNV hash will be used e.g. in QUIC.

The algorithm is described at
https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function

Change-Id: I13f3ec39b0e12b7a5008824a6619dff2e708ee81
Reviewed-on: https://go-review.googlesource.com/38356
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-13 01:28:48 +00:00
Eric Lagergren
094498c9a1 all: fix minor misspellings
Change-Id: I1f1cfb161640eb8756fb1a283892d06b30b7a8fa
Reviewed-on: https://go-review.googlesource.com/39356
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-03 23:19:07 +00:00
Lynn Boger
b6cd22c277 hash/crc32: improve performance for ppc64le
This change improves the performance of crc32 for ppc64le by using
vpmsum and other vector instructions in the algorithm.

The testcase was updated to test more sizes.

Fixes #19570

BenchmarkCRC32/poly=IEEE/size=15/align=0-8             90.5          81.8          -9.61%
BenchmarkCRC32/poly=IEEE/size=15/align=1-8             89.7          81.7          -8.92%
BenchmarkCRC32/poly=IEEE/size=40/align=0-8             93.2          61.1          -34.44%
BenchmarkCRC32/poly=IEEE/size=40/align=1-8             92.8          60.9          -34.38%
BenchmarkCRC32/poly=IEEE/size=512/align=0-8            501           55.8          -88.86%
BenchmarkCRC32/poly=IEEE/size=512/align=1-8            502           132           -73.71%
BenchmarkCRC32/poly=IEEE/size=1kB/align=0-8            947           69.9          -92.62%
BenchmarkCRC32/poly=IEEE/size=1kB/align=1-8            946           144           -84.78%
BenchmarkCRC32/poly=IEEE/size=4kB/align=0-8            3602          186           -94.84%
BenchmarkCRC32/poly=IEEE/size=4kB/align=1-8            3603          263           -92.70%
BenchmarkCRC32/poly=IEEE/size=32kB/align=0-8           28404         1338          -95.29%
BenchmarkCRC32/poly=IEEE/size=32kB/align=1-8           28856         1405          -95.13%
BenchmarkCRC32/poly=Castagnoli/size=15/align=0-8       89.7          81.8          -8.81%
BenchmarkCRC32/poly=Castagnoli/size=15/align=1-8       89.8          81.9          -8.80%
BenchmarkCRC32/poly=Castagnoli/size=40/align=0-8       93.8          61.4          -34.54%
BenchmarkCRC32/poly=Castagnoli/size=40/align=1-8       94.3          61.3          -34.99%
BenchmarkCRC32/poly=Castagnoli/size=512/align=0-8      503           56.4          -88.79%
BenchmarkCRC32/poly=Castagnoli/size=512/align=1-8      502           132           -73.71%
BenchmarkCRC32/poly=Castagnoli/size=1kB/align=0-8      941           70.2          -92.54%
BenchmarkCRC32/poly=Castagnoli/size=1kB/align=1-8      943           145           -84.62%
BenchmarkCRC32/poly=Castagnoli/size=4kB/align=0-8      3588          186           -94.82%
BenchmarkCRC32/poly=Castagnoli/size=4kB/align=1-8      3595          264           -92.66%
BenchmarkCRC32/poly=Castagnoli/size=32kB/align=0-8     28266         1323          -95.32%
BenchmarkCRC32/poly=Castagnoli/size=32kB/align=1-8     28344         1404          -95.05%

Change-Id: Ic4d8274c66e0e87bfba5f609f508a3877aee6bb5
Reviewed-on: https://go-review.googlesource.com/38184
Reviewed-by: David Chase <drchase@google.com>
2017-03-17 12:28:57 +00:00
Russ Cox
04e0a7622c hash/crc32: use sub-benchmarks
Change-Id: Iae68a097a6897f1616f94fdc3548837ef200e66f
Reviewed-on: https://go-review.googlesource.com/36541
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2017-02-08 17:17:08 +00:00
Radu Berinde
bdde10137b hash/crc32: cleanup code and improve tests
Major reorganization of the crc32 code:

 - The arch-specific files now implement a well-defined interface
   (documented in crc32.go). They no longer have the responsibility of
   initializing and falling back to a non-accelerated implementation;
   instead, that happens in the higher level code.

 - The non-accelerated algorithms are moved to a separate file with no
   dependencies on other code.

 - The "cutoff" optimization for slicing-by-8 is moved inside the
   algorithm itself (as opposed to every callsite).

Tests are significantly improved:
 - direct tests for the non-accelerated algorithms.
 - "cross-check" tests for arch-specific implementations (all archs).
 - tests for misaligned buffers for both IEEE and Castagnoli.

Fixes #16909.

Change-Id: I9b6dd83b7a57cd615eae901c0a6d61c6b8091c74
Reviewed-on: https://go-review.googlesource.com/27935
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-31 15:17:57 +00:00
Radu Berinde
8c15a17251 hash/crc32: fix nil Castagnoli table problem
When SSE is available, we don't need the Table. However, it is
returned as a handle by MakeTable. Fix this to always generate
the table.

Further cleanup is discussed in #16909.

Change-Id: Ic05400d68c6b5d25073ebd962000451746137afc
Reviewed-on: https://go-review.googlesource.com/27934
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-28 19:01:07 +00:00
Radu Berinde
90c3cf4b52 hash/crc32: improve the AMD64 implementation using SSE4.2
The algorithm is explained in the comments. The improvement in
throughput is about 1.4x for buffers between 500b-4Kb and 2.5x-2.6x
for larger buffers.

Additionally, we no longer initialize the software tables if SSE4.2 is
available.

Adding a test for the SSE implementation (restricted to amd64 and
amd64p32).

Benchmarks on a Haswell i5-4670 @ 3.4 GHz:

name                           old time/op    new time/op     delta
CastagnoliCrc15B-4               21.9ns ± 1%     22.9ns ± 0%    +4.45%
CastagnoliCrc15BMisaligned-4     22.6ns ± 0%     23.4ns ± 0%    +3.43%
CastagnoliCrc40B-4               23.3ns ± 0%     23.9ns ± 0%    +2.58%
CastagnoliCrc40BMisaligned-4     25.4ns ± 0%     26.1ns ± 0%    +2.86%
CastagnoliCrc512-4               72.6ns ± 0%     52.8ns ± 0%   -27.33%
CastagnoliCrc512Misaligned-4     76.3ns ± 1%     56.3ns ± 0%   -26.18%
CastagnoliCrc1KB-4                128ns ± 1%       89ns ± 0%   -30.04%
CastagnoliCrc1KBMisaligned-4      130ns ± 0%       88ns ± 0%   -32.65%
CastagnoliCrc4KB-4                461ns ± 0%      187ns ± 0%   -59.40%
CastagnoliCrc4KBMisaligned-4      463ns ± 0%      191ns ± 0%   -58.77%
CastagnoliCrc32KB-4              3.58µs ± 0%     1.35µs ± 0%   -62.22%
CastagnoliCrc32KBMisaligned-4    3.58µs ± 0%     1.36µs ± 0%   -61.84%

name                           old speed      new speed       delta
CastagnoliCrc15B-4              684MB/s ± 1%    655MB/s ± 0%    -4.32%
CastagnoliCrc15BMisaligned-4    663MB/s ± 0%    641MB/s ± 0%    -3.32%
CastagnoliCrc40B-4             1.72GB/s ± 0%   1.67GB/s ± 0%    -2.69%
CastagnoliCrc40BMisaligned-4   1.58GB/s ± 0%   1.53GB/s ± 0%    -2.82%
CastagnoliCrc512-4             7.05GB/s ± 0%   9.70GB/s ± 0%   +37.59%
CastagnoliCrc512Misaligned-4   6.71GB/s ± 1%   9.09GB/s ± 0%   +35.43%
CastagnoliCrc1KB-4             7.98GB/s ± 1%  11.46GB/s ± 0%   +43.55%
CastagnoliCrc1KBMisaligned-4   7.86GB/s ± 0%  11.70GB/s ± 0%   +48.75%
CastagnoliCrc4KB-4             8.87GB/s ± 0%  21.80GB/s ± 0%  +145.69%
CastagnoliCrc4KBMisaligned-4   8.83GB/s ± 0%  21.39GB/s ± 0%  +142.25%
CastagnoliCrc32KB-4            9.15GB/s ± 0%  24.22GB/s ± 0%  +164.62%
CastagnoliCrc32KBMisaligned-4  9.16GB/s ± 0%  24.00GB/s ± 0%  +161.94%

Fixes #16107.

Change-Id: Ibe50ea76574674ce0571ef31c31015e0ed66b907
Reviewed-on: https://go-review.googlesource.com/27931
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-28 01:39:03 +00:00
Keith Randall
3427f16642 Revert "hash/crc32: improve the AMD64 implementation using SSE4.2"
This reverts commit 54d7de7dd6.

It was breaking non-amd64 builds.

Change-Id: I22650e922498eeeba3d4fa08bb4ea40a210c8f97
Reviewed-on: https://go-review.googlesource.com/27925
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-27 16:49:02 +00:00
Radu Berinde
54d7de7dd6 hash/crc32: improve the AMD64 implementation using SSE4.2
The algorithm is explained in the comments. The improvement in
throughput is about 1.4x for buffers between 500b-4Kb and 2.5x-2.6x
for larger buffers.

Additionally, we no longer initialize the software tables if SSE4.2 is
available.

Benchmarks on a Haswell i5-4670 @ 3.4 GHz:

name                           old time/op    new time/op     delta
CastagnoliCrc15B-4               21.9ns ± 1%     22.9ns ± 0%    +4.45%
CastagnoliCrc15BMisaligned-4     22.6ns ± 0%     23.4ns ± 0%    +3.43%
CastagnoliCrc40B-4               23.3ns ± 0%     23.9ns ± 0%    +2.58%
CastagnoliCrc40BMisaligned-4     25.4ns ± 0%     26.1ns ± 0%    +2.86%
CastagnoliCrc512-4               72.6ns ± 0%     52.8ns ± 0%   -27.33%
CastagnoliCrc512Misaligned-4     76.3ns ± 1%     56.3ns ± 0%   -26.18%
CastagnoliCrc1KB-4                128ns ± 1%       89ns ± 0%   -30.04%
CastagnoliCrc1KBMisaligned-4      130ns ± 0%       88ns ± 0%   -32.65%
CastagnoliCrc4KB-4                461ns ± 0%      187ns ± 0%   -59.40%
CastagnoliCrc4KBMisaligned-4      463ns ± 0%      191ns ± 0%   -58.77%
CastagnoliCrc32KB-4              3.58µs ± 0%     1.35µs ± 0%   -62.22%
CastagnoliCrc32KBMisaligned-4    3.58µs ± 0%     1.36µs ± 0%   -61.84%

name                           old speed      new speed       delta
CastagnoliCrc15B-4              684MB/s ± 1%    655MB/s ± 0%    -4.32%
CastagnoliCrc15BMisaligned-4    663MB/s ± 0%    641MB/s ± 0%    -3.32%
CastagnoliCrc40B-4             1.72GB/s ± 0%   1.67GB/s ± 0%    -2.69%
CastagnoliCrc40BMisaligned-4   1.58GB/s ± 0%   1.53GB/s ± 0%    -2.82%
CastagnoliCrc512-4             7.05GB/s ± 0%   9.70GB/s ± 0%   +37.59%
CastagnoliCrc512Misaligned-4   6.71GB/s ± 1%   9.09GB/s ± 0%   +35.43%
CastagnoliCrc1KB-4             7.98GB/s ± 1%  11.46GB/s ± 0%   +43.55%
CastagnoliCrc1KBMisaligned-4   7.86GB/s ± 0%  11.70GB/s ± 0%   +48.75%
CastagnoliCrc4KB-4             8.87GB/s ± 0%  21.80GB/s ± 0%  +145.69%
CastagnoliCrc4KBMisaligned-4   8.83GB/s ± 0%  21.39GB/s ± 0%  +142.25%
CastagnoliCrc32KB-4            9.15GB/s ± 0%  24.22GB/s ± 0%  +164.62%
CastagnoliCrc32KBMisaligned-4  9.16GB/s ± 0%  24.00GB/s ± 0%  +161.94%

Fixes #16107.

Change-Id: I8fa827ec03f708ba27ee71c833f7544ad9dc5bc3
Reviewed-on: https://go-review.googlesource.com/24471
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-27 15:50:28 +00:00
Michael Munday
4b17b152a3 hash/crc32: fix optimized s390x implementation
The code wasn't checking to see if the data was still >= 64 bytes
long after aligning it.

Aligning the data is an optimization and we don't actually need
to do it. In fact for smaller sizes it slows things down due to
the overhead of calling the generic function. Therefore for now
I have simply removed the alignment stage. I have also added a
check into the assembly to deliberately trigger a segmentation
fault if the data is too short.

Fixes #16779.

Change-Id: Ic01636d775efc5ec97689f050991cee04ce8fe73
Reviewed-on: https://go-review.googlesource.com/27409
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-21 02:04:43 +00:00
Radu Berinde
0c819b654f hash/crc32: improve the processing of the last bytes in the SSE4.2 code for AMD64
This commit improves the processing of the final few bytes in
castagnoliSSE42: instead of processing one byte at a time, we use all
versions of the CRC32 instruction to process 4 bytes, then 2, then 1.
The difference is only noticeable for small "odd" sized buffers.

We do the similar improvement for processing the first few bytes in
the case of unaligned buffer.

Fixing the test which was not actually verifying the results for
misaligned buffers (WriteString was creating an internal copy which
was aligned).

Adding benchmarks for length 15 (aligned and misaligned), results
below.

name                          old time/op    new time/op    delta
CastagnoliCrc15B-4              25.1ns ± 0%    22.1ns ± 1%  -12.14%
CastagnoliCrc15BMisaligned-4    25.2ns ± 0%    22.9ns ± 1%   -9.03%
CastagnoliCrc40B-4              23.1ns ± 0%    23.4ns ± 0%   +1.08%
CastagnoliCrc1KB-4               127ns ± 0%     128ns ± 0%   +1.18%
CastagnoliCrc4KB-4               462ns ± 0%     464ns ± 0%     ~
CastagnoliCrc32KB-4             3.58µs ± 0%    3.60µs ± 0%   +0.58%

name                          old speed      new speed      delta
CastagnoliCrc15B-4             597MB/s ± 0%   679MB/s ± 1%  +13.77%
CastagnoliCrc15BMisaligned-4   596MB/s ± 0%   655MB/s ± 1%   +9.94%
CastagnoliCrc40B-4            1.73GB/s ± 0%  1.71GB/s ± 0%   -1.14%
CastagnoliCrc1KB-4            8.01GB/s ± 0%  7.93GB/s ± 1%   -1.06%
CastagnoliCrc4KB-4            8.86GB/s ± 0%  8.83GB/s ± 0%     ~
CastagnoliCrc32KB-4           9.14GB/s ± 0%  9.09GB/s ± 0%   -0.58%

Change-Id: I499e37af2241d28e3e5d522bbab836c1a718430a
Reviewed-on: https://go-review.googlesource.com/24470
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-17 21:20:50 +00:00
Ilya Tocar
9d73e146da hash/crc64: Use slicing by 8.
Similar to crc32 slicing by 8.
This also fixes a Crc64KB benchmark actually using 1024 bytes.

Crc64/ISO64KB-4       147µs ± 0%      37µs ± 0%   -75.05%  (p=0.000 n=18+18)
Crc64/ISO4KB-4       9.19µs ± 0%    2.33µs ± 0%   -74.70%  (p=0.000 n=19+20)
Crc64/ISO1KB-4       2.31µs ± 0%    0.60µs ± 0%   -73.81%  (p=0.000 n=19+15)
Crc64/ECMA64KB-4      147µs ± 0%      37µs ± 0%   -75.05%  (p=0.000 n=20+20)
Crc64/Random64KB-4    147µs ± 0%      41µs ± 0%   -72.17%  (p=0.000 n=20+18)
Crc64/Random16KB-4   36.7µs ± 0%    36.5µs ± 0%    -0.54%  (p=0.000 n=18+19)

name                old speed     new speed      delta
Crc64/ISO64KB-4     446MB/s ± 0%  1788MB/s ± 0%  +300.72%  (p=0.000 n=18+18)
Crc64/ISO4KB-4      446MB/s ± 0%  1761MB/s ± 0%  +295.20%  (p=0.000 n=18+20)
Crc64/ISO1KB-4      444MB/s ± 0%  1694MB/s ± 0%  +281.46%  (p=0.000 n=19+20)
Crc64/ECMA64KB-4    446MB/s ± 0%  1788MB/s ± 0%  +300.77%  (p=0.000 n=20+20)
Crc64/Random64KB-4  446MB/s ± 0%  1603MB/s ± 0%  +259.32%  (p=0.000 n=20+18)
Crc64/Random16KB-4  446MB/s ± 0%   448MB/s ± 0%    +0.54%  (p=0.000 n=18+20)

Change-Id: I1c7621d836c486d6bfc41dbe1ec2ff9ab11aedfc
Reviewed-on: https://go-review.googlesource.com/22222
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2016-05-18 14:38:04 +00:00
Chris Zou
5833d843de hash/crc32: use vector instructions on s390x
The input buffer is aligned to a doubleword boundary to
improve performance of the vector instructions. The pure
Go implementation is used to align the input data, and is
also used when the vector instructions are not available
or the data length is less than 64 bytes.

Change-Id: Ie259a5f2f1562bcc17961c99e5776c99091d6bed
Reviewed-on: https://go-review.googlesource.com/22201
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-22 18:07:15 +00:00
Ilya Tocar
89a1f02834 hash/adler32: Unroll loop for extra performance.
name         old time/op    new time/op    delta
Adler32KB-4     592ns ± 0%     447ns ± 0%  -24.49%  (p=0.000 n=19+20)

name         old speed      new speed      delta
Adler32KB-4  1.73GB/s ± 0%  2.29GB/s ± 0%  +32.41%  (p=0.000 n=20+20)

Change-Id: I38990aa66ca4452a886200018a57c0bc3af30717
Reviewed-on: https://go-review.googlesource.com/21880
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-15 10:17:17 +00:00
Michael Munday
8edf4cb27d hash/crc32: invert build tags for go implementation
It seems cleaner and more consistent with other files to list the
architectures that have assembly implementations rather than to
list those that do not.

This means we don't have to add s390x and future platforms to this
list.

Change-Id: I2ad3f66b76eb1711333c910236ca7f5151b698e5
Reviewed-on: https://go-review.googlesource.com/21770
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-12 16:30:25 +00:00
Ilya Tocar
f5bd3556f5 hash/crc64: Add tests for ECMA polynomial
Currently we test crc64 only with ISO polynomial.

Change-Id: Ibc5e202db3b960369cbbb18e31eb0fea07b54dba
Reviewed-on: https://go-review.googlesource.com/21309
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-03-31 20:42:02 +00:00