1
0
mirror of https://github.com/golang/go synced 2024-11-07 23:36:11 -07:00
Commit Graph

96 Commits

Author SHA1 Message Date
bill_ofarrell
a5f8128e39 bytes, strings: fix comparison of long byte slices on s390x
The existing implementation of bytes.Compare on s390x doesn't work properly for slices longer
than 256 elements. This change fixes that. Added tests for long strings and slices of bytes.

Fixes #26114

Change-Id: If6d8b68ee6dbcf99a24f867a1d3438b1f208954f
Reviewed-on: https://go-review.googlesource.com/121495
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-06-29 20:48:07 +00:00
Dave Russell
23b687eccb bytes: re-slice buffer to its previous length after call to grow()
Fixes #25435

The added test fails without the re-slice and passes with it.

Change-Id: I5ebc2a737285eb116ecc5938d8bf49050652830f
GitHub-Last-Rev: 454ddad7df
GitHub-Pull-Request: golang/go#25436
Reviewed-on: https://go-review.googlesource.com/113495
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-29 19:46:33 +00:00
Eric Pauley
9d11c63b64 bytes, strings: improve EqualFold fast version for ASCII
The existing implementation only considers the special ASCII
case when the lower character is an upper case letter. This
means that most ASCII comparisons use unicode.SimpleFold even
when it is not necessary.

benchmark                old ns/op     new ns/op     delta
BenchmarkEqualFold-8     450           390           -13.33%

Change-Id: I735ca3c30fc0145c186d2a54f31fd39caab2c3fa
Reviewed-on: https://go-review.googlesource.com/110018
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-01 18:52:19 +00:00
Keith Randall
ee58eccc56 internal/bytealg: move short string Index implementations into bytealg
Also move the arm64 CountByte implementation while we're here.

Fixes #19792

Change-Id: I1e0fdf1e03e3135af84150a2703b58dad1b0d57e
Reviewed-on: https://go-review.googlesource.com/98518
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-04 19:49:44 +00:00
Keith Randall
f6332bb84a internal/bytealg: move compare functions to bytealg
Move bytes.Compare and runtime·cmpstring to bytealg.

Update #19792

Change-Id: I139e6d7c59686bef7a3017e3dec99eba5fd10447
Reviewed-on: https://go-review.googlesource.com/98515
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-04 17:49:39 +00:00
Keith Randall
45964e4f9c internal/bytealg: move Count to bytealg
Move bytes.Count and strings.Count to bytealg.

Update #19792

Change-Id: I3e4e14b504a0b71758885bb131e5656e342cf8cb
Reviewed-on: https://go-review.googlesource.com/98495
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-04 17:49:25 +00:00
Keith Randall
403ab0f221 internal/bytealg: move IndexByte asssembly to the new bytealg package
Move the IndexByte function from the runtime to a new bytealg package.
The new package will eventually hold all the optimized assembly for
groveling through byte slices and strings. It seems a better home for
this code than randomly keeping it in runtime.

Once this is in, the next step is to move the other functions
(Compare, Equal, ...).

Update #19792

This change seems complicated enough that we might just declare
"not worth it" and abandon.  Opinions welcome.

The core assembly is all unchanged, except minor modifications where
the code reads cpu feature bits.

The wrapper functions have been cleaned up as they are now actually
checked by vet.

Change-Id: I9fa75bee5d85db3a65b3fd3b7997e60367523796
Reviewed-on: https://go-review.googlesource.com/98016
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-02 22:46:15 +00:00
Wei Xiao
562346b7d0 bytes: add asm version of Index for short strings on arm64
Currently we have special case for 1-byte strings,
this extends it to strings shorter than 9 bytes on arm64.

Benchmark results:
name                              old time/op    new time/op    delta
IndexByte/10-32                     18.6ns ± 0%    18.1ns ± 0%    -2.69%  (p=0.008 n=5+5)
IndexByte/32-32                     16.8ns ± 1%    16.9ns ± 1%      ~     (p=0.762 n=5+5)
IndexByte/4K-32                      464ns ± 0%     464ns ± 0%      ~     (all equal)
IndexByte/4M-32                      528µs ± 1%     506µs ± 1%    -4.17%  (p=0.008 n=5+5)
IndexByte/64M-32                    18.7ms ± 0%    18.7ms ± 1%      ~     (p=0.730 n=4+5)
IndexBytePortable/10-32             33.8ns ± 0%    34.9ns ± 3%      ~     (p=0.167 n=5+5)
IndexBytePortable/32-32             65.3ns ± 0%    66.1ns ± 2%      ~     (p=0.444 n=5+5)
IndexBytePortable/4K-32             5.88µs ± 0%    5.88µs ± 0%      ~     (p=0.325 n=5+5)
IndexBytePortable/4M-32             6.03ms ± 0%    6.03ms ± 0%      ~     (p=1.000 n=5+5)
IndexBytePortable/64M-32            98.8ms ± 0%    98.9ms ± 0%    +0.10%  (p=0.008 n=5+5)
IndexRune/10-32                     57.7ns ± 0%    49.2ns ± 0%   -14.73%  (p=0.000 n=5+4)
IndexRune/32-32                     57.7ns ± 0%    58.6ns ± 0%    +1.56%  (p=0.008 n=5+5)
IndexRune/4K-32                      511ns ± 0%     513ns ± 0%    +0.39%  (p=0.008 n=5+5)
IndexRune/4M-32                      527µs ± 1%     527µs ± 1%      ~     (p=0.690 n=5+5)
IndexRune/64M-32                    18.7ms ± 0%    18.7ms ± 1%      ~     (p=0.190 n=4+5)
IndexRuneASCII/10-32                23.8ns ± 0%    23.8ns ± 0%      ~     (all equal)
IndexRuneASCII/32-32                24.3ns ± 0%    24.3ns ± 0%      ~     (all equal)
IndexRuneASCII/4K-32                 468ns ± 0%     468ns ± 0%      ~     (all equal)
IndexRuneASCII/4M-32                 521µs ± 1%     531µs ± 2%    +1.91%  (p=0.016 n=5+5)
IndexRuneASCII/64M-32               18.6ms ± 1%    18.5ms ± 0%      ~     (p=0.730 n=5+4)
Index/10-32                         89.1ns ±13%    25.2ns ± 0%   -71.72%  (p=0.008 n=5+5)
Index/32-32                          225ns ± 2%     226ns ± 3%      ~     (p=0.683 n=5+5)
Index/4K-32                         11.9µs ± 0%    11.8µs ± 0%    -0.22%  (p=0.008 n=5+5)
Index/4M-32                         12.1ms ± 0%    12.1ms ± 0%      ~     (p=0.548 n=5+5)
Index/64M-32                         197ms ± 0%     197ms ± 0%      ~     (p=0.690 n=5+5)
IndexEasy/10-32                     46.2ns ± 0%    22.1ns ± 8%   -52.16%  (p=0.008 n=5+5)
IndexEasy/32-32                     46.2ns ± 0%    47.2ns ± 0%    +2.16%  (p=0.008 n=5+5)
IndexEasy/4K-32                      499ns ± 0%     502ns ± 0%    +0.44%  (p=0.008 n=5+5)
IndexEasy/4M-32                      529µs ± 2%     529µs ± 1%      ~     (p=0.841 n=5+5)
IndexEasy/64M-32                    18.6ms ± 1%    18.7ms ± 1%      ~     (p=0.222 n=5+5)
IndexAnyASCII/1:1-32                15.7ns ± 0%    15.7ns ± 0%      ~     (all equal)
IndexAnyASCII/1:2-32                17.2ns ± 0%    17.2ns ± 0%      ~     (all equal)
IndexAnyASCII/1:4-32                20.0ns ± 0%    20.0ns ± 0%      ~     (all equal)
IndexAnyASCII/1:8-32                34.8ns ± 0%    34.8ns ± 0%      ~     (all equal)
IndexAnyASCII/1:16-32               48.1ns ± 0%    48.1ns ± 0%      ~     (all equal)
IndexAnyASCII/16:1-32               97.9ns ± 1%    97.7ns ± 0%      ~     (p=0.857 n=5+5)
IndexAnyASCII/16:2-32                102ns ± 0%     102ns ± 0%      ~     (all equal)
IndexAnyASCII/16:4-32                116ns ± 1%     116ns ± 1%      ~     (p=1.000 n=5+5)
IndexAnyASCII/16:8-32                141ns ± 1%     141ns ± 0%      ~     (p=0.571 n=5+4)
IndexAnyASCII/16:16-32               178ns ± 0%     178ns ± 0%      ~     (all equal)
IndexAnyASCII/256:1-32              1.09µs ± 0%    1.09µs ± 0%      ~     (all equal)
IndexAnyASCII/256:2-32              1.09µs ± 0%    1.10µs ± 0%    +0.27%  (p=0.008 n=5+5)
IndexAnyASCII/256:4-32              1.11µs ± 0%    1.11µs ± 0%      ~     (p=0.397 n=5+5)
IndexAnyASCII/256:8-32              1.10µs ± 0%    1.10µs ± 0%      ~     (p=0.444 n=5+5)
IndexAnyASCII/256:16-32             1.14µs ± 0%    1.14µs ± 0%      ~     (all equal)
IndexAnyASCII/4096:1-32             16.5µs ± 0%    16.5µs ± 0%      ~     (p=1.000 n=5+5)
IndexAnyASCII/4096:2-32             17.0µs ± 0%    17.0µs ± 0%      ~     (p=0.159 n=5+4)
IndexAnyASCII/4096:4-32             17.1µs ± 0%    17.1µs ± 0%      ~     (p=0.921 n=4+5)
IndexAnyASCII/4096:8-32             16.5µs ± 0%    16.5µs ± 0%      ~     (p=0.460 n=5+5)
IndexAnyASCII/4096:16-32            16.5µs ± 0%    16.5µs ± 0%      ~     (p=0.794 n=5+4)
IndexPeriodic/IndexPeriodic2-32      189µs ± 0%     189µs ± 0%      ~     (p=0.841 n=5+5)
IndexPeriodic/IndexPeriodic4-32      189µs ± 0%     189µs ± 0%    -0.03%  (p=0.016 n=5+4)
IndexPeriodic/IndexPeriodic8-32      189µs ± 0%     189µs ± 0%      ~     (p=0.651 n=5+5)
IndexPeriodic/IndexPeriodic16-32     175µs ± 9%     174µs ± 7%      ~     (p=1.000 n=5+5)
IndexPeriodic/IndexPeriodic32-32    75.1µs ± 0%    75.1µs ± 0%      ~     (p=0.690 n=5+5)
IndexPeriodic/IndexPeriodic64-32    42.6µs ± 0%    44.7µs ± 0%    +4.98%  (p=0.008 n=5+5)

name                              old speed      new speed      delta
IndexByte/10-32                    538MB/s ± 0%   552MB/s ± 0%    +2.65%  (p=0.008 n=5+5)
IndexByte/32-32                   1.90GB/s ± 1%  1.90GB/s ± 1%      ~     (p=0.548 n=5+5)
IndexByte/4K-32                   8.82GB/s ± 0%  8.81GB/s ± 0%      ~     (p=0.548 n=5+5)
IndexByte/4M-32                   7.95GB/s ± 1%  8.29GB/s ± 1%    +4.35%  (p=0.008 n=5+5)
IndexByte/64M-32                  3.58GB/s ± 0%  3.60GB/s ± 1%      ~     (p=0.730 n=4+5)
IndexBytePortable/10-32            296MB/s ± 0%   286MB/s ± 3%      ~     (p=0.381 n=4+5)
IndexBytePortable/32-32            490MB/s ± 0%   485MB/s ± 2%      ~     (p=0.286 n=5+5)
IndexBytePortable/4K-32            697MB/s ± 0%   697MB/s ± 0%      ~     (p=0.413 n=5+5)
IndexBytePortable/4M-32            696MB/s ± 0%   695MB/s ± 0%      ~     (p=0.897 n=5+5)
IndexBytePortable/64M-32           679MB/s ± 0%   678MB/s ± 0%    -0.10%  (p=0.008 n=5+5)
IndexRune/10-32                    173MB/s ± 0%   203MB/s ± 0%   +17.24%  (p=0.016 n=5+4)
IndexRune/32-32                    555MB/s ± 0%   546MB/s ± 0%    -1.62%  (p=0.008 n=5+5)
IndexRune/4K-32                   8.01GB/s ± 0%  7.98GB/s ± 0%    -0.38%  (p=0.008 n=5+5)
IndexRune/4M-32                   7.97GB/s ± 1%  7.95GB/s ± 1%      ~     (p=0.690 n=5+5)
IndexRune/64M-32                  3.59GB/s ± 0%  3.58GB/s ± 1%      ~     (p=0.190 n=4+5)
IndexRuneASCII/10-32               420MB/s ± 0%   420MB/s ± 0%      ~     (p=0.190 n=5+4)
IndexRuneASCII/32-32              1.32GB/s ± 0%  1.32GB/s ± 0%      ~     (p=0.333 n=5+5)
IndexRuneASCII/4K-32              8.75GB/s ± 0%  8.75GB/s ± 0%      ~     (p=0.690 n=5+5)
IndexRuneASCII/4M-32              8.04GB/s ± 1%  7.89GB/s ± 2%    -1.87%  (p=0.016 n=5+5)
IndexRuneASCII/64M-32             3.61GB/s ± 1%  3.62GB/s ± 0%      ~     (p=0.730 n=5+4)
Index/10-32                        113MB/s ±14%   397MB/s ± 0%  +249.76%  (p=0.008 n=5+5)
Index/32-32                        142MB/s ± 2%   141MB/s ± 3%      ~     (p=0.794 n=5+5)
Index/4K-32                        345MB/s ± 0%   346MB/s ± 0%    +0.22%  (p=0.008 n=5+5)
Index/4M-32                        345MB/s ± 0%   345MB/s ± 0%      ~     (p=0.619 n=5+5)
Index/64M-32                       341MB/s ± 0%   341MB/s ± 0%      ~     (p=0.595 n=5+5)
IndexEasy/10-32                    216MB/s ± 0%   453MB/s ± 8%  +109.60%  (p=0.008 n=5+5)
IndexEasy/32-32                    692MB/s ± 0%   678MB/s ± 0%    -2.01%  (p=0.008 n=5+5)
IndexEasy/4K-32                   8.19GB/s ± 0%  8.16GB/s ± 0%    -0.45%  (p=0.008 n=5+5)
IndexEasy/4M-32                   7.93GB/s ± 2%  7.93GB/s ± 1%      ~     (p=0.841 n=5+5)
IndexEasy/64M-32                  3.60GB/s ± 1%  3.59GB/s ± 1%      ~     (p=0.222 n=5+5)

Change-Id: I4ca69378a2df6f9ba748c6a2706953ee1bd07343
Reviewed-on: https://go-review.googlesource.com/96555
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-01 15:24:33 +00:00
Brad Fitzpatrick
ed8b7dedd3 bytes: mention strings.Builder in Buffer.String docs
Fixes #22778

Change-Id: I37f7a59c15828aa720fe787fff42fb3ef17729c7
Reviewed-on: https://go-review.googlesource.com/80815
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-11-30 01:46:50 +00:00
Wei Xiao
9a14cd9e75 bytes: add optimized countByte for arm64
Use SIMD instructions when counting a single byte.
Inspired from runtime IndexByte implementation.

Benchmark results of bytes, where 1 byte in every 8 is the one we are looking:

name               old time/op   new time/op    delta
CountSingle/10-8    96.1ns ± 1%    38.8ns ± 0%    -59.64%  (p=0.000 n=9+7)
CountSingle/32-8     172ns ± 2%      36ns ± 1%    -79.27%  (p=0.000 n=10+10)
CountSingle/4K-8    18.2µs ± 1%     0.9µs ± 0%    -95.17%  (p=0.000 n=9+10)
CountSingle/4M-8    18.4ms ± 0%     0.9ms ± 0%    -95.00%  (p=0.000 n=10+9)
CountSingle/64M-8    284ms ± 4%      19ms ± 0%    -93.40%  (p=0.000 n=10+10)

name               old speed     new speed      delta
CountSingle/10-8   104MB/s ± 1%   258MB/s ± 0%   +147.99%  (p=0.000 n=9+10)
CountSingle/32-8   185MB/s ± 1%   897MB/s ± 1%   +385.33%  (p=0.000 n=9+10)
CountSingle/4K-8   225MB/s ± 1%  4658MB/s ± 0%  +1967.40%  (p=0.000 n=9+10)
CountSingle/4M-8   228MB/s ± 0%  4555MB/s ± 0%  +1901.71%  (p=0.000 n=10+9)
CountSingle/64M-8  236MB/s ± 4%  3575MB/s ± 0%  +1414.69%  (p=0.000 n=10+10)

Change-Id: Ifccb51b3c8658c49773fe05147c3cf3aead361e5
Reviewed-on: https://go-review.googlesource.com/71111
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-11-21 19:07:38 +00:00
Daniel Martí
f91ab6c025 bytes: don't use an iota for the readOp constants
As per the comments in golang.org/cl/78617. Also leaving a comment here,
to make sure noone else thinks to re-introduce the iota like I did.

Change-Id: I2a2275998b81896eaa0e9d5ee0197661ebe84acf
Reviewed-on: https://go-review.googlesource.com/78676
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-11-19 16:37:43 +00:00
Daniel Martí
962834dd14 bytes: make all readOp constants actually typed
This is a regression introduced in golang.org/cl/28817. That change got
rid of the iota, which meant that the type was no longer applied to all
the constant names.

Re-add the iota starting at -1, simplifying the code and adding the
types once more.

Change-Id: I38bd0e04f8d298196bccd33651e29f5011401a8d
Reviewed-on: https://go-review.googlesource.com/78617
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-11-18 17:07:05 +00:00
Russ Cox
2a166c93a3 bytes, strings: restore O(1) behavior of IndexAny(s, "") and LastIndexAny(s, "")
CL 65851 (bytes) and CL 65910 (strings) “improve[d] readability”
by removing the special case that bypassed the whole function body
when chars == "". In doing so, yes, the function was unindented a
level, which is nice, but the runtime of that case went from O(1) to O(n)
where n = len(s).

I don't know if anyone's code depends on the O(1) behavior in this case,
but quite possibly someone's does.

This CL adds the special case back, with a comment to prevent future
deletions, and without reindenting each function body in full.

Change-Id: I5aba33922b304dd1b8657e6d51d6c937a7f95c81
Reviewed-on: https://go-review.googlesource.com/78112
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-11-15 21:26:05 +00:00
Russ Cox
e7628bee6e bytes: make ExampleTrimLeft and ExampleTrimRight match
ExampleTrimLeft was inexplicably complex.

Change-Id: I13ca81bdeba728bdd632acf82e3a1101d29b9f39
Reviewed-on: https://go-review.googlesource.com/78111
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2017-11-15 21:25:13 +00:00
Russ Cox
22671e7344 bytes: change ExampleReader_Len to use a non-ASCII string
This should help make clear that Len is not counting runes.
Also delete empty string, which doesn't add much.

Change-Id: I1602352df1897fef6e855e9db0bababb8ab788ca
Reviewed-on: https://go-review.googlesource.com/78110
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2017-11-15 21:25:07 +00:00
Keith Randall
a025277505 bytes,strings: in generic Index, use mix of IndexByte and Rabin-Karp
Use IndexByte first, as it allows us to skip lots of bytes quickly.
If IndexByte is generating a lot of false positives, switch over to Rabin-Karp.

Experiments for ppc64le
bytes:
name                             old time/op  new time/op  delta
IndexPeriodic/IndexPeriodic2-2   1.12ms ± 0%  0.18ms ± 0%  -83.54%  (p=0.000 n=10+9)
IndexPeriodic/IndexPeriodic4-2    635µs ± 0%   184µs ± 0%  -71.06%  (p=0.000 n=9+9)
IndexPeriodic/IndexPeriodic8-2    289µs ± 0%   184µs ± 0%  -36.51%  (p=0.000 n=10+9)
IndexPeriodic/IndexPeriodic16-2   133µs ± 0%   183µs ± 0%  +37.68%  (p=0.000 n=10+9)
IndexPeriodic/IndexPeriodic32-2  68.3µs ± 0%  70.2µs ± 0%   +2.76%  (p=0.000 n=10+10)
IndexPeriodic/IndexPeriodic64-2  35.8µs ± 0%  36.6µs ± 0%   +2.17%  (p=0.000 n=8+10)

strings:
name                             old time/op  new time/op  delta
IndexPeriodic/IndexPeriodic2-2    184µs ± 0%   184µs ± 0%   +0.11%  (p=0.029 n=4+4)
IndexPeriodic/IndexPeriodic4-2    184µs ± 0%   184µs ± 0%     ~     (p=0.886 n=4+4)
IndexPeriodic/IndexPeriodic8-2    184µs ± 0%   184µs ± 0%     ~     (p=0.486 n=4+4)
IndexPeriodic/IndexPeriodic16-2   185µs ± 1%   184µs ± 0%     ~     (p=0.343 n=4+4)
IndexPeriodic/IndexPeriodic32-2   184µs ± 0%    69µs ± 0%  -62.37%  (p=0.029 n=4+4)
IndexPeriodic/IndexPeriodic64-2   184µs ± 0%    37µs ± 0%  -80.17%  (p=0.029 n=4+4)

Fixes #22578

Change-Id: If2a4d8554cb96bfd699b58149d13ac294615f8b8
Reviewed-on: https://go-review.googlesource.com/76070
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
2017-11-15 17:35:09 +00:00
Keith Randall
936b977c17 bytes: reduce work in IndexNearPageBoundary test
This test was taking too long on ppc64x.
There were a few reasons.

The first is that the page size on ppc64x is 64k instead of 4k.
That's 16x more work.

The second is that the generic Index is pretty bad in this case.
It first calls IndexByte which does a bunch of setup work only to find
the byte we're looking for at index 0.  Then it calls Equal which
has to look at the whole string to find a difference on the last byte.

To fix, just limit our attention to near the end of the page.

Change-Id: I6b8bcbb94652a2da853862acc23803def0c49303
Reviewed-on: https://go-review.googlesource.com/76050
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-11-04 11:09:49 +00:00
Ian Lance Taylor
a9e2479a44 bytes: set cap of slices returned by Split and Fields and friends
This avoids the problem in which appending to a slice returned by
Split can affect subsequent slices.

Fixes #21149.

Change-Id: Ie3df2b9ceeb9605d4625f47d49073c5f348cf0a1
Reviewed-on: https://go-review.googlesource.com/74510
Reviewed-by: Jelte Fennema <github-tech@jeltef.nl>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-11-03 19:06:15 +00:00
Keith Randall
3043c355f4 bytes: add more page boundary tests
Make sure Index and IndexByte don't read past the queried byte slice.

Hopefully will be helpful for CL 33597.

Also remove the code which maps/unmaps the Go heap.
Much safer to play with protection bits off-heap.

Change-Id: I50d73e879b2d83285e1bc7c3e810efe4c245fe75
Reviewed-on: https://go-review.googlesource.com/75890
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-11-03 18:51:51 +00:00
Javier Segura
7128ed0501 bytes: add examples of Equal and IndexByte
Change-Id: Ibf3179d0903eb443c89b6d886802c36f8d199898
Reviewed-on: https://go-review.googlesource.com/70933
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-10-16 03:34:28 +00:00
Afanasev Stanislav
07e36af7d6 bytes: panic in ReadFrom with more information with negative Read counts
This is only to aid in human debugging, and for that reason we maintain a panic, and not return an error.

Fixes #22097

Change-Id: If72e4d1e47ec9125ca7bc97d5fe4cedb7f76ae72
Reviewed-on: https://go-review.googlesource.com/67970
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-10-06 06:49:40 +00:00
Gabriel Aszalos
52abe50c33 bytes: correct Map documentation
Fix incorrect reference to string instead of byte slice.

Change-Id: I95553da32acfbcf5dde9613b07ea38408cb31ae8
Reviewed-on: https://go-review.googlesource.com/68090
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-10-04 14:27:22 +00:00
Tim Cooper
f2af0c1780 bytes: explicitly state if a function expects UTF-8-encoded data
Fixes #21950

Change-Id: I6fa392abd2c3bf6a4f80f14c6b1419470e9a944d
Reviewed-on: https://go-review.googlesource.com/66750
Run-TryBot: Rob Pike <r@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
2017-10-02 00:31:47 +00:00
Gabriel Aszalos
b71f39612a bytes: improve readability of IndexAny and LastIndexAny functions
This change removes the check of len(chars) > 0 inside the Index and
IndexAny functions which was redundant.

Change-Id: Ic4bf8b8a37d7f040d3ebd81b4fc45fcb386b639a
Reviewed-on: https://go-review.googlesource.com/65851
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-25 18:05:01 +00:00
Daniel Martí
6945c67e10 cmd/compile: merge bytes inline test with the rest
In golang.org/cl/42813, a test was added in the bytes package to check
if a Buffer method was being inlined, using 'go tool nm'.

Now that we have a compiler test that verifies that certain funcs are
inlineable, merge it there. Knowing whether the funcs are inlineable is
also more reliable than whether or not their symbol appears in the
binary, too. For example, under some circumstances, inlineable funcs
can't be inlined, such as if closures are used.

While at it, add a few more bytes.Buffer methods that are currently
inlined and should clearly stay that way.

Updates #21851.

Change-Id: I62066e32ef5542d37908bd64f90bda51276da4de
Reviewed-on: https://go-review.googlesource.com/65658
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-25 06:52:31 +00:00
Gabriel Aszalos
dd5a86f18c bytes: add documentation to reader methods
Some methods that were used to implement various `io` interfaces in the
Reader were documented, whereas others were not. This change adds
documentation to all the missing methods used to implement these
interfaces.

Change-Id: I2dac6e328542de3cd87e89510651cd6ba74a7b7d
Reviewed-on: https://go-review.googlesource.com/65231
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-21 18:57:04 +00:00
Gabriel Aszalos
977578816e bytes: improve test readability
This CL improves the readability of the tests in the bytes package by
naming the `data` test variable `testString`, using the same convention
as its counterpart, `testBytes`.

It additionally removes some type casting which was unnecessary.

Change-Id: If38b5606ce8bda0306bae24498f21cb8dbbb6377
Reviewed-on: https://go-review.googlesource.com/64931
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-20 13:41:03 +00:00
Gabriel Aszalos
a696db1be1 bytes: correct message in test log
Change-Id: Ib731874b9a37ff141e4305d8ccfdf7c165155da6
Reviewed-on: https://go-review.googlesource.com/64930
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-20 12:46:54 +00:00
Gabriel Aszalos
66ce8e383f bytes: removed unnecessary slicing on copy
Change-Id: Ia42e3479c852a88968947411de8b32e5bcda5ae3
Reviewed-on: https://go-review.googlesource.com/64371
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-18 16:48:34 +00:00
Andrzej Żeżel
de25b12d9f bytes: add example for Len function of Reader
Change-Id: If7ecdc57f190f647bfc673bde8e66b4ef12aa906
Reviewed-on: https://go-review.googlesource.com/64190
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-17 00:02:19 +00:00
Borja Clemente
394f6a5ac0 bytes: Add missing examples to functions
Fixes #21570

Change-Id: Ia0734929a04fbce8fdd5fbcb1b7baff9a8bbe39e
Reviewed-on: https://go-review.googlesource.com/58030
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-25 20:50:58 +00:00
Michael Brandenburg
6cbe5c8ac3 bytes: add examples for TrimLeft and TrimRight
Change-Id: Ib6d94f185dd43568cf97ef267dd51a09f43a402f
Reviewed-on: https://go-review.googlesource.com/51391
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-22 21:31:54 +00:00
Marvin Stenger
1ba4556a2c bytes: clean-up of buffer.go
Clean-up changes in no particular order:
- use uint8 instead of int for readOp
- remove duplicated code in ReadFrom()
- introduce (*Buffer).empty()
- remove naked returns

Change-Id: Ie6e673c20c398f980f8be0448969a36ad4778804
Reviewed-on: https://go-review.googlesource.com/42816
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-18 17:41:11 +00:00
Daniel Martí
59413d34c9 all: unindent some big chunks of code
Found with mvdan.cc/unindent. Prioritized the ones with the biggest wins
for now.

Change-Id: I2b032e45cdd559fc9ed5b1ee4c4de42c4c92e07b
Reviewed-on: https://go-review.googlesource.com/56470
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-18 06:59:48 +00:00
Bryan C. Mills
6a34ffa073 bytes: avoid overflow in (*Buffer).Grow and ReadFrom
fixes #21481

Change-Id: I26717876a1c0ee25a86c81159c6b3c59563dfec6
Reviewed-on: https://go-review.googlesource.com/56230
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-16 21:25:51 +00:00
Martin Möhrmann
7df29b50b2 bytes: speed up Fields and FieldsFunc
Applies the optimizations from golang.org/cl/42810 and golang.org/cl/37959
done to the strings package to the bytes package.

name                      old time/op    new time/op     delta
Fields/ASCII/16              417ns ± 4%      118ns ± 3%    -71.65%  (p=0.000 n=10+10)
Fields/ASCII/256            5.95µs ± 3%     0.88µs ± 0%    -85.23%  (p=0.000 n=10+7)
Fields/ASCII/4096           92.3µs ± 1%     12.8µs ± 2%    -86.13%  (p=0.000 n=10+10)
Fields/ASCII/65536          1.49ms ± 1%     0.25ms ± 1%    -83.14%  (p=0.000 n=10+10)
Fields/ASCII/1048576        25.0ms ± 1%      6.5ms ± 2%    -74.04%  (p=0.000 n=10+10)
Fields/Mixed/16              406ns ± 1%      222ns ± 1%    -45.24%  (p=0.000 n=10+9)
Fields/Mixed/256            5.78µs ± 1%     2.27µs ± 1%    -60.73%  (p=0.000 n=9+10)
Fields/Mixed/4096           97.9µs ± 1%     40.5µs ± 3%    -58.66%  (p=0.000 n=10+10)
Fields/Mixed/65536          1.58ms ± 1%     0.69ms ± 1%    -56.58%  (p=0.000 n=10+10)
Fields/Mixed/1048576        26.6ms ± 1%     12.6ms ± 2%    -52.44%  (p=0.000 n=9+10)
FieldsFunc/ASCII/16          395ns ± 1%      188ns ± 1%    -52.34%  (p=0.000 n=10+10)
FieldsFunc/ASCII/256        5.90µs ± 1%     2.00µs ± 1%    -66.06%  (p=0.000 n=10+10)
FieldsFunc/ASCII/4096       92.5µs ± 1%     33.0µs ± 1%    -64.34%  (p=0.000 n=10+9)
FieldsFunc/ASCII/65536      1.48ms ± 1%     0.54ms ± 1%    -63.38%  (p=0.000 n=10+9)
FieldsFunc/ASCII/1048576    25.1ms ± 1%     10.5ms ± 3%    -58.24%  (p=0.000 n=10+10)
FieldsFunc/Mixed/16          401ns ± 1%      205ns ± 2%    -48.87%  (p=0.000 n=10+10)
FieldsFunc/Mixed/256        5.70µs ± 1%     1.98µs ± 1%    -65.28%  (p=0.000 n=10+10)
FieldsFunc/Mixed/4096       97.5µs ± 1%     35.4µs ± 1%    -63.65%  (p=0.000 n=10+10)
FieldsFunc/Mixed/65536      1.57ms ± 1%     0.61ms ± 1%    -61.20%  (p=0.000 n=10+10)
FieldsFunc/Mixed/1048576    26.5ms ± 1%     11.4ms ± 2%    -56.84%  (p=0.000 n=10+10)

name                      old speed      new speed       delta
Fields/ASCII/16           38.4MB/s ± 4%  134.9MB/s ± 3%   +251.55%  (p=0.000 n=10+10)
Fields/ASCII/256          43.0MB/s ± 3%  290.6MB/s ± 1%   +575.97%  (p=0.000 n=10+8)
Fields/ASCII/4096         44.4MB/s ± 1%  320.0MB/s ± 2%   +620.90%  (p=0.000 n=10+10)
Fields/ASCII/65536        44.0MB/s ± 1%  260.7MB/s ± 1%   +493.15%  (p=0.000 n=10+10)
Fields/ASCII/1048576      42.0MB/s ± 1%  161.6MB/s ± 2%   +285.21%  (p=0.000 n=10+10)
Fields/Mixed/16           39.4MB/s ± 1%   71.7MB/s ± 1%    +82.20%  (p=0.000 n=10+10)
Fields/Mixed/256          44.3MB/s ± 1%  112.8MB/s ± 1%   +154.64%  (p=0.000 n=9+10)
Fields/Mixed/4096         41.9MB/s ± 1%  101.2MB/s ± 3%   +141.92%  (p=0.000 n=10+10)
Fields/Mixed/65536        41.5MB/s ± 1%   95.5MB/s ± 1%   +130.29%  (p=0.000 n=10+10)
Fields/Mixed/1048576      39.4MB/s ± 1%   82.9MB/s ± 2%   +110.28%  (p=0.000 n=9+10)
FieldsFunc/ASCII/16       40.5MB/s ± 1%   84.9MB/s ± 2%   +109.80%  (p=0.000 n=10+10)
FieldsFunc/ASCII/256      43.4MB/s ± 1%  127.9MB/s ± 1%   +194.58%  (p=0.000 n=10+10)
FieldsFunc/ASCII/4096     44.3MB/s ± 1%  124.2MB/s ± 1%   +180.44%  (p=0.000 n=10+9)
FieldsFunc/ASCII/65536    44.2MB/s ± 1%  120.6MB/s ± 1%   +173.06%  (p=0.000 n=10+9)
FieldsFunc/ASCII/1048576  41.8MB/s ± 1%  100.2MB/s ± 3%   +139.53%  (p=0.000 n=10+10)
FieldsFunc/Mixed/16       39.8MB/s ± 1%   77.8MB/s ± 2%    +95.46%  (p=0.000 n=10+10)
FieldsFunc/Mixed/256      44.9MB/s ± 1%  129.4MB/s ± 1%   +187.97%  (p=0.000 n=10+10)
FieldsFunc/Mixed/4096     42.0MB/s ± 1%  115.6MB/s ± 1%   +175.08%  (p=0.000 n=10+10)
FieldsFunc/Mixed/65536    41.6MB/s ± 1%  107.3MB/s ± 1%   +157.75%  (p=0.000 n=10+10)
FieldsFunc/Mixed/1048576  39.6MB/s ± 1%   91.8MB/s ± 2%   +131.72%  (p=0.000 n=10+10)

name                      old alloc/op   new alloc/op    delta
Fields/ASCII/16              80.0B ± 0%      80.0B ± 0%       ~     (all equal)
Fields/ASCII/256              768B ± 0%       768B ± 0%       ~     (all equal)
Fields/ASCII/4096           9.47kB ± 0%     9.47kB ± 0%       ~     (all equal)
Fields/ASCII/65536           147kB ± 0%      147kB ± 0%       ~     (all equal)
Fields/ASCII/1048576        2.27MB ± 0%     2.27MB ± 0%       ~     (all equal)
Fields/Mixed/16              96.0B ± 0%      96.0B ± 0%       ~     (all equal)
Fields/Mixed/256              768B ± 0%       768B ± 0%       ~     (all equal)
Fields/Mixed/4096           9.47kB ± 0%    24.83kB ± 0%   +162.16%  (p=0.000 n=10+10)
Fields/Mixed/65536           147kB ± 0%      497kB ± 0%   +237.24%  (p=0.000 n=10+10)
Fields/Mixed/1048576        2.26MB ± 0%     9.61MB ± 0%   +324.89%  (p=0.000 n=10+10)
FieldsFunc/ASCII/16          80.0B ± 0%      80.0B ± 0%       ~     (all equal)
FieldsFunc/ASCII/256          768B ± 0%       768B ± 0%       ~     (all equal)
FieldsFunc/ASCII/4096       9.47kB ± 0%    24.83kB ± 0%   +162.16%  (p=0.000 n=10+10)
FieldsFunc/ASCII/65536       147kB ± 0%      497kB ± 0%   +237.24%  (p=0.000 n=10+10)
FieldsFunc/ASCII/1048576    2.27MB ± 0%     9.61MB ± 0%   +323.72%  (p=0.000 n=10+10)
FieldsFunc/Mixed/16          96.0B ± 0%      96.0B ± 0%       ~     (all equal)
FieldsFunc/Mixed/256          768B ± 0%       768B ± 0%       ~     (all equal)
FieldsFunc/Mixed/4096       9.47kB ± 0%    24.83kB ± 0%   +162.16%  (p=0.000 n=10+10)
FieldsFunc/Mixed/65536       147kB ± 0%      497kB ± 0%   +237.24%  (p=0.000 n=10+10)
FieldsFunc/Mixed/1048576    2.26MB ± 0%     9.61MB ± 0%   +324.89%  (p=0.000 n=10+10)

name                      old allocs/op  new allocs/op   delta
Fields/ASCII/16               1.00 ± 0%       1.00 ± 0%       ~     (all equal)
Fields/ASCII/256              1.00 ± 0%       1.00 ± 0%       ~     (all equal)
Fields/ASCII/4096             1.00 ± 0%       1.00 ± 0%       ~     (all equal)
Fields/ASCII/65536            1.00 ± 0%       1.00 ± 0%       ~     (all equal)
Fields/ASCII/1048576          1.00 ± 0%       1.00 ± 0%       ~     (all equal)
Fields/Mixed/16               1.00 ± 0%       1.00 ± 0%       ~     (all equal)
Fields/Mixed/256              1.00 ± 0%       1.00 ± 0%       ~     (all equal)
Fields/Mixed/4096             1.00 ± 0%       5.00 ± 0%   +400.00%  (p=0.000 n=10+10)
Fields/Mixed/65536            1.00 ± 0%      12.00 ± 0%  +1100.00%  (p=0.000 n=10+10)
Fields/Mixed/1048576          1.00 ± 0%      24.00 ± 0%  +2300.00%  (p=0.000 n=10+10)
FieldsFunc/ASCII/16           1.00 ± 0%       1.00 ± 0%       ~     (all equal)
FieldsFunc/ASCII/256          1.00 ± 0%       1.00 ± 0%       ~     (all equal)
FieldsFunc/ASCII/4096         1.00 ± 0%       5.00 ± 0%   +400.00%  (p=0.000 n=10+10)
FieldsFunc/ASCII/65536        1.00 ± 0%      12.00 ± 0%  +1100.00%  (p=0.000 n=10+10)
FieldsFunc/ASCII/1048576      1.00 ± 0%      24.00 ± 0%  +2300.00%  (p=0.000 n=10+10)
FieldsFunc/Mixed/16           1.00 ± 0%       1.00 ± 0%       ~     (all equal)
FieldsFunc/Mixed/256          1.00 ± 0%       1.00 ± 0%       ~     (all equal)
FieldsFunc/Mixed/4096         1.00 ± 0%       5.00 ± 0%   +400.00%  (p=0.000 n=10+10)
FieldsFunc/Mixed/65536        1.00 ± 0%      12.00 ± 0%  +1100.00%  (p=0.000 n=10+10)
FieldsFunc/Mixed/1048576      1.00 ± 0%      24.00 ± 0%  +2300.00%  (p=0.000 n=10+10)

Change-Id: If1926782decc2f60d3b4b8c41c2ce7d8bdedfd8f
Reviewed-on: https://go-review.googlesource.com/55131
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2017-08-14 21:08:51 +00:00
Brian Downs
cd619caff4 bytes: add example for (*Buffer).Grow
Change-Id: I04849883dd2e1f6d083e9f57d2a8c1bd7d258953
Reviewed-on: https://go-review.googlesource.com/48878
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
2017-07-16 03:49:43 +00:00
Alberto Donizetti
29469d2406 bytes: note that NewBuffer take ownership of its argument
Fixes #19383

Change-Id: Ic84517053ced7794006f6fc65e6f249e97d6cf35
Reviewed-on: https://go-review.googlesource.com/44691
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-06-02 18:03:36 +00:00
Martin Möhrmann
69972aea74 internal/cpu: new package to detect cpu features
Implements detection of x86 cpu features that
are used in the go standard library.

Changes all standard library packages to use the new cpu package
instead of using runtime internal variables to check x86 cpu features.

Updates: #15403

Change-Id: I2999a10cb4d9ec4863ffbed72f4e021a1dbc4bb9
Reviewed-on: https://go-review.googlesource.com/41476
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-05-10 17:02:21 +00:00
Marvin Stenger
8c49c06b48 bytes: skip inline test by default
The test "TestTryGrowByResliceInlined" introduced in c08ac36 broke the
noopt builder as it fails when inlining is disabled.
Since there are currently no other options at hand for checking
inlined-ness other than looking at emited symbols of the compilation,
we for now skip the problem causing test by default and only run
it on one specific builder ("linux-amd64").
Also see CL 42813, which introduced the test and contains comments
suggesting this temporary solution.

Change-Id: I3978ab0831da04876cf873d78959f821c459282b
Reviewed-on: https://go-review.googlesource.com/42820
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-05-08 15:16:21 +00:00
Marvin Stenger
c08ac36761 bytes: optimize Buffer's Write, WriteString, WriteByte, and WriteRune
In the common case, the grow method only needs to reslice the internal
buffer. Making another function call to grow can be expensive when Write
is called very often with small pieces of data (like a byte or rune).
Thus, we add a tryGrowByReslice method that is inlineable so that we can
avoid an extra call in most cases.

name                       old time/op    new time/op    delta
WriteByte-4                  35.5µs ± 0%    17.4µs ± 1%   -51.03%  (p=0.000 n=19+20)
WriteRune-4                  55.7µs ± 1%    38.7µs ± 1%   -30.56%  (p=0.000 n=18+19)
BufferNotEmptyWriteRead-4     304µs ± 5%     283µs ± 3%    -6.86%  (p=0.000 n=19+17)
BufferFullSmallReads-4       87.0µs ± 5%    66.8µs ± 2%   -23.26%  (p=0.000 n=17+17)

name                       old speed      new speed      delta
WriteByte-4                 115MB/s ± 0%   235MB/s ± 1%  +104.19%  (p=0.000 n=19+20)
WriteRune-4                 221MB/s ± 1%   318MB/s ± 1%   +44.01%  (p=0.000 n=18+19)

Fixes #17857

Change-Id: I08dfb10a1c7e001817729dbfcc951bda12fe8814
Reviewed-on: https://go-review.googlesource.com/42813
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-05-07 17:44:46 +00:00
Robert Griesemer
86cfe93515 bytes: clarify documentation for UnreadByte/Rune
Fixes #19522.

Change-Id: Ib3cf0336e0bf91580d533704ec1a9d45eb0bf62d
Reviewed-on: https://go-review.googlesource.com/42020
Reviewed-by: Rob Pike <r@golang.org>
2017-04-28 16:37:13 +00:00
Josselin Costanzi
d206af1e6c strings: optimize Count for amd64
Move optimized Count implementation from bytes to runtime. Use in
both bytes and strings packages.
Add CountByte benchmark to strings.

Strings benchmarks:
name                       old time/op    new time/op    delta
CountHard1-4                 226µs ± 1%      226µs ± 2%      ~     (p=0.247 n=10+10)
CountHard2-4                 316µs ± 1%      315µs ± 0%      ~     (p=0.133 n=9+10)
CountHard3-4                 919µs ± 1%      920µs ± 1%      ~     (p=0.968 n=10+9)
CountTorture-4              15.4µs ± 1%     15.7µs ± 1%    +2.47%  (p=0.000 n=10+9)
CountTortureOverlapping-4   9.60ms ± 0%     9.65ms ± 1%      ~     (p=0.247 n=10+10)
CountByte/10-4              26.3ns ± 1%     10.9ns ± 1%   -58.71%  (p=0.000 n=9+9)
CountByte/32-4              42.7ns ± 0%     14.2ns ± 0%   -66.64%  (p=0.000 n=10+10)
CountByte/4096-4            3.07µs ± 0%     0.31µs ± 2%   -89.99%  (p=0.000 n=9+10)
CountByte/4194304-4         3.48ms ± 1%     0.34ms ± 1%   -90.09%  (p=0.000 n=10+9)
CountByte/67108864-4        55.6ms ± 1%      7.0ms ± 0%   -87.49%  (p=0.000 n=9+8)

name                      old speed      new speed       delta
CountByte/10-4             380MB/s ± 1%    919MB/s ± 1%  +142.21%  (p=0.000 n=9+9)
CountByte/32-4             750MB/s ± 0%   2247MB/s ± 0%  +199.62%  (p=0.000 n=10+10)
CountByte/4096-4          1.33GB/s ± 0%  13.32GB/s ± 2%  +898.13%  (p=0.000 n=9+10)
CountByte/4194304-4       1.21GB/s ± 1%  12.17GB/s ± 1%  +908.87%  (p=0.000 n=10+9)
CountByte/67108864-4      1.21GB/s ± 1%   9.65GB/s ± 0%  +699.29%  (p=0.000 n=9+8)

Fixes #19411

Change-Id: I8d2d409f0fa6df6d03b60790aa86e540b4a4e3b0
Reviewed-on: https://go-review.googlesource.com/38693
Reviewed-by: Keith Randall <khr@golang.org>
2017-04-07 14:25:13 +00:00
Eric Lagergren
59f6549d1c bytes, strings: declare variables inside loop they're used in
The recently updated Count functions declare variables before
special-cased returns.

Change-Id: I8f726118336b7b0ff72117d12adc48b6e37e60ea
Reviewed-on: https://go-review.googlesource.com/39357
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-03 23:30:36 +00:00
Eric Lagergren
094498c9a1 all: fix minor misspellings
Change-Id: I1f1cfb161640eb8756fb1a283892d06b30b7a8fa
Reviewed-on: https://go-review.googlesource.com/39356
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-03 23:19:07 +00:00
Josselin Costanzi
0d3cd51c9c bytes: fix typo in comment
Change-Id: Ia739337dc9961422982912cc6a669022559fb991
Reviewed-on: https://go-review.googlesource.com/38365
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-22 19:41:54 +00:00
Josselin Costanzi
01cd22c687 bytes: add optimized countByte for amd64
Use SSE/AVX2 when counting a single byte.
Inspired from runtime indexbyte implementation.

Benchmark against previous implementation, where
1 byte in every 8 is the one we are looking for:

* On a machine without AVX2
name               old time/op   new time/op     delta
CountSingle/10-4    61.8ns ±10%     15.6ns ±11%    -74.83%  (p=0.000 n=10+10)
CountSingle/32-4     100ns ± 4%       17ns ±10%    -82.54%  (p=0.000 n=10+9)
CountSingle/4K-4    9.66µs ± 3%     0.37µs ± 6%    -96.21%  (p=0.000 n=10+10)
CountSingle/4M-4    11.0ms ± 6%      0.4ms ± 4%    -96.04%  (p=0.000 n=10+10)
CountSingle/64M-4    194ms ± 8%        8ms ± 2%    -95.64%  (p=0.000 n=10+10)

name               old speed     new speed       delta
CountSingle/10-4   162MB/s ±10%    645MB/s ±10%   +297.00%  (p=0.000 n=10+10)
CountSingle/32-4   321MB/s ± 5%   1844MB/s ± 9%   +474.79%  (p=0.000 n=10+9)
CountSingle/4K-4   424MB/s ± 3%  11169MB/s ± 6%  +2533.10%  (p=0.000 n=10+10)
CountSingle/4M-4   381MB/s ± 7%   9609MB/s ± 4%  +2421.88%  (p=0.000 n=10+10)
CountSingle/64M-4  346MB/s ± 7%   7924MB/s ± 2%  +2188.78%  (p=0.000 n=10+10)

* On a machine with AVX2
name               old time/op   new time/op     delta
CountSingle/10-8    37.1ns ± 3%      8.2ns ± 1%    -77.80%  (p=0.000 n=10+10)
CountSingle/32-8    66.1ns ± 3%      9.8ns ± 2%    -85.23%  (p=0.000 n=10+10)
CountSingle/4K-8    7.36µs ± 3%     0.11µs ± 1%    -98.54%  (p=0.000 n=10+10)
CountSingle/4M-8    7.46ms ± 2%     0.15ms ± 2%    -97.95%  (p=0.000 n=10+9)
CountSingle/64M-8    124ms ± 2%        6ms ± 4%    -95.09%  (p=0.000 n=10+10)

name               old speed     new speed       delta
CountSingle/10-8   269MB/s ± 3%   1213MB/s ± 1%   +350.32%  (p=0.000 n=10+10)
CountSingle/32-8   484MB/s ± 4%   3277MB/s ± 2%   +576.66%  (p=0.000 n=10+10)
CountSingle/4K-8   556MB/s ± 3%  37933MB/s ± 1%  +6718.36%  (p=0.000 n=10+10)
CountSingle/4M-8   562MB/s ± 2%  27444MB/s ± 3%  +4783.43%  (p=0.000 n=10+9)
CountSingle/64M-8  543MB/s ± 2%  11054MB/s ± 3%  +1935.81%  (p=0.000 n=10+10)

Fixes #19411

Change-Id: Ieaf20b1fabccabe767c55c66e242e86f3617f883
Reviewed-on: https://go-review.googlesource.com/38258
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-21 20:25:17 +00:00
Carlo Alberto Ferraris
55310403dd bytes: make bytes.Buffer cache-friendly
During benchmark of an internal tool we found out that (*Buffer).Reset() was
surprisingly showing up in CPU profiles.

This CL contains two related changes aimed at speeding up Reset():
1. Create a fast path for Truncate(0) by moving the logic to Reset()
   (this makes Reset() a simple leaf func that gets inlined since it
   gets compiled to 3 MOVx instructions). Accordingly change calls in
   the rest of the Buffer methods to call Reset() instead of Truncate(0).
2. Reorder the fields in the Buffer struct so that frequently accessed
   fields are packed together (buf, off, lastRead). This also make them
   likely to be in the same cacheline.

Ideally it would be advisable to have Buffer{} cacheline-aligned, but I
couldn't find a way to do this without changing the size of the bootstrap
array (but this will cause some regressions, because it will make duffcopy
show up in CPU profiles where it wasn't showing up before).

go1 benchmarks are not really affected, but some other benchmarks that
exercise Buffer more show improvements:

name                     old time/op    new time/op    delta
BinaryTree17-4              2.46s ± 9%     2.43s ± 3%    ~     (p=0.982 n=14+14)
Fannkuch11-4                2.98s ± 1%     2.90s ± 1%  -2.58%  (p=0.000 n=15+14)
FmtFprintfEmpty-4          45.2ns ± 1%    45.2ns ± 1%    ~     (p=0.494 n=14+15)
FmtFprintfString-4         76.8ns ± 1%    83.1ns ± 2%  +8.23%  (p=0.000 n=10+15)
FmtFprintfInt-4            78.0ns ± 2%    74.6ns ± 1%  -4.46%  (p=0.000 n=15+15)
FmtFprintfIntInt-4          113ns ± 1%     109ns ± 2%  -2.91%  (p=0.000 n=13+15)
FmtFprintfPrefixedInt-4     152ns ± 2%     143ns ± 2%  -6.04%  (p=0.000 n=15+14)
FmtFprintfFloat-4           224ns ± 1%     222ns ± 2%  -1.08%  (p=0.001 n=15+14)
FmtManyArgs-4               464ns ± 2%     463ns ± 2%    ~     (p=0.303 n=14+15)
GobDecode-4                6.25ms ± 2%    6.32ms ± 3%  +1.20%  (p=0.002 n=14+14)
GobEncode-4                5.41ms ± 2%    5.41ms ± 2%    ~     (p=0.967 n=15+15)
Gzip-4                      215ms ± 2%     218ms ± 2%  +1.35%  (p=0.002 n=15+15)
Gunzip-4                   34.3ms ± 2%    34.2ms ± 2%    ~     (p=0.539 n=15+15)
HTTPClientServer-4         76.4µs ± 2%    75.4µs ± 1%  -1.31%  (p=0.000 n=15+15)
JSONEncode-4               14.7ms ± 2%    14.6ms ± 3%    ~     (p=0.094 n=14+14)
JSONDecode-4               48.0ms ± 1%    48.5ms ± 1%  +0.92%  (p=0.001 n=14+12)
Mandelbrot200-4            4.04ms ± 2%    4.06ms ± 1%    ~     (p=0.108 n=15+13)
GoParse-4                  2.99ms ± 2%    3.00ms ± 1%    ~     (p=0.130 n=15+13)
RegexpMatchEasy0_32-4      78.3ns ± 1%    79.5ns ± 1%  +1.51%  (p=0.000 n=15+14)
RegexpMatchEasy0_1K-4       185ns ± 1%     186ns ± 1%  +0.76%  (p=0.005 n=15+15)
RegexpMatchEasy1_32-4      79.0ns ± 2%    76.7ns ± 1%  -2.87%  (p=0.000 n=14+15)

name                     old speed      new speed      delta
GobDecode-4               123MB/s ± 2%   121MB/s ± 3%  -1.18%  (p=0.002 n=14+14)
GobEncode-4               142MB/s ± 2%   142MB/s ± 1%    ~     (p=0.959 n=15+15)
Gzip-4                   90.3MB/s ± 2%  89.1MB/s ± 2%  -1.34%  (p=0.002 n=15+15)
Gunzip-4                  565MB/s ± 2%   567MB/s ± 2%    ~     (p=0.539 n=15+15)
JSONEncode-4              132MB/s ± 2%   133MB/s ± 3%    ~     (p=0.091 n=14+14)
JSONDecode-4             40.4MB/s ± 1%  40.0MB/s ± 1%  -0.92%  (p=0.001 n=14+12)
GoParse-4                19.4MB/s ± 2%  19.3MB/s ± 1%    ~     (p=0.121 n=15+13)
RegexpMatchEasy0_32-4     409MB/s ± 1%   403MB/s ± 1%  -1.47%  (p=0.000 n=15+14)
RegexpMatchEasy0_1K-4    5.53GB/s ± 1%  5.49GB/s ± 1%  -0.86%  (p=0.002 n=15+15)
RegexpMatchEasy1_32-4     405MB/s ± 2%   417MB/s ± 1%  +2.94%  (p=0.000 n=14+15)

name                old time/op  new time/op  delta
PoolsSingle1K-4     34.9ns ± 2%  30.4ns ± 4%  -12.80%  (p=0.000 n=15+15)
PoolsSingle64K-4    36.9ns ± 1%  34.4ns ± 4%   -6.72%  (p=0.000 n=14+15)
PoolsRandomSmall-4  34.8ns ± 3%  29.5ns ± 1%  -15.19%  (p=0.000 n=15+14)
PoolsRandomLarge-4  38.6ns ± 1%  34.3ns ± 3%  -11.17%  (p=0.000 n=14+15)
PoolSingle1K-4      26.1ns ± 1%  21.2ns ± 2%  -18.59%  (p=0.000 n=15+14)
PoolSingle64K-4     26.7ns ± 2%  21.5ns ± 2%  -19.72%  (p=0.000 n=15+15)
MakeSingle1K-4      24.2ns ± 2%  24.3ns ± 3%     ~     (p=0.132 n=13+15)
MakeSingle64K-4     6.76µs ± 1%  6.96µs ± 5%   +2.94%  (p=0.002 n=13+13)
MakeRandomSmall-4    531ns ± 4%   538ns ± 5%     ~     (p=0.066 n=14+15)
MakeRandomLarge-4    152µs ± 0%   152µs ± 1%   -0.31%  (p=0.001 n=14+13)

Change-Id: I86d7d9d2cac65335baf62214fbb35ba0fd8f9528
Reviewed-on: https://go-review.googlesource.com/37416
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-02-28 05:19:38 +00:00
Aliaksandr Valialkin
8946502776 bytes, strings: optimize Split*
The relevant benchmark results on linux/amd64:

bytes:

SplitSingleByteSeparator-4   25.7ms ± 5%   9.1ms ± 4%  -64.40%  (p=0.000 n=10+10)
SplitMultiByteSeparator-4    13.8ms ±20%   4.3ms ± 8%  -69.23%  (p=0.000 n=10+10)
SplitNSingleByteSeparator-4  1.88µs ± 9%  0.88µs ± 4%  -53.25%  (p=0.000 n=10+10)
SplitNMultiByteSeparator-4   4.83µs ±10%  1.32µs ± 9%  -72.65%  (p=0.000 n=10+10)

strings:

name                         old time/op  new time/op  delta
SplitSingleByteSeparator-4   21.4ms ± 8%   8.5ms ± 5%  -60.19%  (p=0.000 n=10+10)
SplitMultiByteSeparator-4    13.2ms ± 9%   3.9ms ± 4%  -70.29%  (p=0.000 n=10+10)
SplitNSingleByteSeparator-4  1.54µs ± 5%  0.75µs ± 7%  -51.21%  (p=0.000 n=10+10)
SplitNMultiByteSeparator-4   3.57µs ± 8%  1.01µs ±11%  -71.76%  (p=0.000 n=10+10)

Fixes #18973

Change-Id: Ie4bc010c6cc389001e72eab530497c81e5b26f34
Reviewed-on: https://go-review.googlesource.com/36510
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-08 18:39:43 +00:00
Ilya Tocar
438818d9f1 bytes: use Index in Count
Similar to https://go-review.googlesource.com/28586,
but for package bytes instead of strings.
This provides simpler code and some performance gain.
Also update strings.Count to use the same code.

On AMD64 with heavily optimized Index I see:

name             old time/op    new time/op     delta
Count/10-6         47.3ns ± 0%     36.8ns ± 0%    -22.35%  (p=0.000 n=10+10)
Count/32-6          286ns ± 0%       38ns ± 0%    -86.71%  (p=0.000 n=10+10)
Count/4K-6         50.1µs ± 0%      4.4µs ± 0%    -91.18%  (p=0.000 n=10+10)
Count/4M-6         48.1ms ± 1%      4.5ms ± 0%    -90.56%  (p=0.000 n=10+9)
Count/64M-6         784ms ± 0%       73ms ± 0%    -90.73%  (p=0.000 n=10+10)
CountEasy/10-6     28.4ns ± 0%     31.0ns ± 0%     +9.23%  (p=0.000 n=10+10)
CountEasy/32-6     30.6ns ± 0%     37.0ns ± 0%    +20.92%  (p=0.000 n=10+10)
CountEasy/4K-6      186ns ± 0%      198ns ± 0%     +6.45%  (p=0.000 n=9+10)
CountEasy/4M-6      233µs ± 2%      234µs ± 2%       ~     (p=0.912 n=10+10)
CountEasy/64M-6    6.70ms ± 0%     6.68ms ± 1%       ~     (p=0.762 n=8+10)

name             old speed      new speed       delta
Count/10-6        211MB/s ± 0%    272MB/s ± 0%    +28.77%  (p=0.000 n=10+9)
Count/32-6        112MB/s ± 0%    842MB/s ± 0%   +652.84%  (p=0.000 n=10+10)
Count/4K-6       81.8MB/s ± 0%  927.6MB/s ± 0%  +1033.63%  (p=0.000 n=10+9)
Count/4M-6       87.2MB/s ± 1%  924.0MB/s ± 0%   +959.25%  (p=0.000 n=10+9)
Count/64M-6      85.6MB/s ± 0%  922.9MB/s ± 0%   +978.31%  (p=0.000 n=10+10)
CountEasy/10-6    352MB/s ± 0%    322MB/s ± 0%     -8.41%  (p=0.000 n=10+10)
CountEasy/32-6   1.05GB/s ± 0%   0.87GB/s ± 0%    -17.35%  (p=0.000 n=9+10)
CountEasy/4K-6   22.0GB/s ± 0%   20.6GB/s ± 0%     -6.33%  (p=0.000 n=10+10)
CountEasy/4M-6   18.0GB/s ± 2%   18.0GB/s ± 2%       ~     (p=0.912 n=10+10)
CountEasy/64M-6  10.0GB/s ± 0%   10.0GB/s ± 1%       ~     (p=0.762 n=8+10)

On 386, without asm version of Index:

Count/10-6         57.0ns ± 0%     56.9ns ± 0%   -0.11%  (p=0.006 n=10+9)
Count/32-6          340ns ± 0%      274ns ± 0%  -19.48%  (p=0.000 n=10+9)
Count/4K-6         49.5µs ± 0%     37.1µs ± 0%  -24.96%  (p=0.000 n=10+10)
Count/4M-6         51.1ms ± 0%     38.2ms ± 0%  -25.21%  (p=0.000 n=10+10)
Count/64M-6         818ms ± 0%      613ms ± 0%  -25.07%  (p=0.000 n=8+10)
CountEasy/10-6     60.0ns ± 0%     70.4ns ± 0%  +17.34%  (p=0.000 n=10+10)
CountEasy/32-6     81.1ns ± 0%     94.0ns ± 0%  +15.97%  (p=0.000 n=9+10)
CountEasy/4K-6     4.37µs ± 0%     4.39µs ± 0%   +0.30%  (p=0.000 n=10+9)
CountEasy/4M-6     4.43ms ± 0%     4.43ms ± 0%     ~     (p=0.579 n=10+10)
CountEasy/64M-6    70.9ms ± 0%     70.9ms ± 0%     ~     (p=0.912 n=10+10)

name             old speed      new speed       delta
Count/10-6        176MB/s ± 0%    176MB/s ± 0%   +0.10%  (p=0.000 n=10+9)
Count/32-6       93.9MB/s ± 0%  116.5MB/s ± 0%  +24.06%  (p=0.000 n=10+9)
Count/4K-6       82.7MB/s ± 0%  110.3MB/s ± 0%  +33.26%  (p=0.000 n=10+10)
Count/4M-6       82.1MB/s ± 0%  109.7MB/s ± 0%  +33.70%  (p=0.000 n=10+10)
Count/64M-6      82.0MB/s ± 0%  109.5MB/s ± 0%  +33.46%  (p=0.000 n=8+10)
CountEasy/10-6    167MB/s ± 0%    142MB/s ± 0%  -14.75%  (p=0.000 n=9+10)
CountEasy/32-6    395MB/s ± 0%    340MB/s ± 0%  -13.77%  (p=0.000 n=10+10)
CountEasy/4K-6    936MB/s ± 0%    934MB/s ± 0%   -0.29%  (p=0.000 n=10+9)
CountEasy/4M-6    947MB/s ± 0%    946MB/s ± 0%     ~     (p=0.591 n=10+10)
CountEasy/64M-6   947MB/s ± 0%    947MB/s ± 0%     ~     (p=0.867 n=10+10)

Change-Id: Ia76b247372b6f5b5d23a9f10253a86536a5153b3
Reviewed-on: https://go-review.googlesource.com/36489
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-08 17:52:30 +00:00