qbit/go

mirror of https://github.com/golang/go synced 2024-11-19 23:14:47 -07:00

Author	SHA1	Message	Date
Carlos Eduardo Seo	9459c03b29	math/big: improve performance for addVV/subVV for ppc64x This change adds a better asm implementation of addVV for ppc64x, with speedups up to nearly 3x in the best cases. benchmark old ns/op new ns/op delta BenchmarkAddVV/1-8 7.33 5.81 -20.74% BenchmarkAddVV/2-8 8.72 6.49 -25.57% BenchmarkAddVV/3-8 10.5 7.08 -32.57% BenchmarkAddVV/4-8 12.7 7.57 -40.39% BenchmarkAddVV/5-8 14.3 8.06 -43.64% BenchmarkAddVV/10-8 27.6 11.1 -59.78% BenchmarkAddVV/100-8 218 82.4 -62.20% BenchmarkAddVV/1000-8 2064 718 -65.21% BenchmarkAddVV/10000-8 20536 7153 -65.17% BenchmarkAddVV/100000-8 211004 72403 -65.69% benchmark old MB/s new MB/s speedup BenchmarkAddVV/1-8 8729.74 11006.26 1.26x BenchmarkAddVV/2-8 14683.65 19707.55 1.34x BenchmarkAddVV/3-8 18226.96 27103.63 1.49x BenchmarkAddVV/4-8 20204.50 33805.81 1.67x BenchmarkAddVV/5-8 22348.64 39694.06 1.78x BenchmarkAddVV/10-8 23212.74 57631.08 2.48x BenchmarkAddVV/100-8 29300.07 77629.53 2.65x BenchmarkAddVV/1000-8 31000.56 89094.54 2.87x BenchmarkAddVV/10000-8 31163.61 89469.16 2.87x BenchmarkAddVV/100000-8 30331.16 88393.73 2.91x It also adds the use of CTR for the loop counter in subVV, instead of manually updating the loop counter. This is slightly faster. Change-Id: Ic4b05cad384fd057972d46a5618ed5c3039d7460 Reviewed-on: https://go-review.googlesource.com/41010 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>	2017-04-25 13:15:39 +00:00
Ilya Tocar	bc6459ac6c	math: remove asm version of sincos everywhere, except 386 We have dedicated asm implementation of sincos only on 386 and amd64, on everything else we are just jumping to generic version. However amd64 version is actually slower than generic one: Sincos-6 34.4ns ± 0% 24.8ns ± 0% -27.79% (p=0.000 n=8+10) So remove all sincos*.s and keep only generic and 386. Updates #19819 Change-Id: I7eefab35743729578264f52f6d23ee2c227c92a5 Reviewed-on: https://go-review.googlesource.com/41200 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-04-24 15:09:18 +00:00
Michael Munday	eed6938cbb	cmd/asm, cmd/internal/obj/s390x, math: add LGDR and LDGR instructions The instructions allow moves between floating point and general purpose registers without any conversion taking place. Change-Id: I82c6f3ad9c841a83783b5be80dcf5cd538ff49e6 Reviewed-on: https://go-review.googlesource.com/38777 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-04-17 16:33:51 +00:00
Robert Griesemer	9d01def597	math/bits: support negative rotation count and remove RotateRight For details see the discussion on the issue below. RotateLeft functions can now be inlined because they don't panic anymore for negative rotation counts. name old time/op new time/op delta RotateLeft-8 6.72ns ± 2% 1.86ns ± 0% -72.33% (p=0.016 n=5+4) RotateLeft8-8 4.41ns ± 2% 1.67ns ± 1% -62.15% (p=0.008 n=5+5) RotateLeft16-8 4.46ns ± 6% 1.65ns ± 0% -63.06% (p=0.008 n=5+5) RotateLeft32-8 4.50ns ± 5% 1.67ns ± 1% -62.86% (p=0.008 n=5+5) RotateLeft64-8 4.54ns ± 1% 1.85ns ± 1% -59.32% (p=0.008 n=5+5) https://perf.golang.org/search?q=upload:20170411.4 (Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3.) For #18616. Change-Id: I0828d80d54ec24f8d44954a57b3d6aeedb69c686 Reviewed-on: https://go-review.googlesource.com/40394 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-11 23:57:24 +00:00
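The rotation change above can be illustrated with a short sketch. It assumes the `math/bits` API as it shipped, where `bits.RotateLeft8` accepts a negative count; `rotateRight8` is a hypothetical helper name, not part of the package.

```go
package main

import (
	"fmt"
	"math/bits"
)

// rotateRight8 is a hypothetical helper: it rotates x right by k bits by
// passing a negated count to bits.RotateLeft8, which stands in for the
// removed RotateRight8.
func rotateRight8(x uint8, k int) uint8 {
	return bits.RotateLeft8(x, -k)
}

func main() {
	x := uint8(0x81) // bit pattern 10000001
	fmt.Printf("%08b\n", bits.RotateLeft8(x, 1)) // 00000011
	fmt.Printf("%08b\n", rotateRight8(x, 1))     // 11000000
}
```

Because a negative count is simply reduced modulo the bit size, the functions need no panic path, which is what the commit credits for making them inlinable.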
Eric Lagergren	094498c9a1	all: fix minor misspellings Change-Id: I1f1cfb161640eb8756fb1a283892d06b30b7a8fa Reviewed-on: https://go-review.googlesource.com/39356 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-04-03 23:19:07 +00:00
Carlos Eduardo Seo	4a1140472b	math/big: Unify divWW implementation for ppc64 and ppc64le. Starting in go1.9, the minimum processor requirement for ppc64 is POWER8. So it may now use the same divWW implementation as ppc64le. Updates #19074 Change-Id: If1a85f175cda89eee06a1024ccd468da6124c844 Reviewed-on: https://go-review.googlesource.com/39010 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>	2017-03-31 14:05:12 +00:00
Ilya Tocar	4f579cc65b	math: speed up Log on amd64 After https://golang.org/cl/31490 we break false output dependency for CVTS.. in compiler generated code. I've looked through asm code, which uses CVTS.. and added XOR to the only case where it affected performance. Log-6 21.6ns ± 0% 19.9ns ± 0% -7.87% (p=0.000 n=10+10) Change-Id: I25d9b405e3041a3839b40f9f9a52e708034bb347 Reviewed-on: https://go-review.googlesource.com/38771 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-29 20:36:29 +00:00
Robert Griesemer	70ea0ec30f	math/big: replace local versions of bitLen, nlz with math/bits versions Verified that BenchmarkBitLen time went down from 2.25 ns/op to 0.65 ns/op on a 2.3 GHz Intel Core i7, before removing that benchmark (now covered by math/bits benchmarks). Change-Id: I3890bb7d1889e95b9a94bd68f0bdf06f1885adeb Reviewed-on: https://go-review.googlesource.com/38464 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-03-23 19:43:09 +00:00
Robert Griesemer	9ecfd177cf	math/big: fix TestFloatSetFloat64String A -0 constant is the same as 0. Use explicit negative zero for float64 -0.0. Also, fix two test cases that were wrong. Fixes #19673. Change-Id: Ic09775f29d9bc2ee7814172e59c4a693441ea730 Reviewed-on: https://go-review.googlesource.com/38463 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-03-23 17:17:16 +00:00
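The point about `-0` in the commit above is easy to miss, so here is a minimal sketch of it. The helper names `negZero` and `isNegZero` are illustrative assumptions; the actual test may construct negative zero differently.

```go
package main

import (
	"fmt"
	"math"
)

// negZero returns float64 negative zero. The constant expression -0.0 is
// an exact zero without a sign, so the sign bit must be set at run time.
func negZero() float64 {
	return math.Copysign(0, -1)
}

// isNegZero reports whether x is zero with the sign bit set.
func isNegZero(x float64) bool {
	return x == 0 && math.Signbit(x)
}

func main() {
	constant := -0.0 // constant arithmetic: identical to 0
	fmt.Println(isNegZero(constant))  // false
	fmt.Println(isNegZero(negZero())) // true
}
```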
Josh Bleecher Snyder	2de773d45f	math/big: make nat.setUint64 vet-friendly nat.setUint64 is nicely generic. By assuming 32- or 64-bit words, however, we can write simpler code, and eliminate some shifts in dead code that vet complains about. Generated code for 64 bit systems is unaltered. Generated code for 32 bit systems is much better. For 386, the routine length drops from 325 bytes of code to 271 bytes of code, with fewer loops. Change-Id: I1bc14c06272dee37a7fcb48d33dd1e621eba945d Reviewed-on: https://go-review.googlesource.com/38070 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2017-03-11 00:39:23 +00:00
Eitan Adler	789c5255a4	all: remove the the duplicate words Change-Id: I6343c162e27e2e492547c96f1fc504909b1c03c0 Reviewed-on: https://go-review.googlesource.com/37793 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-03-06 04:39:12 +00:00
Robert Griesemer	32b41c8dc7	math/bits: move left-over functionality from bits_impl.go to bits.go Removes an extra function call for TrailingZeroes and thus may increase chances for inlining. Change-Id: Iefd8d4402dc89b64baf4e5c865eb3dadade623af Reviewed-on: https://go-review.googlesource.com/37613 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-28 23:50:47 +00:00
Robert Griesemer	83bc4a2fee	math/bits: faster LeadingZeros and Len functions benchmark old ns/op new ns/op delta BenchmarkLeadingZeros-8 8.43 3.10 -63.23% BenchmarkLeadingZeros8-8 8.13 1.33 -83.64% BenchmarkLeadingZeros16-8 7.34 2.07 -71.80% BenchmarkLeadingZeros32-8 7.99 2.87 -64.08% BenchmarkLeadingZeros64-8 8.13 2.96 -63.59% Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3. Change-Id: Id343531b408d42ac45f10c76f60e85bdb977f91e Reviewed-on: https://go-review.googlesource.com/37582 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-28 20:55:13 +00:00
Robert Griesemer	9515cb511a	math/bits: faster TrailingZeroes8 For sizes > 8, the existing code is faster. benchmark old ns/op new ns/op delta BenchmarkTrailingZeros8-8 1.95 1.29 -33.85% Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3. Change-Id: I6f3a33ec633a2c544ec29693c141f2f99335c745 Reviewed-on: https://go-review.googlesource.com/37581 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-28 20:55:01 +00:00
Robert Griesemer	d7a659b11b	math/bits: faster OnesCount using table lookups for sizes 8,16,32 For uint64, the existing algorithm is faster. benchmark old ns/op new ns/op delta BenchmarkOnesCount8-8 1.95 0.97 -50.26% BenchmarkOnesCount16-8 2.54 1.39 -45.28% BenchmarkOnesCount32-8 2.61 1.96 -24.90% Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3. Change-Id: I6cc42882fef3d24694720464039161e339a9ae99 Reviewed-on: https://go-review.googlesource.com/37580 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-28 20:54:49 +00:00
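A minimal sketch of the table-lookup technique named in the commit above, assuming a 256-entry byte table. The table name `pop8`, its run-time construction in `init`, and the function `onesCount16` are illustrative and not taken from the package source.

```go
package main

import "fmt"

// pop8 holds the population count of every byte value (hypothetical name).
var pop8 [256]uint8

func init() {
	// The count for i is the count for i with its low bit dropped,
	// plus that low bit.
	for i := 1; i < 256; i++ {
		pop8[i] = pop8[i>>1] + uint8(i&1)
	}
}

// onesCount16 counts set bits with two table lookups, one per byte.
func onesCount16(x uint16) int {
	return int(pop8[x>>8]) + int(pop8[x&0xff])
}

func main() {
	fmt.Println(onesCount16(0xF00F)) // 8
}
```

Each extra byte costs another lookup and add, which is consistent with the commit's note that the existing arithmetic algorithm remains faster for uint64.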
Robert Griesemer	e18adbf88d	math/bits: faster Reverse8/16 functions using table lookups Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3: benchmark old ns/op new ns/op delta BenchmarkReverse8-8 1.70 0.99 -41.76% BenchmarkReverse16-8 2.24 1.32 -41.07% Fixes #19279. Change-Id: I398cf8a3513b7fa63c130efc7846a7c5353999d4 Reviewed-on: https://go-review.googlesource.com/37459 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-25 22:18:58 +00:00
Robert Griesemer	ac91a514ff	math/bits: fix incorrect doc strings for TrailingZeros functions Change-Id: I3e40018ab1903d3b9ada7ad7812ba71ea2a428e7 Reviewed-on: https://go-review.googlesource.com/37456 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-25 00:58:25 +00:00
Robert Griesemer	322fff8ac8	math/big: use math/bits where appropriate This change adds math/bits as a new dependency of math/big. - use bits.LeadingZeroes instead of local implementation (they are identical, so there's no performance loss here) - leave other functionality local (ntz, bitLen) since there's faster implementations in math/big at the moment Change-Id: I1218aa8a1df0cc9783583b090a4bb5a8a145c4a2 Reviewed-on: https://go-review.googlesource.com/37141 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-24 19:19:02 +00:00
Martin Möhrmann	8c6643846e	math: speed up and improve accuracy of Pow10 Removes init function from the math package. Allows stripping of arrays with pre-computed values used for Pow10 from binaries if Pow10 is not used. cmd/go shrinks by 128 bytes. Fixed small values like 10**-323 being 0 instead of 1e-323. Overall precision is increased but still not as good as predefined constants for some inputs. Samples: Pow10(208) before: 1.0000000000000006662e+208 after: 1.0000000000000000959e+208 Pow10(202) before 1.0000000000000009895e+202 after 1.0000000000000001193e+202 Pow10(60) before 1.0000000000000001278e+60 after 0.9999999999999999494e+60 Pow10(-100) before 0.99999999999999938551e-100 after 0.99999999999999989309e-100 Pow10(-200) before 0.9999999999999988218e-200 after 1.0000000000000001271e-200 name old time/op new time/op delta Pow10Pos-4 44.6ns ± 2% 1.2ns ± 1% -97.39% (p=0.000 n=19+17) Pow10Neg-4 50.8ns ± 1% 4.1ns ± 2% -92.02% (p=0.000 n=17+19) Change-Id: If094034286b8ac64be3a95fd9e8ffa3d4ad39b31 Reviewed-on: https://go-review.googlesource.com/36331 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-22 19:17:04 +00:00
Alexander Döring	ffb3b3698c	math: add more tests for special cases of Bessel functions Y0, Y1, Yn Test finite negative x with Y0(-1), Y1(-1), Yn(2,-1), Yn(-3,-1). Also test the special case Yn(0,0). Fixes #19130. Change-Id: I95f05a72e1c455ed8ddf202c56f4266f03f370fd Reviewed-on: https://go-review.googlesource.com/37310 Reviewed-by: Robert Griesemer <gri@golang.org>	2017-02-22 17:52:15 +00:00
Robert Griesemer	174058038c	math/big: define Word as uint instead of uintptr For compatibility with math/bits uint operations. When math/big was written originally, the Go compiler used 32bit int/uint values even on a 64bit machine. uintptr was the type that represented the machine register size. Now, the int/uint types are sized to the native machine register size, so they are the natural machine Word type. On most machines, the size of int/uint correspond to the size of uintptr. On platforms where uint and uintptr have different sizes, this change may lead to performance differences (e.g., amd64p32). Change-Id: Ief249c160b707b6441848f20041e32e9e9d8d8ca Reviewed-on: https://go-review.googlesource.com/37372 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-21 19:31:40 +00:00
Robert Griesemer	177dfba112	math/bits: faster OnesCount Using some additional suggestions per "Hacker's Delight". Added documentation and extra tests. Measured on 1.7 GHz Intel Core i7, running macOS 10.12.3. benchmark old ns/op new ns/op delta BenchmarkOnesCount-4 7.34 5.38 -26.70% BenchmarkOnesCount8-4 2.03 1.98 -2.46% BenchmarkOnesCount16-4 2.56 2.50 -2.34% BenchmarkOnesCount32-4 2.98 2.39 -19.80% BenchmarkOnesCount64-4 4.22 2.96 -29.86% Change-Id: I566b0ef766e55cf5776b1662b6016024ebe5d878 Reviewed-on: https://go-review.googlesource.com/37223 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-19 18:50:48 +00:00
Martin Möhrmann	6cfc3b25e9	math: protect benchmarked functions from being optimized away Add exported global variables and store the results of benchmarked functions in them. This prevents the current compiler optimizations from removing the instructions that are needed to compute the return values of the benchmarked functions. Change-Id: If8b08424e85f3796bb6dd73e761c653abbabcc5e Reviewed-on: https://go-review.googlesource.com/37195 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-18 17:00:59 +00:00
Robert Griesemer	a4a3d63dbe	math/bits: added benchmarks for Leading/TrailingZeros BenchmarkLeadingZeros-8 200000000 8.80 ns/op BenchmarkLeadingZeros8-8 200000000 8.21 ns/op BenchmarkLeadingZeros16-8 200000000 7.49 ns/op BenchmarkLeadingZeros32-8 200000000 7.80 ns/op BenchmarkLeadingZeros64-8 200000000 8.67 ns/op BenchmarkTrailingZeros-8 1000000000 2.05 ns/op BenchmarkTrailingZeros8-8 2000000000 1.94 ns/op BenchmarkTrailingZeros16-8 2000000000 1.94 ns/op BenchmarkTrailingZeros32-8 2000000000 1.92 ns/op BenchmarkTrailingZeros64-8 2000000000 2.03 ns/op Change-Id: I45497bf2d6369ba6cfc88ded05aa735908af8908 Reviewed-on: https://go-review.googlesource.com/37220 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-17 23:41:16 +00:00
Robert Griesemer	19028bdd18	math/bits: faster Rotate functions, added respective benchmarks Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3. benchmark old ns/op new ns/op delta BenchmarkRotateLeft-8 7.87 7.00 -11.05% BenchmarkRotateLeft8-8 8.41 4.52 -46.25% BenchmarkRotateLeft16-8 8.07 4.55 -43.62% BenchmarkRotateLeft32-8 8.36 4.73 -43.42% BenchmarkRotateLeft64-8 7.93 4.78 -39.72% BenchmarkRotateRight-8 8.23 6.72 -18.35% BenchmarkRotateRight8-8 8.76 4.39 -49.89% BenchmarkRotateRight16-8 9.07 4.44 -51.05% BenchmarkRotateRight32-8 8.85 4.46 -49.60% BenchmarkRotateRight64-8 8.11 4.43 -45.38% Change-Id: I79ea1e9e6fc65f95794a91f860a911efed3aa8a1 Reviewed-on: https://go-review.googlesource.com/37219 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-17 23:40:45 +00:00
Robert Griesemer	a12edb8db6	math/bits: faster OnesCount, added respective benchmarks Also: Changed Reverse/ReverseBytes implementations to use the same (smaller) masks as OnesCount. BenchmarkOnesCount-8 37.0 6.26 -83.08% BenchmarkOnesCount8-8 7.24 1.99 -72.51% BenchmarkOnesCount16-8 11.3 2.47 -78.14% BenchmarkOnesCount32-8 18.4 3.02 -83.59% BenchmarkOnesCount64-8 40.0 3.78 -90.55% BenchmarkReverse-8 6.69 6.22 -7.03% BenchmarkReverse8-8 1.64 1.64 +0.00% BenchmarkReverse16-8 2.26 2.18 -3.54% BenchmarkReverse32-8 2.88 2.87 -0.35% BenchmarkReverse64-8 5.64 4.34 -23.05% BenchmarkReverseBytes-8 2.48 2.17 -12.50% BenchmarkReverseBytes16-8 0.63 0.95 +50.79% BenchmarkReverseBytes32-8 1.13 1.24 +9.73% BenchmarkReverseBytes64-8 2.50 2.16 -13.60% OnesCount-8 37.0ns ± 0% 6.3ns ± 0% ~ (p=1.000 n=1+1) OnesCount8-8 7.24ns ± 0% 1.99ns ± 0% ~ (p=1.000 n=1+1) OnesCount16-8 11.3ns ± 0% 2.5ns ± 0% ~ (p=1.000 n=1+1) OnesCount32-8 18.4ns ± 0% 3.0ns ± 0% ~ (p=1.000 n=1+1) OnesCount64-8 40.0ns ± 0% 3.8ns ± 0% ~ (p=1.000 n=1+1) Reverse-8 6.69ns ± 0% 6.22ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 1.64ns ± 0% 1.64ns ± 0% ~ (all samples are equal) Reverse16-8 2.26ns ± 0% 2.18ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 2.88ns ± 0% 2.87ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 5.64ns ± 0% 4.34ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes-8 2.48ns ± 0% 2.17ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 0.63ns ± 0% 0.95ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 1.13ns ± 0% 1.24ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 2.50ns ± 0% 2.16ns ± 0% ~ (p=1.000 n=1+1) Change-Id: I591b0ffc83fc3a42828256b6e5030f32c64f9497 Reviewed-on: https://go-review.googlesource.com/37218 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-17 23:40:10 +00:00
Robert Griesemer	4498b68390	math/bits: faster Reverse, ReverseBytes - moved from: x&m>>k | x&^m<<k to: x&m>>k | x<<k&m This permits use of the same constant m twice (*) which may be better for machines that can't use large immediate constants directly with an AND instruction and have to load them explicitly. *) CPUs don't usually have a &^ instruction, so x&^m becomes x&(^m) - simplified returns This improves the generated code because the compiler recognizes x>>k | x<<k as ROT when k is the bitsize of x. The 8-bit versions of these instructions can be significantly faster still if they are replaced with table lookups, as long as the table is in cache. If the table is not in cache, table-lookup is probably slower, hence the choice of an explicit register-only implementation for now. BenchmarkReverse-8 8.50 6.86 -19.29% BenchmarkReverse8-8 2.17 1.74 -19.82% BenchmarkReverse16-8 2.89 2.34 -19.03% BenchmarkReverse32-8 3.55 2.95 -16.90% BenchmarkReverse64-8 6.81 5.57 -18.21% BenchmarkReverseBytes-8 3.49 2.48 -28.94% BenchmarkReverseBytes16-8 0.93 0.62 -33.33% BenchmarkReverseBytes32-8 1.55 1.13 -27.10% BenchmarkReverseBytes64-8 2.47 2.47 +0.00% Reverse-8 8.50ns ± 0% 6.86ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 2.17ns ± 0% 1.74ns ± 0% ~ (p=1.000 n=1+1) Reverse16-8 2.89ns ± 0% 2.34ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 3.55ns ± 0% 2.95ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 6.81ns ± 0% 5.57ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes-8 3.49ns ± 0% 2.48ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 0.93ns ± 0% 0.62ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 1.55ns ± 0% 1.13ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 2.47ns ± 0% 2.47ns ± 0% ~ (all samples are equal) Change-Id: I0064de8c7e0e568ca7885d6f7064344bef91a06d Reviewed-on: https://go-review.googlesource.com/37215 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-17 22:20:28 +00:00
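The mask form described in the commit above, where the same constant m is applied on both sides of the OR, might look like this for 8-bit values. This is a sketch under stated assumptions: the mask values, `reverse8`, and `agreesWithStdlib` are chosen for illustration and are not the package's source.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Masks selecting the upper bit of each pair and the upper pair of each
// nibble (illustrative values).
const (
	m0 = 0xAA // 10101010
	m1 = 0xCC // 11001100
)

// reverse8 swaps adjacent bits, then bit pairs, then nibbles. Each step has
// the shape x&m>>k | x<<k&m, so every mask is used twice and no &^ is needed.
// In Go, &, << and >> share one precedence level and associate left to
// right, so x&m0>>1 parses as (x&m0)>>1 and x<<1&m0 as (x<<1)&m0.
func reverse8(x uint8) uint8 {
	x = x&m0>>1 | x<<1&m0
	x = x&m1>>2 | x<<2&m1
	return x>>4 | x<<4 // final step is a rotation by half the width
}

// agreesWithStdlib checks the sketch against bits.Reverse8 for all inputs.
func agreesWithStdlib() bool {
	for i := 0; i < 256; i++ {
		if reverse8(uint8(i)) != bits.Reverse8(uint8(i)) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Printf("%08b\n", reverse8(0x13)) // 11001000
	fmt.Println(agreesWithStdlib())      // true
}
```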
Robert Griesemer	3a239a6ae4	math/bits: fix benchmarks (make sure calls don't get optimized away) Sum up function results and store them in an exported (global) variable. This prevents the compiler from optimizing away the otherwise side-effect free function calls. We now have a more realistic set of benchmark numbers... Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3. Note: These measurements are based on the same "old" implementation as the prior measurements (commit `7d5c003`). benchmark old ns/op new ns/op delta BenchmarkReverse-8 72.9 8.50 -88.34% BenchmarkReverse8-8 13.2 2.17 -83.56% BenchmarkReverse16-8 21.2 2.89 -86.37% BenchmarkReverse32-8 36.3 3.55 -90.22% BenchmarkReverse64-8 71.3 6.81 -90.45% BenchmarkReverseBytes-8 11.2 3.49 -68.84% BenchmarkReverseBytes16-8 6.24 0.93 -85.10% BenchmarkReverseBytes32-8 7.40 1.55 -79.05% BenchmarkReverseBytes64-8 10.5 2.47 -76.48% Reverse-8 72.9ns ± 0% 8.5ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 13.2ns ± 0% 2.2ns ± 0% ~ (p=1.000 n=1+1) Reverse16-8 21.2ns ± 0% 2.9ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 36.3ns ± 0% 3.5ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 71.3ns ± 0% 6.8ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes-8 11.2ns ± 0% 3.5ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 6.24ns ± 0% 0.93ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 7.40ns ± 0% 1.55ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 10.5ns ± 0% 2.5ns ± 0% ~ (p=1.000 n=1+1) Change-Id: I8aef1334b84f6cafd25edccad7e6868b37969efb Reviewed-on: https://go-review.googlesource.com/37213 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-17 20:58:12 +00:00
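The benchmark fix described above, summing results and storing them in an exported global, could be sketched as follows. The names `Output`, `sumOnesCount`, and `BenchmarkOnesCount` are assumptions for illustration; `testing.Benchmark` is used only so the sketch runs as a standalone program.

```go
package main

import (
	"fmt"
	"math/bits"
	"testing"
)

// Output is exported so the compiler cannot prove the summed results are
// unused and drop the side-effect-free calls.
var Output int

// sumOnesCount calls the function under test n times and keeps every result
// alive by folding it into a running sum.
func sumOnesCount(n int) int {
	s := 0
	for i := 0; i < n; i++ {
		s += bits.OnesCount(uint(i))
	}
	return s
}

// BenchmarkOnesCount stores the sum in the exported sink.
func BenchmarkOnesCount(b *testing.B) {
	Output = sumOnesCount(b.N)
}

func main() {
	// Run the benchmark outside of "go test" and print its result line.
	fmt.Println(testing.Benchmark(BenchmarkOnesCount))
}
```

Without the sink, a loop of pure calls can be eliminated entirely, which would explain the implausibly small timings such as 0.6 ns reported in the earlier measurements.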
Robert Griesemer	ddb15cea4a	math/bits: much faster ReverseBytes, added respective benchmarks Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3. benchmark old ns/op new ns/op delta BenchmarkReverseBytes-8 11.4 3.51 -69.21% BenchmarkReverseBytes16-8 6.87 0.64 -90.68% BenchmarkReverseBytes32-8 7.79 0.65 -91.66% BenchmarkReverseBytes64-8 11.6 0.64 -94.48% name old time/op new time/op delta ReverseBytes-8 11.4ns ± 0% 3.5ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 6.87ns ± 0% 0.64ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 7.79ns ± 0% 0.65ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 11.6ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Change-Id: I67b529652b3b613c61687e9e185e8d4ee40c51a2 Reviewed-on: https://go-review.googlesource.com/37211 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-17 19:38:26 +00:00
Robert Griesemer	7d5c003a3a	math/bits: much faster Reverse, added respective benchmarks Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3. name old time/op new time/op delta Reverse-8 76.6ns ± 0% 8.1ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 12.6ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Reverse16-8 20.8ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 36.5ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 74.0ns ± 0% 6.4ns ± 0% ~ (p=1.000 n=1+1) benchmark old ns/op new ns/op delta BenchmarkReverse-8 76.6 8.07 -89.46% BenchmarkReverse8-8 12.6 0.64 -94.92% BenchmarkReverse16-8 20.8 0.64 -96.92% BenchmarkReverse32-8 36.5 0.64 -98.25% BenchmarkReverse64-8 74.0 6.38 -91.38% Change-Id: I6b99b10cee2f2babfe79342b50ee36a45a34da30 Reviewed-on: https://go-review.googlesource.com/37149 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-17 19:38:13 +00:00
Robert Griesemer	81acd308a4	math/bits: expand doc strings for all functions Follow-up on https://go-review.googlesource.com/36315. No functionality change. For #18616. Change-Id: Id4df34dd7d0381be06eea483a11bf92f4a01f604 Reviewed-on: https://go-review.googlesource.com/37140 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-17 19:02:56 +00:00
Shenghou Ma	211102c85f	math: fix typos in Bessel function docs While we're at it, also document Yn(0, 0) = -Inf for completeness. Fixes #18823. Change-Id: Ib6db68f76d29cc2373c12ebdf3fab129cac8c167 Reviewed-on: https://go-review.googlesource.com/35970 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-16 22:41:34 +00:00
Robert Griesemer	661e2179e5	math/bits: added package for bit-level counting and manipulation Initial platform-independent implementation. For #18616. Change-Id: I4585c55b963101af9059c06c1b8a866cb384754c Reviewed-on: https://go-review.googlesource.com/36315 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-02-16 21:54:59 +00:00
Daniel Martí	6910756f9b	math/big: simplify bool expression Change-Id: I280c53be455f2fe0474ad577c0f7b7908a4eccb2 Reviewed-on: https://go-review.googlesource.com/36993 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-14 23:34:25 +00:00
Michael Munday	d2fea0447f	math/big: fix s390x test build tags The tests failed to compile when using the math_big_pure_go tag on s390x. Change-Id: I2a09f53ff6562ab9bc9b886cffc0f6205bbfcfbb Reviewed-on: https://go-review.googlesource.com/36956 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-14 19:44:35 +00:00
Josh Bleecher Snyder	785cb7e098	all: fix some printf format strings Appease vet. Change-Id: Ie88de08b91041990c0eaf2e15628cdb98d40c660 Reviewed-on: https://go-review.googlesource.com/36938 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-14 02:09:30 +00:00
Michael Munday	a524616860	cmd/{asm,internal/obj/s390x}, math: remove emulated float instructions The s390x port was based on the ppc64 port and, because of the way the port was done, inherited some instructions from it. ppc64 supports 3-operand (4-operand for FMADD etc.) floating point instructions but s390x doesn't (the destination register is always an input) and so these were emulated. There is a bug in the emulation of FMADD whereby if the destination register is also a source for the multiplication it will be clobbered. This doesn't break any assembly code in the std lib but could affect future work. To fix this I have gone through the floating point instructions and removed all unnecessary 3-/4-operand emulation. The compiler doesn't need it and assembly writers don't need it, it's just a source of bugs. I've also deleted the FNMADD family of emulated instructions. They aren't used anywhere. Change-Id: Ic07cedcf141a6a3b43a0c84895460f6cfbf56c04 Reviewed-on: https://go-review.googlesource.com/33350 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-02-10 16:11:25 +00:00
Alberto Donizetti	f44e587031	math: check overflow in amd64 Exp implementation Unlike the pure go implementation used by every other architecture, the amd64 asm implementation of Exp does not fail early if the argument is known to overflow. Make it fail early. Cost of the check is < 1ns (on an old Sandy Bridge machine): name old time/op new time/op delta Exp-4 18.3ns ± 1% 18.7ns ± 1% +2.08% (p=0.000 n=18+20) Fixes #14932 Fixes #18912 Change-Id: I04b3f9b4ee853822cbdc97feade726fbe2907289 Reviewed-on: https://go-review.googlesource.com/36271 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> Reviewed-by: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-10 13:40:08 +00:00
Robert Griesemer	1f93ba66d6	math/big: add IsInt64/IsUint64 predicates Change-Id: Ia5ed3919cb492009ac8f66d175b47a69f83ee4f1 Reviewed-on: https://go-review.googlesource.com/36487 Reviewed-by: Alan Donovan <adonovan@google.com>	2017-02-07 23:02:33 +00:00
Russ Cox	850e55b8c0	crypto/*: document use or non-use of constant-time algorithms Fixes #16821. Change-Id: I63d5f3d7cfba1c76259912d754025c5f3cbe4a56 Reviewed-on: https://go-review.googlesource.com/31573 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-12-07 16:34:50 +00:00
Russ Cox	3f69822a9a	math/rand: export Source64, mainly for documentation value There is some code value too: types intending to implement Source64 can write a conversion confirming that. For #4254 and the Go 1.8 release notes. Change-Id: I7fc350a84f3a963e4dab317ad228fa340dda5c66 Reviewed-on: https://go-review.googlesource.com/33456 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-11-23 04:29:25 +00:00
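The "conversion confirming that" mentioned in the commit above is commonly written as a compile-time assertion. The sketch below assumes that idiom; `counterSource` and `firstValue` are hypothetical, and the source is deliberately non-random.

```go
package main

import (
	"fmt"
	"math/rand"
)

// counterSource is a hypothetical source that just counts upward.
type counterSource struct{ n uint64 }

func (s *counterSource) Seed(seed int64) { s.n = uint64(seed) }
func (s *counterSource) Uint64() uint64  { s.n++; return s.n }
func (s *counterSource) Int63() int64    { return int64(s.Uint64() >> 1) }

// Compile-time confirmation that *counterSource implements rand.Source64.
// The build fails here if a method is missing or has the wrong signature.
var _ rand.Source64 = (*counterSource)(nil)

// firstValue wraps the source in a *rand.Rand and draws one 64-bit value.
func firstValue(start uint64) uint64 {
	return rand.New(&counterSource{n: start}).Uint64()
}

func main() {
	fmt.Println(firstValue(41)) // 42
}
```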
Russ Cox	37d078ede3	math/big: add Baillie-PSW test to (Int).ProbablyPrime After x.ProbablyPrime(n) passes the n Miller-Rabin rounds, add a Baillie-PSW test before declaring x probably prime. Although the provable error bounds are unchanged, the empirical error bounds drop dramatically: there are no known inputs for which Baillie-PSW gives the wrong answer. For example, before this CL, big.NewInt(4431327).ProbablyPrime(1) == true. Now it is (correctly) false. The new Baillie-PSW test is two pieces: an added Miller-Rabin round with base 2, and a so-called extra strong Lucas test. (See the references listed in prime.go for more details.) The Lucas test takes about 3.5x as long as the Miller-Rabin round, which is close to theoretical expectations. name time/op ProbablyPrime/Lucas 2.91ms ± 2% ProbablyPrime/MillerRabinBase2 850µs ± 1% ProbablyPrime/n=0 3.75ms ± 3% The speed of prime testing for a prime input does get slower: name old time/op new time/op delta ProbablyPrime/n=1 849µs ± 1% 4521µs ± 1% +432.31% (p=0.000 n=10+9) ProbablyPrime/n=5 4.31ms ± 3% 7.87ms ± 1% +82.70% (p=0.000 n=10+10) ProbablyPrime/n=10 8.52ms ± 3% 12.28ms ± 1% +44.11% (p=0.000 n=10+10) ProbablyPrime/n=20 16.9ms ± 2% 21.4ms ± 2% +26.35% (p=0.000 n=9+10) However, because the Baillie-PSW test is only added when the old ProbablyPrime(n) would return true, testing composites runs at the same speed as before, except in the case where the result would have been incorrect and is now correct. In particular, the most important use of this code is for generating random primes in crypto/rand. That use spends essentially all its time testing composites, so it is not slowed down by the new Baillie-PSW check: name old time/op new time/op delta Prime 104ms ±22% 111ms ±16% ~ (p=0.165 n=10+10) Thanks to Serhat Şevki Dinçer for CL 20170, which this CL builds on. Fixes #13229. Change-Id: Id26dde9b012c7637c85f2e96355d029b6382812a Reviewed-on: https://go-review.googlesource.com/30770 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2016-11-22 02:05:47 +00:00
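The example input quoted in the ProbablyPrime commit above can be checked directly on a Go release that includes this change. `isProbablyPrime` is a hypothetical wrapper added for brevity.

```go
package main

import (
	"fmt"
	"math/big"
)

// isProbablyPrime wraps (*big.Int).ProbablyPrime for int64 inputs.
// rounds is the number of Miller-Rabin rounds to run.
func isProbablyPrime(n int64, rounds int) bool {
	return big.NewInt(n).ProbablyPrime(rounds)
}

func main() {
	// 4431327 is the composite named in the commit message; it is
	// divisible by 3, so the correct answer is false.
	fmt.Println(isProbablyPrime(4431327, 1)) // false
	fmt.Println(isProbablyPrime(97, 1))      // true
}
```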
Brad Fitzpatrick	d8b14c5243	math/rand: make floating point tests shorter on mips and mipsle Like GOARM=5 does. Fixes #17944 Change-Id: Ica2a54a90fbd4a29471d1c6009ace2fcc5e82a73 Reviewed-on: https://go-review.googlesource.com/33326 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2016-11-16 19:22:53 +00:00
Dmitri Shuralyov	d8264de868	all: spell "marshal" and "unmarshal" consistently The tree is inconsistent about single l vs double l in those words in documentation, test messages, and one error value text. $ git grep -E '[Mm]arshall(|s|er|ers|ed|ing)' | wc -l 42 $ git grep -E '[Mm]arshal(|s|er|ers|ed|ing)' | wc -l 1694 Make it consistently a single l, per earlier decisions. This means contributors won't be confused by misleading precedence, and it helps consistency. Change the spelling in one error value text in newRawAttributes of crypto/x509 package to be consistent. This change was generated with: perl -i -npe 's,([Mm]arshal)l(|s|er|ers|ed|ing),$1$2,' $(git grep -l -E '[Mm]arshall' | grep -v AUTHORS | grep -v CONTRIBUTORS) Updates #12431. Follows https://golang.org/cl/14150. Change-Id: I85d28a2d7692862ccb02d6a09f5d18538b6049a2 Reviewed-on: https://go-review.googlesource.com/33017 Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-11-12 00:13:35 +00:00
Bill O'Farrell	b6a15683f0	math: use SIMD to accelerate some scalar math functions on s390x Note, most math functions are structured to use stubs, so that they can be accelerated with assembly on any platform. Sinh, cosh, and tanh were not structured with stubs, so this CL does that. This set of routines was chosen as likely to produce good speedups with assembly on any platform. Technique used was minimax polynomial approximation using tables of polynomial coefficients, with argument range reduction. A table of scaling factors was also used for cosh and log10. before after speedup BenchmarkCos 22.1 ns/op 6.79 ns/op 3.25x BenchmarkCosh 125 ns/op 11.7 ns/op 10.68x BenchmarkLog10 48.4 ns/op 12.5 ns/op 3.87x BenchmarkSin 22.2 ns/op 6.55 ns/op 3.39x BenchmarkSinh 125 ns/op 14.2 ns/op 8.80x BenchmarkTanh 65.0 ns/op 15.1 ns/op 4.30x Accuracy was tested against a high precision reference function to determine maximum error. Approximately 4,000,000 points were tested for each function, producing the following result. Note: ulperr is error in "units in the last place" max ulperr sin 1.43 (returns NaN beyond +-2^50) cos 1.79 (returns NaN beyond +-2^50) cosh 1.05 sinh 3.02 tanh 3.69 log10 1.75 Also includes a set of tests to test non-vector functions even when SIMD is enabled Change-Id: Icb45f14d00864ee19ed973d209c3af21e4df4edc Reviewed-on: https://go-review.googlesource.com/32352 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Munday <munday@ca.ibm.com>	2016-11-11 20:20:23 +00:00
Vladimir Stefanovic	d1e9104fb2	math, math/big: add support for GOARCH=mips{,le} Change-Id: I54e100cced5b49674937fb87d1e0f585f962aeb7 Reviewed-on: https://go-review.googlesource.com/31484 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2016-11-03 22:55:06 +00:00
Cherry Zhang	0dabbcdc43	math/big: flip long/short flag on TestFloat32Distribution It looks like a typo in CL 30707. Change-Id: Ia2d013567dbd1a49901d9be0cd2d5a103e6e38cf Reviewed-on: https://go-review.googlesource.com/32187 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2016-10-27 21:44:37 +00:00
Bill O'Farrell	1e6b12a201	math/big: uses SIMD for some math big functions on s390x The following benchmarks are improved by the amounts shown (Others unaffected beyond the level of noise.) Also adds a test to confirm non-SIMD implementation still correct, even when run on SIMD-capable machine Benchmark old new BenchmarkAddVV/100-18 66148.08 MB/s 117546.19 MB/s 1.8x BenchmarkAddVV/1000-18 70168.27 MB/s 133478.96 MB/s 1.9x BenchmarkAddVV/10000-18 67489.80 MB/s 100010.79 MB/s 1.5x BenchmarkAddVV/100000-18 54329.99 MB/s 69232.45 MB/s 1.3x BenchmarkAddVW/100-18 9929.10 MB/s 14841.31 MB/s 1.5x BenchmarkAddVW/1000-18 10583.31 MB/s 18674.44 MB/s 1.76x BenchmarkAddVW/10000-18 10521.15 MB/s 17484.10 MB/s 1.66x BenchmarkAddVW/100000-18 10616.56 MB/s 18084.27 MB/s 1.7x Change-Id: Ic9234c41a43f6c5e9d0e9377de8b4deeefc428a7 Reviewed-on: https://go-review.googlesource.com/32211 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-10-26 23:52:10 +00:00
Mohit Agarwal	5a9549260d	math/cmplx: prevent infinite loop in tanSeries The condition to determine if any further iterations are needed is evaluated to false in case it encounters a NaN. Instead, flip the condition to keep looping until the factor is greater than the machine roundoff error. Updates #17577 Change-Id: I058abe73fcd49d3ae4e2f7b33020437cc8f290c3 Reviewed-on: https://go-review.googlesource.com/31952 Reviewed-by: Robert Griesemer <gri@golang.org>	2016-10-25 18:32:22 +00:00
Alexander Döring	4c9c023346	math,math/cmplx: fix linter issues Change-Id: If061f1f120573cb109d97fa40806e160603cd593 Reviewed-on: https://go-review.googlesource.com/31871 Reviewed-by: Rob Pike <r@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-10-24 23:25:46 +00:00


306 Commits