qbit/go - go - Tape:neT

qbit/go

mirror of https://github.com/golang/go synced 2024-11-17 18:24:45 -07:00

Author	SHA1	Message	Date
surechen	73eb24ccb6	math: Remove redundant local variable Ln2 Use the const variable Ln2 in math/const.go for function acosh. Change-Id: I5381d03dd3142c227ae5773ece9be6c8f377615e Reviewed-on: https://go-review.googlesource.com/c/go/+/232517 Reviewed-by: Robert Griesemer <gri@golang.org> Trust: Robert Griesemer <gri@golang.org> Trust: Giovanni Bajo <rasky@develer.com>	2020-09-19 09:09:52 +00:00
Xiangdong Ji	ae7b6a3b77	math/big: tune addVW/subVW performance on arm64 Add an optimization for addVW and subVW over large-sized vectors, it switches from add/sub with carry to copy the rest of the vector when we are done with carries. Consistent performance improvement are observed on various arm64 machines. Add additional tests and benchmarks to increase the test coverage. TestFunVWExt: Testing with various types of input vector, using the result from go-version addVW/subVW as golden reference. BenchmarkAddVWext and BenchmarkSubVWext: Benchmarking using input vector having all 1s or all 0s, for evaluating the overhead of worst case. 1. Perf. comparison over randomly generated input vectors: Server 1: name old time/op new time/op delta AddVW/1 12.3ns ± 3% 12.0ns ± 0% -2.60% (p=0.001 n=10+8) AddVW/2 12.5ns ± 2% 12.3ns ± 0% -1.84% (p=0.001 n=10+8) AddVW/3 12.6ns ± 2% 12.3ns ± 0% -1.91% (p=0.009 n=10+10) AddVW/4 13.1ns ± 3% 12.7ns ± 0% -2.98% (p=0.006 n=10+8) AddVW/5 14.4ns ± 1% 13.9ns ± 0% -3.81% (p=0.000 n=10+10) AddVW/10 11.7ns ± 0% 11.7ns ± 0% ~ (all equal) AddVW/100 47.8ns ± 0% 29.9ns ± 2% -37.38% (p=0.000 n=10+9) AddVW/1000 446ns ± 0% 207ns ± 0% -53.59% (p=0.000 n=10+10) AddVW/10000 4.35µs ± 1% 2.92µs ± 0% -32.85% (p=0.000 n=10+10) AddVW/100000 43.6µs ± 0% 29.7µs ± 0% -31.92% (p=0.000 n=8+10) SubVW/1 12.6ns ± 0% 12.3ns ± 2% -2.22% (p=0.000 n=7+10) SubVW/2 12.7ns ± 0% 12.6ns ± 1% -0.39% (p=0.046 n=8+10) SubVW/3 12.7ns ± 1% 12.6ns ± 1% ~ (p=0.410 n=10+10) SubVW/4 13.3ns ± 3% 13.1ns ± 3% ~ (p=0.072 n=10+10) SubVW/5 14.2ns ± 0% 14.1ns ± 1% -0.63% (p=0.046 n=8+10) SubVW/10 11.7ns ± 0% 11.7ns ± 0% ~ (all equal) SubVW/100 47.8ns ± 0% 33.1ns ±19% -30.71% (p=0.000 n=10+10) SubVW/1000 446ns ± 0% 207ns ± 0% -53.59% (p=0.000 n=10+10) SubVW/10000 4.33µs ± 1% 2.92µs ± 0% -32.66% (p=0.000 n=10+6) SubVW/100000 43.4µs ± 0% 29.6µs ± 0% -31.90% (p=0.000 n=10+9) Server 2: name old time/op new time/op delta AddVW/1 5.49ns ± 0% 5.53ns ± 2% ~ (p=1.000 n=9+10) AddVW/2 5.96ns ± 2% 5.92ns ± 1% -0.69% (p=0.039 n=10+10) AddVW/3 6.72ns ± 0% 6.73ns ± 0% ~ (p=0.078 n=10+10) AddVW/4 7.07ns ± 0% 6.75ns ± 2% -4.55% (p=0.000 n=10+10) AddVW/5 8.14ns ± 0% 8.17ns ± 0% +0.46% (p=0.003 n=8+8) AddVW/10 10.0ns ± 0% 10.1ns ± 1% +0.70% (p=0.003 n=10+10) AddVW/100 43.0ns ± 0% 33.5ns ± 0% -22.09% (p=0.000 n=9+9) AddVW/1000 394ns ± 0% 278ns ± 0% -29.44% (p=0.000 n=10+10) AddVW/10000 4.18µs ± 0% 3.14µs ± 0% -24.81% (p=0.000 n=8+8) AddVW/100000 68.3µs ± 3% 62.1µs ± 5% -9.13% (p=0.000 n=10+10) SubVW/1 5.37ns ± 2% 5.42ns ± 1% ~ (p=0.990 n=10+10) SubVW/2 5.89ns ± 0% 5.92ns ± 1% +0.58% (p=0.000 n=8+10) SubVW/3 6.64ns ± 1% 6.82ns ± 3% +2.63% (p=0.000 n=9+10) SubVW/4 7.17ns ± 0% 6.69ns ± 2% -6.74% (p=0.000 n=10+9) SubVW/5 8.22ns ± 0% 8.18ns ± 0% -0.46% (p=0.001 n=8+9) SubVW/10 10.0ns ± 1% 10.1ns ± 1% ~ (p=0.341 n=10+10) SubVW/100 43.0ns ± 0% 33.5ns ± 0% -22.09% (p=0.000 n=7+10) SubVW/1000 394ns ± 0% 278ns ± 0% -29.44% (p=0.000 n=10+10) SubVW/10000 4.18µs ± 0% 3.15µs ± 0% -24.62% (p=0.000 n=9+9) SubVW/100000 67.7µs ± 4% 62.4µs ± 2% -7.92% (p=0.000 n=10+10) 2. Perf. comparison over input vectors of all 1s or all 0s Server 1: name old time/op new time/op delta AddVWext/1 12.6ns ± 0% 12.0ns ± 0% -4.76% (p=0.000 n=6+10) AddVWext/2 12.7ns ± 0% 12.4ns ± 1% -2.52% (p=0.000 n=10+10) AddVWext/3 12.7ns ± 0% 12.4ns ± 0% -2.36% (p=0.000 n=9+7) AddVWext/4 13.2ns ± 4% 12.7ns ± 0% -3.71% (p=0.001 n=10+9) AddVWext/5 14.6ns ± 0% 13.9ns ± 0% -4.79% (p=0.000 n=10+8) AddVWext/10 11.7ns ± 0% 11.7ns ± 0% ~ (all equal) AddVWext/100 47.8ns ± 0% 47.4ns ± 0% -0.84% (p=0.000 n=10+10) AddVWext/1000 446ns ± 0% 399ns ± 0% -10.54% (p=0.000 n=10+10) AddVWext/10000 4.34µs ± 1% 3.90µs ± 0% -10.12% (p=0.000 n=10+10) AddVWext/100000 43.9µs ± 1% 39.4µs ± 0% -10.18% (p=0.000 n=10+10) SubVWext/1 12.6ns ± 0% 12.3ns ± 2% -2.70% (p=0.000 n=7+10) SubVWext/2 12.6ns ± 1% 12.6ns ± 2% ~ (p=0.234 n=10+10) SubVWext/3 12.7ns ± 0% 12.6ns ± 2% -0.71% (p=0.033 n=10+10) SubVWext/4 13.4ns ± 0% 13.1ns ± 3% -2.01% (p=0.006 n=8+10) SubVWext/5 14.2ns ± 0% 14.1ns ± 1% -0.85% (p=0.003 n=10+10) SubVWext/10 11.7ns ± 0% 11.7ns ± 0% ~ (all equal) SubVWext/100 47.8ns ± 0% 47.4ns ± 0% -0.84% (p=0.000 n=10+10) SubVWext/1000 446ns ± 0% 399ns ± 0% -10.54% (p=0.000 n=10+10) SubVWext/10000 4.33µs ± 1% 3.90µs ± 0% -10.02% (p=0.000 n=10+10) SubVWext/100000 43.5µs ± 0% 39.5µs ± 1% -9.16% (p=0.000 n=7+10) Server 2: name old time/op new time/op delta AddVWext/1 5.48ns ± 0% 5.43ns ± 1% -0.97% (p=0.000 n=9+9) AddVWext/2 5.99ns ± 2% 5.93ns ± 1% ~ (p=0.054 n=10+10) AddVWext/3 6.74ns ± 0% 6.79ns ± 1% +0.80% (p=0.000 n=9+10) AddVWext/4 7.18ns ± 0% 7.21ns ± 1% +0.36% (p=0.034 n=9+10) AddVWext/5 7.93ns ± 3% 8.18ns ± 0% +3.18% (p=0.000 n=10+8) AddVWext/10 10.0ns ± 0% 10.1ns ± 1% +0.60% (p=0.011 n=10+10) AddVWext/100 43.0ns ± 0% 47.7ns ± 0% +10.93% (p=0.000 n=9+10) AddVWext/1000 394ns ± 0% 399ns ± 0% +1.27% (p=0.000 n=10+10) AddVWext/10000 4.18µs ± 0% 4.50µs ± 0% +7.73% (p=0.000 n=9+10) AddVWext/100000 67.6µs ± 2% 68.4µs ± 3% ~ (p=0.139 n=9+8) SubVWext/1 5.46ns ± 1% 5.43ns ± 0% -0.55% (p=0.002 n=9+9) SubVWext/2 5.89ns ± 0% 5.93ns ± 1% +0.68% (p=0.000 n=8+10) SubVWext/3 6.72ns ± 1% 6.79ns ± 1% +1.07% (p=0.000 n=10+10) SubVWext/4 6.98ns ± 1% 7.21ns ± 0% +3.25% (p=0.000 n=10+10) SubVWext/5 8.22ns ± 0% 7.99ns ± 3% -2.83% (p=0.000 n=8+10) SubVWext/10 10.0ns ± 1% 10.1ns ± 1% ~ (p=0.239 n=10+10) SubVWext/100 43.0ns ± 0% 47.7ns ± 0% +10.93% (p=0.000 n=8+10) SubVWext/1000 394ns ± 0% 399ns ± 0% +1.27% (p=0.000 n=10+10) SubVWext/10000 4.18µs ± 0% 4.51µs ± 0% +7.86% (p=0.000 n=8+8) SubVWext/100000 68.3µs ± 2% 68.0µs ± 3% ~ (p=0.515 n=10+8) Change-Id: I134a5194b8a2deaaebbaa2b771baf72846971d58 Reviewed-on: https://go-review.googlesource.com/c/go/+/229739 Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-08-28 16:40:41 +00:00
surechen	bd6dfe9a3e	math/big: add a comment for SetMantExp Change-Id: I9ff5d1767cf70648c2251268e5e815944a7cb371 Reviewed-on: https://go-review.googlesource.com/c/go/+/233737 Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-08-28 16:25:32 +00:00
zhouzhongyuan	63828096f6	math/big: add function example While reading the source code of the math/big package, I found the SetString function example of float type missing. Change-Id: Id8c16a58e2e24f9463e8ff38adbc98f8c418ab26 Reviewed-on: https://go-review.googlesource.com/c/go/+/232804 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-08-26 16:15:32 +00:00
SparrowLii	41bc0a1713	math/big: fix TestShiftOverlap for test -count arguments > 1 Don't overwrite incoming test data. The change uses copy instead of assigning statement to avoid this. Change-Id: Ib907101822d811de5c45145cb9d7961907e212c3 Reviewed-on: https://go-review.googlesource.com/c/go/+/250137 Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-08-25 17:04:40 +00:00
Lynn Boger	216714e44f	math/big: improve performance of mulAddVWW on ppc64x This changes the assembly implementation on ppc64x to improve performance by reordering some instructions. It also eliminates an unnecessary move by changing an ADDZE to use the correct target register. Improvement on power9: MulAddVWW/1 6.89ns ± 0% 7.30ns ± 0% +5.95% (p=1.000 n=1+1) MulAddVWW/2 8.04ns ± 0% 8.06ns ± 0% +0.25% (p=1.000 n=1+1) MulAddVWW/3 9.39ns ± 0% 9.39ns ± 0% ~ (all equal) MulAddVWW/4 9.76ns ± 0% 9.48ns ± 0% -2.87% (p=1.000 n=1+1) MulAddVWW/5 10.5ns ± 0% 10.3ns ± 0% -1.90% (p=1.000 n=1+1) MulAddVWW/10 15.4ns ± 0% 14.9ns ± 0% -3.25% (p=1.000 n=1+1) MulAddVWW/100 149ns ± 0% 125ns ± 0% -16.11% (p=1.000 n=1+1) MulAddVWW/1000 1.42µs ± 0% 1.28µs ± 0% -9.74% (p=1.000 n=1+1) MulAddVWW/10000 14.2µs ± 0% 12.8µs ± 0% -9.73% (p=1.000 n=1+1) MulAddVWW/100000 144µs ± 0% 129µs ± 0% -10.10% (p=1.000 n=1+1) Change-Id: I0ae7002a69783ca19d7a4e3e42042ae75dc60069 Reviewed-on: https://go-review.googlesource.com/c/go/+/248721 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com> Reviewed-by: Paul Murphy <murp@ibm.com>	2020-08-18 20:25:26 +00:00
kakulisen	441b52f566	math: simplify the code Simplifying some code without compromising performance. My CPU is Intel Xeon Gold 6161, 2.20GHz, 64-bit operating system. The memory is 8GB. This is my test environment, I hope to help you judge. Benchmark: name old time/op new time/op delta Log1p-4 21.8ns ± 5% 21.8ns ± 4% ~ (p=0.973 n=20+20) Change-Id: Icd8f96f1325b00007602d114300b92d4c57de409 Reviewed-on: https://go-review.googlesource.com/c/go/+/233940 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-08-15 02:20:42 +00:00
Alberto Donizetti	65f514edfb	math: fix dead link to springerlink (now link.springer) Change-Id: Ie5fd026af45d2e7bc371a38d15dbb52a1b4958cd Reviewed-on: https://go-review.googlesource.com/c/go/+/235717 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2020-05-29 14:33:50 +00:00
Bryan C. Mills	13617380ca	testing: clean up remaining TempDir issues from CL 231958 Updates #38850 Change-Id: I33f48762f5520eb0c0a841d8ca1ccdd65ecc20c8 Reviewed-on: https://go-review.googlesource.com/c/go/+/234583 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-05-19 19:55:14 +00:00
Filippo Valsorda	c9d5f60eaa	math/big: add (Int).FillBytes Replaced almost every use of Bytes with FillBytes. Note that the approved proposal was for func (Int) FillBytes(buf []byte) while this implements func (*Int) FillBytes(buf []byte) []byte because the latter was far nicer to use in all callsites. Fixes #35833 Change-Id: Ia912df123e5d79b763845312ea3d9a8051343c0a Reviewed-on: https://go-review.googlesource.com/c/go/+/230397 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-05-05 00:36:44 +00:00
Joel Sing	03e6073b13	math: implement Min/Max in riscv64 assembly Change-Id: If34422859d47bc8f44974a00c6b7908e7655ff41 Reviewed-on: https://go-review.googlesource.com/c/go/+/223561 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-05-04 17:29:13 +00:00
kakulisen	e90b0ce68b	math: add function examples. The function Modf lacks corresponding examples. Change-Id: Id93423500e87d35b0b6870882be1698b304797ae Reviewed-on: https://go-review.googlesource.com/c/go/+/231097 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-05-02 20:22:19 +00:00
Brian Kessler	4209a9f65a	math/cmplx: handle special cases Implement special case handling and testing to ensure conformance with the C99 standard annex G.6 Complex arithmetic. Fixes #29320 Change-Id: Id72eb4c5a35d5a54b4b8690d2f7176ab11028f1b Reviewed-on: https://go-review.googlesource.com/c/go/+/220689 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-05-01 03:16:37 +00:00
kakulisen	df2862cf54	math: Add a function example When I browsed the source code, I saw that there is no corresponding example of this function. I am not sure if there is a need for an increase, this is my first time to submit CL. Change-Id: Idbf4e1e1ed2995176a76959d561e152263a2fd26 Reviewed-on: https://go-review.googlesource.com/c/go/+/230741 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-04-30 00:33:38 +00:00
Ruixin(Peter) Bao	a7e9e84716	math/big: simplify hasVX checking on s390x Originally, we use an assembly function that returns a boolean result to tell whether the machine has vector facility or not. It is now no longer needed when we can directly use cpu.S390X.HasVX variable. Change-Id: Ic1dae851982532bcfd9a9453416c112347f21d87 Reviewed-on: https://go-review.googlesource.com/c/go/+/230318 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-27 20:20:53 +00:00
Ruixin Bao	d2f5e4e38c	math: simplify hasVX checking on s390x Originally, we use an assembly function that returns a boolean result to tell whether the machine has vector facility or not. It is now no longer needed when we can directly use cpu.S390X.HasVX variable. Change-Id: Ic3ffeb9e63238ef41406d97cdc42502145ddb454 Reviewed-on: https://go-review.googlesource.com/c/go/+/230319 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-27 20:06:57 +00:00
Tyson Andre	19648622d2	math/cmplx: fix typo in code comment Everywhere else is using "cancellation" as of 2019 The reasoning is mentioned in 170060. > Though there is variation in the spelling of canceled, > cancellation is always spelled with a double l. > > Reference: https://www.grammarly.com/blog/canceled-vs-cancelled/ Change-Id: I933ea68d7251986ce582b92c33b7cb13cee1d207 GitHub-Last-Rev: `fc3d5ada2b` GitHub-Pull-Request: golang/go#38661 Reviewed-on: https://go-review.googlesource.com/c/go/+/230199 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2020-04-25 21:06:35 +00:00
Ruixin(Peter) Bao	3a37fd4010	math/big: rewrite subVW to use fast path on s390x This CL replaces the original subVW implementation with a implementation that uses a similar idea as CL 164968. When we know the borrow bit is zero, we can copy the rest of words as they will not be updated. Also, since we are copying vector of a words, a faster implementation of copy is written in this CL to copy a word or multiple words at a time. Benchmarks: name old time/op new time/op delta SubVW/1-18 4.43ns ± 0% 3.82ns ± 0% -13.85% (p=0.000 n=20+20) SubVW/2-18 5.39ns ± 0% 4.25ns ± 0% -21.23% (p=0.000 n=20+20) SubVW/3-18 6.29ns ± 0% 4.65ns ± 0% -26.07% (p=0.000 n=16+19) SubVW/4-18 6.08ns ± 2% 4.84ns ± 0% -20.43% (p=0.000 n=20+20) SubVW/5-18 7.06ns ± 1% 4.93ns ± 0% -30.18% (p=0.000 n=20+20) SubVW/10-18 10.3ns ± 2% 7.2ns ± 0% -30.35% (p=0.000 n=20+19) SubVW/100-18 48.0ns ± 4% 17.6ns ± 0% -63.32% (p=0.000 n=18+19) SubVW/1000-18 448ns ±10% 236ns ± 1% -47.24% (p=0.000 n=20+20) SubVW/10000-18 4.83µs ± 5% 2.96µs ± 0% -38.73% (p=0.000 n=20+19) SubVW/100000-18 46.6µs ± 3% 30.6µs ± 1% -34.30% (p=0.000 n=20+20) [Geo mean] 56.3ns 37.0ns -34.24% name old speed new speed delta SubVW/1-18 1.80GB/s ± 0% 2.10GB/s ± 0% +16.16% (p=0.000 n=20+20) SubVW/2-18 2.97GB/s ± 0% 3.77GB/s ± 0% +26.95% (p=0.000 n=20+20) SubVW/3-18 3.82GB/s ± 0% 5.16GB/s ± 0% +35.26% (p=0.000 n=20+19) SubVW/4-18 5.26GB/s ± 1% 6.61GB/s ± 0% +25.59% (p=0.000 n=20+20) SubVW/5-18 5.67GB/s ± 1% 8.11GB/s ± 0% +43.12% (p=0.000 n=20+20) SubVW/10-18 7.79GB/s ± 2% 11.17GB/s ± 0% +43.52% (p=0.000 n=20+19) SubVW/100-18 16.7GB/s ± 4% 45.5GB/s ± 0% +172.61% (p=0.000 n=18+20) SubVW/1000-18 17.9GB/s ± 9% 33.9GB/s ± 1% +89.25% (p=0.000 n=20+20) SubVW/10000-18 16.6GB/s ± 5% 27.0GB/s ± 0% +63.08% (p=0.000 n=20+19) SubVW/100000-18 17.2GB/s ± 2% 26.1GB/s ± 1% +52.18% (p=0.000 n=20+20) [Geo mean] 7.25GB/s 11.03GB/s +52.01% Change-Id: I32e99cbab3260054a96231d02b87049c833ab77e Reviewed-on: https://go-review.googlesource.com/c/go/+/227297 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-24 14:50:59 +00:00
Ruixin(Peter) Bao	ee8972cd12	math/big: rewrite addVW to use fast path on s390x Rewrite addVW to use a fast path and remove the original vector and non vector implementation of addVW in assembly. This CL uses a similar idea as CL 164968, where we copy the rest of words when we know carry bit is zero. In addition, since we are copying vector of words, a faster implementation of copy is written in this CL to copy a word or multiple words at a time. Benchmarks: name old time/op new time/op delta AddVW/1-18 4.56ns ± 0% 4.01ns ± 6% -12.14% (p=0.000 n=18+20) AddVW/2-18 5.54ns ± 0% 4.42ns ± 5% -20.20% (p=0.000 n=18+20) AddVW/3-18 6.55ns ± 0% 4.61ns ± 0% -29.62% (p=0.000 n=16+18) AddVW/4-18 6.11ns ± 2% 5.12ns ± 6% -16.19% (p=0.000 n=20+20) AddVW/5-18 7.32ns ± 4% 5.14ns ± 0% -29.77% (p=0.000 n=20+19) AddVW/10-18 10.6ns ± 2% 7.2ns ± 1% -31.47% (p=0.000 n=20+20) AddVW/100-18 49.6ns ± 2% 18.0ns ± 0% -63.63% (p=0.000 n=20+20) AddVW/1000-18 465ns ± 3% 244ns ± 0% -47.54% (p=0.000 n=20+20) AddVW/10000-18 4.99µs ± 4% 2.97µs ± 0% -40.54% (p=0.000 n=20+20) AddVW/100000-18 48.3µs ± 3% 30.8µs ± 1% -36.29% (p=0.000 n=20+20) [Geo mean] 58.1ns 38.0ns -34.57% name old speed new speed delta AddVW/1-18 1.76GB/s ± 0% 2.00GB/s ± 6% +14.04% (p=0.000 n=20+20) AddVW/2-18 2.89GB/s ± 0% 3.63GB/s ± 5% +25.55% (p=0.000 n=18+20) AddVW/3-18 3.66GB/s ± 0% 5.21GB/s ± 0% +42.25% (p=0.000 n=18+19) AddVW/4-18 5.24GB/s ± 2% 6.27GB/s ± 6% +19.61% (p=0.000 n=20+20) AddVW/5-18 5.47GB/s ± 4% 7.78GB/s ± 0% +42.28% (p=0.000 n=20+18) AddVW/10-18 7.55GB/s ± 2% 11.04GB/s ± 1% +46.09% (p=0.000 n=20+20) AddVW/100-18 16.1GB/s ± 2% 44.3GB/s ± 0% +174.77% (p=0.000 n=20+20) AddVW/1000-18 17.2GB/s ± 3% 32.8GB/s ± 1% +90.58% (p=0.000 n=20+20) AddVW/10000-18 16.0GB/s ± 4% 26.9GB/s ± 0% +68.11% (p=0.000 n=20+20) AddVW/100000-18 16.6GB/s ± 3% 26.0GB/s ± 1% +56.94% (p=0.000 n=20+20) [Geo mean] 7.03GB/s 10.75GB/s +52.93% Change-Id: Idbb73f3178311bd2b18a93bdc1e48f26869d2f6a Reviewed-on: https://go-review.googlesource.com/c/go/+/209679 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-24 13:26:34 +00:00
Michael Munday	ab7a65f283	cmd/compile: clean up codegen for branch-on-carry on s390x This CL optimizes code that uses a carry from a function such as bits.Add64 as the condition in an if statement. For example: x, c := bits.Add64(a, b, 0) if c != 0 { panic("overflow") } Rather than converting the carry into a 0 or a 1 value and using that as an input to a comparison instruction the carry flag is now used as the input to a conditional branch directly. This typically removes an ADD LOGICAL WITH CARRY instruction when user code is doing overflow detection and is closer to the code that a user would expect to generate. Change-Id: I950431270955ab72f1b5c6db873b6abe769be0da Reviewed-on: https://go-review.googlesource.com/c/go/+/219757 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2020-04-22 20:11:06 +00:00
Ruixin(Peter) Bao	0329c915a0	math/big: clean up whitespace in arith_s390x.s file This CL looks big but it only does formatting changes to arith_s390x.s. The file was formatted using asmfmt(https://github.com/klauspost/asmfmt) , so there should not be any functional impact. I verified that the generated assembly of big.test file is identical. Change-Id: I8b4035ef082a4d0357881869327e25253f2d8be1 Reviewed-on: https://go-review.googlesource.com/c/go/+/229302 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-22 15:40:55 +00:00
Alberto Donizetti	813f8eae27	math/big: remove Direct Sqrt computation The Float.Sqrt method switches (for performance reasons) between direct (uses Quo) and inverse (doesn't) computation, depending on the precision, with threshold 128. Unfortunately the implementation of recursive division in CL 172018 made Quo slightly slower exactly in the range around and below the threshold Sqrt is using, so this strategy is no longer profitable. The new division algorithm allocates more, and this has increased the amount of allocations performed by Sqrt when using the direct method; on low precisions the computation is fast, so additional allocations have an negative impact on performance. Interestingly, only using the inverse method doesn't just reverse the effects of the Quo algorithm change, but it seems to make performances better overall for small precisions: name old time/op new time/op delta FloatSqrt/64-4 643ns ± 1% 635ns ± 1% -1.24% (p=0.000 n=10+10) FloatSqrt/128-4 1.44µs ± 1% 1.02µs ± 1% -29.25% (p=0.000 n=10+10) FloatSqrt/256-4 1.49µs ± 1% 1.49µs ± 1% ~ (p=0.752 n=10+10) FloatSqrt/1000-4 3.71µs ± 1% 3.74µs ± 1% +0.87% (p=0.001 n=10+10) FloatSqrt/10000-4 35.3µs ± 1% 35.6µs ± 1% +0.82% (p=0.002 n=10+9) FloatSqrt/100000-4 844µs ± 1% 844µs ± 0% ~ (p=0.549 n=10+9) FloatSqrt/1000000-4 69.5ms ± 0% 69.6ms ± 0% ~ (p=0.222 n=9+9) name old alloc/op new alloc/op delta FloatSqrt/64-4 280B ± 0% 200B ± 0% -28.57% (p=0.000 n=10+10) FloatSqrt/128-4 504B ± 0% 248B ± 0% -50.79% (p=0.000 n=10+10) FloatSqrt/256-4 344B ± 0% 344B ± 0% ~ (all equal) FloatSqrt/1000-4 1.30kB ± 0% 1.30kB ± 0% ~ (all equal) FloatSqrt/10000-4 13.5kB ± 0% 13.5kB ± 0% ~ (p=0.237 n=10+10) FloatSqrt/100000-4 123kB ± 0% 123kB ± 0% ~ (p=0.247 n=10+10) FloatSqrt/1000000-4 1.83MB ± 1% 1.83MB ± 3% ~ (p=0.779 n=8+10) name old allocs/op new allocs/op delta FloatSqrt/64-4 8.00 ± 0% 5.00 ± 0% -37.50% (p=0.000 n=10+10) FloatSqrt/128-4 11.0 ± 0% 5.0 ± 0% -54.55% (p=0.000 n=10+10) FloatSqrt/256-4 5.00 ± 0% 5.00 ± 0% ~ (all equal) FloatSqrt/1000-4 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/10000-4 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/100000-4 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/1000000-4 10.3 ±13% 10.3 ±13% ~ (p=1.000 n=10+10) For example, 1.02µs for FloatSqrt/128 is actually better than what I was getting on the same machine before the Quo changes. The .8% slowdown on /1000 and /10000 appears to be real and it is quite baffling (that codepath was not touched at all); it may be caused by code alignment changes. Change-Id: Ib03761cdc1055674bc7526d4f3a23d7a25094029 Reviewed-on: https://go-review.googlesource.com/c/go/+/228062 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-04-15 16:37:53 +00:00
Brad Fitzpatrick	8f53fad035	math/big: add test that linker is able to remove unused code (Follow-up to CL 228108.) Change-Id: Ia6d119ee19c7aa923cdeead06d3cee87a1751105 Reviewed-on: https://go-review.googlesource.com/c/go/+/228109 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-04-15 03:25:21 +00:00
Hanjun Kim	5a447c0ae9	math/big: fix typo in documentation for Int.Exp Fixes #38304 Also change `If m > 0, y < 0, ...` to `If m != 0, y < 0, ...` since `Exp` will return `nil` whatever `m`'s sign is. Change-Id: I17d7337ccd1404318cea5d42a8de904ad185fd00 GitHub-Last-Rev: `2399510300` GitHub-Pull-Request: golang/go#38390 Reviewed-on: https://go-review.googlesource.com/c/go/+/228000 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-04-15 00:32:18 +00:00
Brad Fitzpatrick	a55645fa34	math/big: don't use Float in init to help linker discard 162 KiB Removes 162 KiB from binaries that don't use math/big.Float: -rwxr-xr-x 1 bradfitz bradfitz 1916590 Apr 14 12:21 x.after -rwxr-xr-x 1 bradfitz bradfitz 2082575 Apr 14 12:21 x.before No change in deps (this package already used sync). No change in benchmarks: name old time/op new time/op delta FloatSqrt/64-8 1.06µs ±10% 1.03µs ± 6% ~ (p=0.133 n=10+9) FloatSqrt/128-8 2.26µs ± 9% 2.28µs ± 9% ~ (p=0.460 n=10+8) FloatSqrt/256-8 2.29µs ± 5% 2.31µs ± 3% ~ (p=0.214 n=9+9) FloatSqrt/1000-8 5.82µs ± 3% 5.87µs ± 7% ~ (p=0.666 n=9+9) FloatSqrt/10000-8 56.4µs ± 5% 57.0µs ± 6% ~ (p=0.436 n=10+10) FloatSqrt/100000-8 1.34ms ± 8% 1.31ms ± 3% ~ (p=0.447 n=10+9) FloatSqrt/1000000-8 106ms ± 5% 107ms ± 7% ~ (p=0.315 n=10+10) name old alloc/op new alloc/op delta FloatSqrt/64-8 280B ± 0% 280B ± 0% ~ (all equal) FloatSqrt/128-8 504B ± 0% 504B ± 0% ~ (all equal) FloatSqrt/256-8 344B ± 0% 344B ± 0% ~ (all equal) FloatSqrt/1000-8 1.30kB ± 0% 1.30kB ± 0% ~ (all equal) FloatSqrt/10000-8 13.5kB ± 0% 13.5kB ± 0% ~ (p=0.403 n=10+10) FloatSqrt/100000-8 123kB ± 0% 123kB ± 0% ~ (p=0.393 n=10+10) FloatSqrt/1000000-8 1.84MB ± 7% 1.84MB ± 5% ~ (p=0.739 n=10+10) name old allocs/op new allocs/op delta FloatSqrt/64-8 8.00 ± 0% 8.00 ± 0% ~ (all equal) FloatSqrt/128-8 11.0 ± 0% 11.0 ± 0% ~ (all equal) FloatSqrt/256-8 5.00 ± 0% 5.00 ± 0% ~ (all equal) FloatSqrt/1000-8 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/10000-8 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/100000-8 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/1000000-8 10.9 ±10% 10.8 ±17% ~ (p=0.974 n=10+10) Change-Id: I3337f1f531bf7b4fae192b9d90cd24ff2be14fea Reviewed-on: https://go-review.googlesource.com/c/go/+/228108 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-14 20:50:19 +00:00
Rémy Oudompheng	ac1fd419b6	math/big: correct off-by-one access in divBasic The divBasic function computes the quotient of big nats u/v word by word. It estimates each word qhat by performing a long division (top 2 words of u divided by top word of v), looks at the next word to correct the estimate, then perform a full multiplication (qhatv) to catch any inaccuracy in the estimate. In the latter case, "negative" values appear temporarily and carries must be carefully managed, and the recursive division refactoring introduced a case where qhatv has the same length as v, triggering an out-of-bounds write in the case it happens when computing the top word of the quotient. Fixes #37499 Change-Id: I15089da4a4027beda43af497bf6de261eb792f94 Reviewed-on: https://go-review.googlesource.com/c/go/+/221980 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-04-08 20:53:58 +00:00
Brian Kessler	6b6414cab4	math: correct Atan2(±y,+∞) = ±0 on s390x The s390x assembly implementation was previously only handling this case correctly for x = -Pi. Update the special case handling for any y. Fixes #35446 Change-Id: I355575e9ec8c7ce8bd9db10d74f42a22f39a2f38 Reviewed-on: https://go-review.googlesource.com/c/go/+/223420 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Munday <mike.munday@ibm.com> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-03-25 04:06:34 +00:00
Alberto Donizetti	c5058652fd	math/big: document that Sqrt doesn't set Accuracy Document that the Float.Sqrt method does not set the receiver's Accuracy field. Updates #37915 Change-Id: Ief1dcac07eacc0ef02f86bfac9044501477bca1c Reviewed-on: https://go-review.googlesource.com/c/go/+/224497 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-03-20 19:24:37 +00:00
Brian Kessler	d774d979dd	math/cmplx: disable TanHuge test on s390x s390x has inaccurate range reduction for the assembly routines in math so these tests are diabled until these are corrected. Updates #37854 Change-Id: I1e26acd6d09ae3e592a3dd90aec73a6844f5c6fe Reviewed-on: https://go-review.googlesource.com/c/go/+/223457 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rob Pike <r@golang.org>	2020-03-14 07:03:15 +00:00
Brian Kessler	70dc28f766	math/cmplx: implement Payne-Hanek range reduction Tan has poles along the real axis. In order to accurately calculate the value near these poles, a range reduction by Pi is performed and the result calculated via a Taylor series. The prior implementation of range reduction used Cody-Waite range reduction in three parts. This fails when x is too large to accurately calculate the partial products in the summation accurately. Above this threshold, Payne-Hanek range reduction using a multiple precision value of 1/Pi is required. Additionally, the threshold used in math/trig_reduce.go for Payne-Hanek range reduction was not set conservatively enough. The prior threshold ensured that catastrophic failure did not occur where the argument x would not actually be reduced below Pi/4. However, errors in reduction begin to occur at values much lower when z = ((x - yPI4A) - yPI4B) - yPI4C is not exact because yPI4A cannot be exactly represented as a float64. reduceThreshold is lowered to the proper value. Fixes #31566 Change-Id: I0f39a4171a5be44f64305f18dc57f6c29f19dba7 Reviewed-on: https://go-review.googlesource.com/c/go/+/172838 Reviewed-by: Rob Pike <r@golang.org>	2020-03-14 04:12:41 +00:00
Joel Sing	fe70838598	math/big: initial vector arithmetic in riscv64 assembly Provide an assembly implementation of mulWW - for now all others run the Go code. Change-Id: Icb594c31048255f131bdea8d64f56784fc9db4d1 Reviewed-on: https://go-review.googlesource.com/c/go/+/220919 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-02-25 16:47:02 +00:00
Joel Sing	89f249a40d	math: implement Sqrt in assembly for riscv64 Change-Id: I9a5dc33271434e58335f5562a30cc131c6a8332c Reviewed-on: https://go-review.googlesource.com/c/go/+/220918 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-02-25 16:43:26 +00:00
Filippo Valsorda	88ae4ccefb	math/big: reintroduce pre-Go 1.14 mention in GCD docs It was removed in CL 217302 but was intentionally added in CL 217104. Change-Id: I1a478d80ad1ec4f0a0184bfebf8f1a5e352cfe8c Reviewed-on: https://go-review.googlesource.com/c/go/+/217941 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-02-05 20:54:27 +00:00
Filippo Valsorda	cdb7fd6b06	math/big: simplify GCD docs We don't usually document past behavior (like "As of Go 1.14 ...") and in isolation the current docs made it sound like a and b could only be negative or zero. Change-Id: I0d3c2b8579a9c01159ce528a3128b1478e99042a Reviewed-on: https://go-review.googlesource.com/c/go/+/217302 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2020-01-31 23:32:37 +00:00
Robert Griesemer	9bb40ed8ec	math/big: update comment on Int.GCD Per the suggestion https://golang.org/cl/216200/2/doc/go1.14.html#423. Updates #28878. Change-Id: I654d2d114409624219a0041916f0a4030efc7573 Reviewed-on: https://go-review.googlesource.com/c/go/+/217104 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2020-01-30 20:37:01 +00:00
Joel Sing	5a3a5d3525	math, math/big: add support for riscv64 Based on riscv-go port. Updates #27532 Change-Id: Id8ae7d851c393ec3702e4176c363accb0a42587f Reviewed-on: https://go-review.googlesource.com/c/go/+/204633 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2020-01-15 18:49:52 +00:00
Brad Fitzpatrick	7d1d944626	math/rand: update comment to avoid use of ^ for exponentiation Fixes #35920 Change-Id: I1a4d26c5f7f3fbd4de13fc337de482667d83c47f Reviewed-on: https://go-review.googlesource.com/c/go/+/209758 Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>	2019-12-04 21:14:24 +00:00
Ville Skyttä	440f7d6404	all: fix a bunch of misspellings Change-Id: I5b909df0fd048cd66c5a27fca1b06466d3bcaac7 GitHub-Last-Rev: `778c5d2131` GitHub-Pull-Request: golang/go#35624 Reviewed-on: https://go-review.googlesource.com/c/go/+/207421 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2019-11-15 21:04:43 +00:00
Rémy Oudompheng	7ad27481f8	math/big: fix out-of-bounds panic in divRecursive The bounds in the last carry branch were wrong as there is no reason for len(u) >= n+n/2 to always hold true. We also adjust test to avoid using a remainder of 1 (in which case, the last step of the algorithm computes (qhatv+1) - qhatv which rarely produces a carry). Change-Id: I69fbab9c5e19d0db1c087fbfcd5b89352c2d26fb Reviewed-on: https://go-review.googlesource.com/c/go/+/206839 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-11-13 19:15:27 +00:00
Robert Griesemer	1fe33e3cb2	math/big: ensure correct test input There is a (theoretical, but possible) chance that the random number values a, b used for TestDiv are 0 or 1, in which case the test would fail. This CL makes sure that a >= 1 and b >= 2 at all times. Fixes #35523. Change-Id: I6451feb94241249516a821cd0066e95a0c65b0ed Reviewed-on: https://go-review.googlesource.com/c/go/+/206818 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-11-12 18:52:52 +00:00
Rémy Oudompheng	194ae3236d	math/big: implement recursive algorithm for division The current division algorithm produces one word of result at a time, using 2-word division to compute the top word and mulAddVWW to compute the remainder. The top word may need to be adjusted by 1 or 2 units. The recursive version, based on Burnikel, Ziegler, "Fast Recursive Division", uses the same principles, but in a multi-word setting, so that multiplication benefits from the Karatsuba algorithm (and possibly later improvements). benchmark old ns/op new ns/op delta BenchmarkDiv/20/10-4 38.2 38.3 +0.26% BenchmarkDiv/40/20-4 38.7 38.5 -0.52% BenchmarkDiv/100/50-4 62.5 62.6 +0.16% BenchmarkDiv/200/100-4 238 259 +8.82% BenchmarkDiv/400/200-4 311 338 +8.68% BenchmarkDiv/1000/500-4 604 649 +7.45% BenchmarkDiv/2000/1000-4 1214 1278 +5.27% BenchmarkDiv/20000/10000-4 38279 36510 -4.62% BenchmarkDiv/200000/100000-4 3022057 1359615 -55.01% BenchmarkDiv/2000000/1000000-4 310827664 54012939 -82.62% BenchmarkDiv/20000000/10000000-4 33272829421 1965401359 -94.09% BenchmarkString/10/Base10-4 158 156 -1.27% BenchmarkString/100/Base10-4 797 792 -0.63% BenchmarkString/1000/Base10-4 3677 3814 +3.73% BenchmarkString/10000/Base10-4 16633 17116 +2.90% BenchmarkString/100000/Base10-4 5779029 1793808 -68.96% BenchmarkString/1000000/Base10-4 889840820 85524031 -90.39% BenchmarkString/10000000/Base10-4 134338236860 4935657026 -96.33% Fixes #21960 Updates #30943 Change-Id: I134c6f81a47870c688ca95b6081eb9211def15a2 Reviewed-on: https://go-review.googlesource.com/c/go/+/172018 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-11-12 05:18:25 +00:00
Ian Lance Taylor	f5e89c2214	Revert "math/cmplx: handle special cases" This reverts CL 169501. Reason for revert: The new tests fail at least on s390x and MIPS. This is likely a minor bug in the compiler or runtime. But this point in the release cycle is not the time to debug these details, which are unlikely to be new. Let's try again for 1.15. Updates #29320 Fixes #35443 Change-Id: I2218b2083f8974b57d528e3742524393fc72b355 Reviewed-on: https://go-review.googlesource.com/c/go/+/206037 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-11-08 03:07:30 +00:00
Brian Kessler	68dce4296e	math/cmplx: handle special cases Implement special case handling and testing to ensure conformance with the C99 standard annex G.6 Complex arithmetic. Fixes #29320 Change-Id: Ieb0527191dd7fdea5b1aecb42b9e23aae3f74260 Reviewed-on: https://go-review.googlesource.com/c/go/+/169501 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-11-07 19:48:30 +00:00
Russ Cox	e8f01d591f	math: test portable FMA even on system with hardware FMA This makes it a little less likely the portable FMA will be broken without realizing it. Change-Id: I7f7f4509b35160a9709f8b8a0e494c09ea6e410a Reviewed-on: https://go-review.googlesource.com/c/go/+/205337 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-11-07 14:53:38 +00:00
Russ Cox	543c6d2e0d	math, cmd/compile: rename Fma to FMA This API was added for #25819, where it was discussed as math.FMA. The commit adding it used math.Fma, presumably for consistency with the rest of the unusual names in package math (Sincos, Acosh, Erfcinv, Float32bits, etc). I believe that using an idiomatic Go name is more important here than consistency with these other names, most of which are historical baggage from C's standard library. Early additions like Float32frombits happened before "uppercase for export" (so they were originally like "float32frombits") and they were not properly reconsidered when we uppercased the symbols to export them. That's a mistake we live with. The names of functions we have added since then, and even a few that were legacy, are more properly Go-cased, such as IsNaN, IsInf, and RoundToEven, rather than Isnan, Isinf, and Roundtoeven. And also constants like MaxFloat32. For new API, we should keep using proper Go-cased symbols instead of minimally-upper-cased-C symbols. So math.FMA, not math.Fma. This API has not yet been released, so this change does not break the compatibility promise. This CL also modifies cmd/compile, since the compiler knows the name of the function. I could have stopped at changing the string constants, but it seemed to make more sense to use a consistent casing everywhere. Change-Id: I0f6f3407f41e99bfa8239467345c33945088896e Reviewed-on: https://go-review.googlesource.com/c/go/+/205317 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-11-07 14:51:06 +00:00
Brian Kessler	f5949b6067	math/big: allow all values for GCD Allow the inputs a and b to be zero or negative to GCD with the following definitions. If x or y are not nil, GCD sets their value such that z = ax + by. Regardless of the signs of a and b, z is always >= 0. If a == b == 0, GCD sets z = x = y = 0. If a == 0 and b != 0, GCD sets z = \|b\|, x = 0, y = sign(b) * 1. If a != 0 and b == 0, GCD sets z = \|a\|, x = sign(a) * 1, y = 0. Fixes #28878 Change-Id: Ia83fce66912a96545c95cd8df0549bfd852652f3 Reviewed-on: https://go-review.googlesource.com/c/go/+/164972 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-11-07 06:58:44 +00:00
Rémy Oudompheng	8f30d25168	math/big: use nat pool to reduce allocations in mul and sqr This notably allows to reuse temporaries across the karatsubaSqr recursion. benchmark old ns/op new ns/op delta BenchmarkNatMul/10-4 227 228 +0.44% BenchmarkNatMul/100-4 8339 8589 +3.00% BenchmarkNatMul/1000-4 313796 312272 -0.49% BenchmarkNatMul/10000-4 11924720 11873589 -0.43% BenchmarkNatMul/100000-4 503813354 503839058 +0.01% BenchmarkNatSqr/20-4 549 513 -6.56% BenchmarkNatSqr/30-4 945 874 -7.51% BenchmarkNatSqr/50-4 1993 1832 -8.08% BenchmarkNatSqr/80-4 4096 3874 -5.42% BenchmarkNatSqr/100-4 6192 5712 -7.75% BenchmarkNatSqr/200-4 20388 19543 -4.14% BenchmarkNatSqr/300-4 38735 36715 -5.21% BenchmarkNatSqr/500-4 99562 93542 -6.05% BenchmarkNatSqr/800-4 195554 184907 -5.44% BenchmarkNatSqr/1000-4 286302 275053 -3.93% BenchmarkNatSqr/10000-4 9817057 9441641 -3.82% BenchmarkNatSqr/100000-4 390713416 379696789 -2.82% benchmark old allocs new allocs delta BenchmarkNatMul/10-4 1 1 +0.00% BenchmarkNatMul/100-4 1 1 +0.00% BenchmarkNatMul/1000-4 2 1 -50.00% BenchmarkNatMul/10000-4 2 1 -50.00% BenchmarkNatMul/100000-4 9 11 +22.22% BenchmarkNatSqr/20-4 2 1 -50.00% BenchmarkNatSqr/30-4 2 1 -50.00% BenchmarkNatSqr/50-4 2 1 -50.00% BenchmarkNatSqr/80-4 2 1 -50.00% BenchmarkNatSqr/100-4 2 1 -50.00% BenchmarkNatSqr/200-4 2 1 -50.00% BenchmarkNatSqr/300-4 4 1 -75.00% BenchmarkNatSqr/500-4 4 1 -75.00% BenchmarkNatSqr/800-4 10 1 -90.00% BenchmarkNatSqr/1000-4 10 1 -90.00% BenchmarkNatSqr/10000-4 731 1 -99.86% BenchmarkNatSqr/100000-4 19687 6 -99.97% benchmark old bytes new bytes delta BenchmarkNatMul/10-4 192 192 +0.00% BenchmarkNatMul/100-4 4864 4864 +0.00% BenchmarkNatMul/1000-4 57344 49224 -14.16% BenchmarkNatMul/10000-4 565248 498772 -11.76% BenchmarkNatMul/100000-4 5749504 7263720 +26.34% BenchmarkNatSqr/20-4 672 352 -47.62% BenchmarkNatSqr/30-4 992 512 -48.39% BenchmarkNatSqr/50-4 1792 896 -50.00% BenchmarkNatSqr/80-4 2688 1408 -47.62% BenchmarkNatSqr/100-4 3584 1792 -50.00% BenchmarkNatSqr/200-4 6656 3456 -48.08% BenchmarkNatSqr/300-4 24448 16387 -32.97% BenchmarkNatSqr/500-4 36864 24591 -33.29% BenchmarkNatSqr/800-4 69760 40981 -41.25% BenchmarkNatSqr/1000-4 86016 49180 -42.82% BenchmarkNatSqr/10000-4 2524800 487368 -80.70% BenchmarkNatSqr/100000-4 68599808 5876581 -91.43% Change-Id: I8e6e409ae1cb48be9d5aa9b5f428d6cbe487673a Reviewed-on: https://go-review.googlesource.com/c/go/+/172017 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-10-25 03:14:39 +00:00
Robert Griesemer	a52c0a1992	math/big: make Rat.Denom side-effect free A Rat is represented via a quotient a/b where a and b are Int values. To make it possible to use an uninitialized Rat value (with a and b uninitialized and thus == 0), the implementation treats a 0 denominator as 1. Rat.Num and Rat.Denom return pointers to these values a and b. Because b may be 0, Rat.Denom used to first initialize it to 1 and thus produce an undesirable side-effect (by changing the Rat's denominator). This CL changes Denom to return a new (not shared) *Int with value 1 in the rare case where the Rat was not initialized. This eliminates the side effect and returns the correct denominator value. While this is changing behavior of the API, the impact should now be minor because together with (prior) CL https://golang.org/cl/202997, which initializes Rats ASAP, Denom is unlikely used to access the denominator of an uninitialized (and thus 0) Rat. Any operation that will somehow set a Rat value will ensure that the denominator is not 0. Fixes #33792. Updates #3521. Change-Id: I0bf15ac60513cf52162bfb62440817ba36f0c3fc Reviewed-on: https://go-review.googlesource.com/c/go/+/203059 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-10-24 03:34:24 +00:00
Robert Griesemer	4412181e7c	math/big: normalize unitialized denominators ASAP A Rat is represented via a quotient a/b where a and b are Int values. To make it possible to use an uninitialized Rat value (with a and b uninitialized and thus == 0), the implementation treats a 0 denominator as 1. For each operation we check if the denominator is 0, and then treat it as 1 (if necessary). Operations that create a new Rat result, normalize that value such that a result denominator 1 is represened as 0 again. This CL changes this behavior slightly: 0 denominators are still interpreted as 1, but whenever we (safely) can, we set an uninitialized 0 denominator to 1. This simplifies the code overall. Also: Improved some doc strings. Preparation for addressing issue #33792. Updates #33792. Change-Id: I3040587c8d0dad2e840022f96ca027d8470878a0 Reviewed-on: https://go-review.googlesource.com/c/go/+/202997 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-10-24 03:34:11 +00:00
Akhil Indurti	93a601dd2a	math: add guaranteed-precision FMA implementation Currently, the precision of the float64 multiply-add operation (x * y) + z varies across architectures. While generated code for ppc64, s390x, and arm64 can guarantee that there is no intermediate rounding on those platforms, other architectures like x86, mips, and arm will exhibit different behavior depending on available instruction set. Consequently, applications cannot rely on results being identical across GOARCH-dependent codepaths. This CL introduces a software implementation that performs an IEEE 754 double-precision fused-multiply-add operation. The only supported rounding mode is round-to-nearest ties-to-even. Separate CLs include hardware implementations when available. Otherwise, this software fallback is given as the default implementation. Specifically, - arm64, ppc64, s390x: Uses the FMA instruction provided by all of these ISAs. - mips[64][le]: Falls back to this software implementation. Only release 6 of the ISA includes a strict FMA instruction with MADDF.D (not implementation defined). Because the number of R6 processors in the wild is scarce, the assembly implementation is left as a future optimization. - x86: Guards the use of VFMADD213SD by checking cpu.X86.HasFMA. - arm: Guards the use of VFMA by checking cpu.ARM.HasVFPv4. - software fallback: Uses mostly integer arithmetic except for input that involves Inf, NaN, or zero. Updates #25819. Change-Id: Iadadff2219638bacc9fec78d3ab885393fea4a08 Reviewed-on: https://go-review.googlesource.com/c/go/+/127458 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-21 16:15:30 +00:00

1 2 3 4 5 ...

556 Commits