qbit/go - go - Tape:neT

qbit/go

mirror of https://github.com/golang/go synced 2024-11-14 23:20:27 -07:00

Author	SHA1	Message	Date
Vladimir Kuzmin	7395083136	cmd/compile: avoid extra mapaccess in "m[k] op= r" Currently, order desugars map assignment operations like m[k] op= r into m[k] = m[k] op r which in turn is transformed during walk into: tmp := mapaccess(m, k) tmp = tmp op r mapassign(m, k) = tmp However, this is suboptimal, as we could instead produce just: mapassign(m, k) op= r One complication though is if "r == 0", then "m[k] /= r" and "m[k] %= r" will panic, and they need to do so before* calling mapassign, otherwise we may insert a new zero-value element into the map. It would be spec compliant to just emit the "r != 0" check before calling mapassign (see #23735), but currently these checks aren't generated until SSA construction. For now, it's simpler to continue desugaring /= and %= into two map indexing operations. Fixes #23661. Change-Id: I46e3739d9adef10e92b46fdd78b88d5aabe68952 Reviewed-on: https://go-review.googlesource.com/91557 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-03-12 19:27:44 +00:00
isharipo	85a8d25d53	cmd/compile/internal/ssa: emit IMUL3{L/Q} for MUL{L/Q}const on x86 cmd/asm now supports three-operand form of IMUL, so instead of using IMUL with resultInArg0, emit IMUL3 instruction. This results in less redundant MOVs where SSA assigns different registers to input[0] and dst arguments. Note: these have exactly the same encoding when reg0=reg1: IMUL3x $const, reg0, reg1 IMULx $const, reg Two-operand IMULx is like a crippled IMUL3x, with dst fixed to input[0]. This is why we don't bother to generate IMULx for the case where dst is the same as input[0]. Change-Id: I4becda475b3dffdd07b6fdf1c75bacc82af654e4 Reviewed-on: https://go-review.googlesource.com/99656 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-12 19:02:36 +00:00
Giovanni Bajo	080187f4f7	cmd/compile: implement CMOV on amd64 This builds upon the branchelim pass, activating it for amd64 and lowering CondSelect. Special care is made to FPU instructions for NaN handling. Benchmark results on Xeon E5630 (Westmere EP): name old time/op new time/op delta BinaryTree17-16 4.99s ± 9% 4.66s ± 2% ~ (p=0.095 n=5+5) Fannkuch11-16 4.93s ± 3% 5.04s ± 2% ~ (p=0.548 n=5+5) FmtFprintfEmpty-16 58.8ns ± 7% 61.4ns ±14% ~ (p=0.579 n=5+5) FmtFprintfString-16 114ns ± 2% 114ns ± 4% ~ (p=0.603 n=5+5) FmtFprintfInt-16 181ns ± 4% 125ns ± 3% -30.90% (p=0.008 n=5+5) FmtFprintfIntInt-16 263ns ± 2% 217ns ± 2% -17.34% (p=0.008 n=5+5) FmtFprintfPrefixedInt-16 230ns ± 1% 212ns ± 1% -7.99% (p=0.008 n=5+5) FmtFprintfFloat-16 411ns ± 3% 344ns ± 5% -16.43% (p=0.008 n=5+5) FmtManyArgs-16 828ns ± 4% 790ns ± 2% -4.59% (p=0.032 n=5+5) GobDecode-16 10.9ms ± 4% 10.8ms ± 5% ~ (p=0.548 n=5+5) GobEncode-16 9.52ms ± 5% 9.46ms ± 2% ~ (p=1.000 n=5+5) Gzip-16 334ms ± 2% 337ms ± 2% ~ (p=0.548 n=5+5) Gunzip-16 64.4ms ± 1% 65.0ms ± 1% +1.00% (p=0.008 n=5+5) HTTPClientServer-16 156µs ± 3% 155µs ± 3% ~ (p=0.690 n=5+5) JSONEncode-16 21.0ms ± 1% 21.8ms ± 0% +3.76% (p=0.016 n=5+4) JSONDecode-16 95.1ms ± 0% 95.7ms ± 1% ~ (p=0.151 n=5+5) Mandelbrot200-16 6.38ms ± 1% 6.42ms ± 1% ~ (p=0.095 n=5+5) GoParse-16 5.47ms ± 2% 5.36ms ± 1% -1.95% (p=0.016 n=5+5) RegexpMatchEasy0_32-16 111ns ± 1% 111ns ± 1% ~ (p=0.635 n=5+4) RegexpMatchEasy0_1K-16 408ns ± 1% 411ns ± 2% ~ (p=0.087 n=5+5) RegexpMatchEasy1_32-16 103ns ± 1% 104ns ± 1% ~ (p=0.484 n=5+5) RegexpMatchEasy1_1K-16 659ns ± 2% 652ns ± 1% ~ (p=0.571 n=5+5) RegexpMatchMedium_32-16 176ns ± 2% 174ns ± 1% ~ (p=0.476 n=5+5) RegexpMatchMedium_1K-16 58.6µs ± 4% 57.7µs ± 4% ~ (p=0.548 n=5+5) RegexpMatchHard_32-16 3.07µs ± 3% 3.04µs ± 4% ~ (p=0.421 n=5+5) RegexpMatchHard_1K-16 89.2µs ± 1% 87.9µs ± 2% -1.52% (p=0.032 n=5+5) Revcomp-16 575ms ± 0% 587ms ± 2% +2.12% (p=0.032 n=4+5) Template-16 110ms ± 1% 107ms ± 3% -3.00% (p=0.032 n=5+5) TimeParse-16 463ns ± 0% 462ns ± 0% ~ (p=0.810 n=5+4) TimeFormat-16 538ns ± 0% 535ns ± 0% -0.63% (p=0.024 n=5+5) name old speed new speed delta GobDecode-16 70.7MB/s ± 4% 71.4MB/s ± 5% ~ (p=0.452 n=5+5) GobEncode-16 80.7MB/s ± 5% 81.2MB/s ± 2% ~ (p=1.000 n=5+5) Gzip-16 58.2MB/s ± 2% 57.7MB/s ± 2% ~ (p=0.452 n=5+5) Gunzip-16 302MB/s ± 1% 299MB/s ± 1% -0.99% (p=0.008 n=5+5) JSONEncode-16 92.4MB/s ± 1% 89.1MB/s ± 0% -3.63% (p=0.016 n=5+4) JSONDecode-16 20.4MB/s ± 0% 20.3MB/s ± 1% ~ (p=0.135 n=5+5) GoParse-16 10.6MB/s ± 2% 10.8MB/s ± 1% +2.00% (p=0.016 n=5+5) RegexpMatchEasy0_32-16 286MB/s ± 1% 285MB/s ± 3% ~ (p=1.000 n=5+5) RegexpMatchEasy0_1K-16 2.51GB/s ± 1% 2.49GB/s ± 2% ~ (p=0.095 n=5+5) RegexpMatchEasy1_32-16 309MB/s ± 1% 307MB/s ± 1% ~ (p=0.548 n=5+5) RegexpMatchEasy1_1K-16 1.55GB/s ± 2% 1.57GB/s ± 1% ~ (p=0.690 n=5+5) RegexpMatchMedium_32-16 5.68MB/s ± 2% 5.73MB/s ± 1% ~ (p=0.579 n=5+5) RegexpMatchMedium_1K-16 17.5MB/s ± 4% 17.8MB/s ± 4% ~ (p=0.500 n=5+5) RegexpMatchHard_32-16 10.4MB/s ± 3% 10.5MB/s ± 4% ~ (p=0.460 n=5+5) RegexpMatchHard_1K-16 11.5MB/s ± 1% 11.7MB/s ± 2% +1.57% (p=0.032 n=5+5) Revcomp-16 442MB/s ± 0% 433MB/s ± 2% -2.05% (p=0.032 n=4+5) Template-16 17.7MB/s ± 1% 18.2MB/s ± 3% +3.12% (p=0.032 n=5+5) Change-Id: Ic7cb7374d07da031e771bdcbfdd832fd1b17159c Reviewed-on: https://go-review.googlesource.com/98695 Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>	2018-03-12 18:01:33 +00:00
fanzha02	fdf5aaf555	cmd/asm: fix ARM64 vector register arrangement encoding bug The current code assigns vector register arrangement a wrong value when the arrangement specifier is S2, which causes the incorrect assembly. The patch fixes the issue and adds the test cases. Fixes #24249 Change-Id: I9736df1279494003d0b178da1af9cee9cd85ce21 Reviewed-on: https://go-review.googlesource.com/98555 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-12 15:05:53 +00:00
erifan01	140bfe9cfe	math/big: optimize shlVU and shrVU on arm64 This CL implements shlVU and shrVU with arm64 HW instructions "LDP" and "STP" to reduce load cost, it also removes unnecessary checks on the number of shifts for better performance. Benchmarks: name old time/op new time/op delta AddVV/1-8 21.6ns ± 1% 21.6ns ± 1% ~ (p=0.683 n=5+5) AddVV/2-8 13.5ns ± 0% 13.5ns ± 0% ~ (all equal) AddVV/3-8 15.5ns ± 0% 15.5ns ± 0% ~ (all equal) AddVV/4-8 17.5ns ± 0% 17.5ns ± 0% ~ (all equal) AddVV/5-8 19.5ns ± 0% 19.5ns ± 0% ~ (all equal) AddVV/10-8 29.5ns ± 0% 29.5ns ± 0% ~ (all equal) AddVV/100-8 217ns ± 0% 217ns ± 0% ~ (all equal) AddVV/1000-8 2.02µs ± 0% 2.03µs ± 0% +0.73% (p=0.008 n=5+5) AddVV/10000-8 20.5µs ± 0% 20.5µs ± 0% -0.01% (p=0.008 n=5+5) AddVV/100000-8 246µs ± 5% 250µs ± 4% ~ (p=0.548 n=5+5) AddVW/1-8 9.26ns ± 0% 9.32ns ± 0% +0.65% (p=0.016 n=4+5) AddVW/2-8 19.8ns ± 3% 19.8ns ± 0% ~ (p=0.143 n=5+5) AddVW/3-8 11.5ns ± 0% 11.5ns ± 0% ~ (all equal) AddVW/4-8 13.0ns ± 0% 13.0ns ± 0% ~ (all equal) AddVW/5-8 14.5ns ± 0% 14.5ns ± 0% ~ (all equal) AddVW/10-8 22.0ns ± 0% 22.0ns ± 0% ~ (all equal) AddVW/100-8 167ns ± 0% 166ns ± 0% -0.60% (p=0.000 n=5+4) AddVW/1000-8 1.52µs ± 0% 1.52µs ± 0% ~ (all equal) AddVW/10000-8 15.1µs ± 0% 15.1µs ± 0% +0.01% (p=0.008 n=5+5) AddVW/100000-8 163µs ± 4% 153µs ± 3% -5.97% (p=0.016 n=5+5) AddMulVVW/1-8 32.4ns ± 1% 33.0ns ± 1% +1.73% (p=0.040 n=5+5) AddMulVVW/2-8 56.4ns ± 2% 55.9ns ± 1% ~ (p=0.135 n=5+5) AddMulVVW/3-8 85.4ns ± 1% 85.1ns ± 0% ~ (p=0.079 n=5+5) AddMulVVW/4-8 129ns ± 1% 129ns ± 0% ~ (p=0.397 n=5+5) AddMulVVW/5-8 148ns ± 0% 148ns ± 0% ~ (all equal) AddMulVVW/10-8 270ns ± 0% 268ns ± 0% -0.74% (p=0.029 n=4+4) AddMulVVW/100-8 2.75µs ± 0% 2.75µs ± 0% -0.09% (p=0.008 n=5+5) AddMulVVW/1000-8 26.0µs ± 0% 26.0µs ± 0% -0.06% (p=0.024 n=5+5) AddMulVVW/10000-8 312µs ± 0% 312µs ± 0% -0.09% (p=0.008 n=5+5) AddMulVVW/100000-8 2.89ms ± 0% 2.89ms ± 0% +0.14% (p=0.016 n=5+5) DecimalConversion-8 315µs ± 1% 312µs ± 0% ~ (p=0.095 n=5+5) FloatString/100-8 2.56µs ± 1% 2.52µs ± 1% -1.31% (p=0.016 n=5+5) FloatString/1000-8 58.6µs ± 0% 58.2µs ± 0% -0.75% (p=0.008 n=5+5) FloatString/10000-8 4.59ms ± 0% 4.59ms ± 0% ~ (p=0.056 n=5+5) FloatString/100000-8 446ms ± 0% 446ms ± 0% -0.04% (p=0.008 n=5+5) FloatAdd/10-8 184ns ± 0% 178ns ± 0% -3.48% (p=0.008 n=5+5) FloatAdd/100-8 189ns ± 3% 178ns ± 2% -6.02% (p=0.008 n=5+5) FloatAdd/1000-8 371ns ± 0% 267ns ± 0% -27.99% (p=0.000 n=5+4) FloatAdd/10000-8 1.87µs ± 0% 1.03µs ± 0% -44.74% (p=0.008 n=5+5) FloatAdd/100000-8 17.1µs ± 0% 8.8µs ± 0% -48.71% (p=0.016 n=5+4) FloatSub/10-8 148ns ± 0% 146ns ± 0% -1.35% (p=0.000 n=5+4) FloatSub/100-8 148ns ± 0% 140ns ± 0% -5.41% (p=0.008 n=5+5) FloatSub/1000-8 242ns ± 0% 191ns ± 0% -21.24% (p=0.008 n=5+5) FloatSub/10000-8 1.07µs ± 0% 0.64µs ± 1% -39.89% (p=0.016 n=4+5) FloatSub/100000-8 9.48µs ± 0% 5.32µs ± 0% -43.87% (p=0.008 n=5+5) ParseFloatSmallExp-8 29.3µs ± 3% 28.6µs ± 1% ~ (p=0.310 n=5+5) ParseFloatLargeExp-8 125µs ± 1% 123µs ± 0% -1.99% (p=0.008 n=5+5) GCD10x10/WithoutXY-8 278ns ± 4% 289ns ± 5% +3.96% (p=0.040 n=5+5) GCD10x10/WithXY-8 2.12µs ± 2% 2.15µs ± 2% ~ (p=0.095 n=5+5) GCD10x100/WithoutXY-8 615ns ± 1% 629ns ± 5% ~ (p=0.135 n=5+5) GCD10x100/WithXY-8 3.42µs ± 1% 3.53µs ± 2% +3.38% (p=0.008 n=5+5) GCD10x1000/WithoutXY-8 1.39µs ± 1% 1.38µs ± 1% ~ (p=0.460 n=5+5) GCD10x1000/WithXY-8 7.47µs ± 2% 7.49µs ± 3% ~ (p=1.000 n=5+5) GCD10x10000/WithoutXY-8 8.71µs ± 1% 8.71µs ± 0% ~ (p=0.841 n=5+5) GCD10x10000/WithXY-8 28.4µs ± 2% 27.2µs ± 2% -4.24% (p=0.008 n=5+5) GCD10x100000/WithoutXY-8 78.9µs ± 1% 79.1µs ± 0% ~ (p=0.222 n=5+5) GCD10x100000/WithXY-8 240µs ± 1% 228µs ± 1% -4.98% (p=0.008 n=5+5) GCD100x100/WithoutXY-8 1.87µs ± 2% 1.89µs ± 1% ~ (p=0.095 n=5+5) GCD100x100/WithXY-8 26.6µs ± 1% 26.3µs ± 0% -1.14% (p=0.032 n=5+5) GCD100x1000/WithoutXY-8 4.44µs ± 2% 4.47µs ± 2% ~ (p=0.444 n=5+5) GCD100x1000/WithXY-8 36.7µs ± 1% 36.0µs ± 1% -1.96% (p=0.008 n=5+5) GCD100x10000/WithoutXY-8 22.8µs ± 1% 22.3µs ± 1% -2.52% (p=0.008 n=5+5) GCD100x10000/WithXY-8 145µs ± 1% 142µs ± 0% -1.88% (p=0.008 n=5+5) GCD100x100000/WithoutXY-8 198µs ± 1% 190µs ± 0% -4.06% (p=0.008 n=5+5) GCD100x100000/WithXY-8 1.11ms ± 0% 1.09ms ± 0% -1.87% (p=0.008 n=5+5) GCD1000x1000/WithoutXY-8 25.4µs ± 1% 25.0µs ± 1% -1.34% (p=0.008 n=5+5) GCD1000x1000/WithXY-8 515µs ± 1% 485µs ± 0% -5.85% (p=0.008 n=5+5) GCD1000x10000/WithoutXY-8 57.3µs ± 1% 56.2µs ± 1% -1.95% (p=0.008 n=5+5) GCD1000x10000/WithXY-8 1.21ms ± 0% 1.18ms ± 0% -2.65% (p=0.008 n=5+5) GCD1000x100000/WithoutXY-8 358µs ± 0% 352µs ± 1% -1.71% (p=0.008 n=5+5) GCD1000x100000/WithXY-8 8.72ms ± 0% 8.66ms ± 0% -0.71% (p=0.008 n=5+5) GCD10000x10000/WithoutXY-8 690µs ± 0% 687µs ± 1% ~ (p=0.095 n=5+5) GCD10000x10000/WithXY-8 16.0ms ± 0% 12.5ms ± 0% -22.01% (p=0.008 n=5+5) GCD10000x100000/WithoutXY-8 2.09ms ± 0% 2.07ms ± 0% -0.58% (p=0.008 n=5+5) GCD10000x100000/WithXY-8 86.8ms ± 0% 83.4ms ± 0% -3.95% (p=0.008 n=5+5) GCD100000x100000/WithoutXY-8 51.2ms ± 0% 51.2ms ± 0% ~ (p=0.548 n=5+5) GCD100000x100000/WithXY-8 1.25s ± 0% 0.89s ± 0% -28.98% (p=0.008 n=5+5) Hilbert-8 2.46ms ± 2% 2.53ms ± 1% +2.89% (p=0.032 n=5+5) Binomial-8 5.15µs ± 4% 4.92µs ± 1% -4.43% (p=0.032 n=5+5) QuoRem-8 7.10µs ± 0% 7.05µs ± 0% -0.59% (p=0.008 n=5+5) Exp-8 161ms ± 0% 161ms ± 0% -0.24% (p=0.008 n=5+5) Exp2-8 161ms ± 0% 161ms ± 0% -0.30% (p=0.016 n=4+5) Bitset-8 40.4ns ± 0% 40.3ns ± 0% ~ (p=0.159 n=5+5) BitsetNeg-8 158ns ± 4% 155ns ± 2% ~ (p=0.183 n=5+5) BitsetOrig-8 374ns ± 0% 383ns ± 1% +2.35% (p=0.008 n=5+5) BitsetNegOrig-8 620ns ± 1% 663ns ± 2% +7.00% (p=0.008 n=5+5) ModSqrt225_Tonelli-8 7.26ms ± 0% 7.27ms ± 0% ~ (p=0.841 n=5+5) ModSqrt224_3Mod4-8 2.24ms ± 0% 2.24ms ± 0% ~ (p=0.690 n=5+5) ModSqrt5430_Tonelli-8 62.3s ± 0% 62.4s ± 0% +0.15% (p=0.008 n=5+5) ModSqrt5430_3Mod4-8 20.8s ± 0% 20.8s ± 0% ~ (p=0.151 n=5+5) Sqrt-8 101µs ± 0% 97µs ± 0% -3.99% (p=0.008 n=5+5) IntSqr/1-8 32.7ns ± 1% 32.5ns ± 1% ~ (p=0.325 n=5+5) IntSqr/2-8 161ns ± 4% 160ns ± 4% ~ (p=0.659 n=5+5) IntSqr/3-8 296ns ± 7% 297ns ± 6% ~ (p=0.841 n=5+5) IntSqr/5-8 752ns ± 7% 755ns ± 6% ~ (p=0.889 n=5+5) IntSqr/8-8 1.91µs ± 3% 1.90µs ± 3% ~ (p=0.746 n=5+5) IntSqr/10-8 2.99µs ± 4% 3.00µs ± 4% ~ (p=0.516 n=5+5) IntSqr/20-8 6.29µs ± 2% 6.19µs ± 2% ~ (p=0.151 n=5+5) IntSqr/30-8 14.0µs ± 1% 13.8µs ± 2% ~ (p=0.056 n=5+5) IntSqr/50-8 38.1µs ± 3% 37.9µs ± 3% ~ (p=0.548 n=5+5) IntSqr/80-8 95.1µs ± 1% 94.7µs ± 1% ~ (p=0.310 n=5+5) IntSqr/100-8 148µs ± 1% 148µs ± 1% ~ (p=0.548 n=5+5) IntSqr/200-8 587µs ± 1% 587µs ± 1% ~ (p=1.000 n=5+5) IntSqr/300-8 1.31ms ± 1% 1.32ms ± 1% ~ (p=0.151 n=5+5) IntSqr/500-8 2.48ms ± 0% 2.49ms ± 0% ~ (p=0.310 n=5+5) IntSqr/800-8 4.68ms ± 0% 4.67ms ± 0% ~ (p=0.548 n=5+5) IntSqr/1000-8 7.57ms ± 0% 7.56ms ± 0% ~ (p=0.421 n=5+5) Mul-8 311ms ± 0% 311ms ± 0% ~ (p=0.151 n=5+5) Exp3Power/0x10-8 584ns ± 2% 573ns ± 1% ~ (p=0.190 n=5+5) Exp3Power/0x40-8 646ns ± 2% 649ns ± 1% ~ (p=0.690 n=5+5) Exp3Power/0x100-8 1.42µs ± 2% 1.45µs ± 1% +2.03% (p=0.032 n=5+5) Exp3Power/0x400-8 8.28µs ± 1% 8.39µs ± 0% +1.33% (p=0.008 n=5+5) Exp3Power/0x1000-8 60.1µs ± 0% 59.8µs ± 0% -0.44% (p=0.008 n=5+5) Exp3Power/0x4000-8 818µs ± 0% 816µs ± 0% -0.23% (p=0.008 n=5+5) Exp3Power/0x10000-8 7.79ms ± 0% 7.78ms ± 0% ~ (p=0.690 n=5+5) Exp3Power/0x40000-8 73.4ms ± 0% 73.3ms ± 0% ~ (p=0.151 n=5+5) Exp3Power/0x100000-8 665ms ± 0% 664ms ± 0% -0.16% (p=0.016 n=4+5) Exp3Power/0x400000-8 5.99s ± 0% 5.97s ± 0% -0.24% (p=0.008 n=5+5) Fibo-8 116ms ± 0% 117ms ± 0% +0.42% (p=0.008 n=5+5) NatSqr/1-8 113ns ± 2% 112ns ± 1% ~ (p=0.190 n=5+5) NatSqr/2-8 249ns ± 2% 250ns ± 2% ~ (p=0.365 n=5+5) NatSqr/3-8 379ns ± 1% 381ns ± 2% ~ (p=0.127 n=5+5) NatSqr/5-8 838ns ± 3% 841ns ± 5% ~ (p=0.754 n=5+5) NatSqr/8-8 1.97µs ± 3% 1.97µs ± 4% ~ (p=1.000 n=5+5) NatSqr/10-8 3.04µs ± 4% 3.04µs ± 4% ~ (p=1.000 n=5+5) NatSqr/20-8 6.49µs ± 3% 6.50µs ± 2% ~ (p=0.841 n=5+5) NatSqr/30-8 14.3µs ± 2% 14.2µs ± 2% ~ (p=0.548 n=5+5) NatSqr/50-8 38.5µs ± 3% 38.3µs ± 3% ~ (p=0.421 n=5+5) NatSqr/80-8 96.3µs ± 1% 96.1µs ± 1% ~ (p=0.421 n=5+5) NatSqr/100-8 149µs ± 1% 148µs ± 1% ~ (p=0.310 n=5+5) NatSqr/200-8 591µs ± 1% 592µs ± 1% ~ (p=0.690 n=5+5) NatSqr/300-8 1.31ms ± 1% 1.32ms ± 0% ~ (p=0.190 n=5+4) NatSqr/500-8 2.49ms ± 0% 2.49ms ± 0% ~ (p=0.095 n=5+5) NatSqr/800-8 4.70ms ± 0% 4.69ms ± 0% ~ (p=0.222 n=5+5) NatSqr/1000-8 7.60ms ± 0% 7.58ms ± 0% ~ (p=0.222 n=5+5) ScanPi-8 326µs ± 0% 327µs ± 1% ~ (p=0.222 n=5+5) StringPiParallel-8 71.4µs ± 5% 67.7µs ± 4% ~ (p=0.095 n=5+5) Scan/10/Base2-8 1.09µs ± 0% 1.10µs ± 1% ~ (p=0.810 n=5+5) Scan/100/Base2-8 7.79µs ± 0% 7.83µs ± 0% +0.53% (p=0.008 n=5+5) Scan/1000/Base2-8 78.9µs ± 0% 79.0µs ± 0% ~ (p=0.151 n=5+5) Scan/10000/Base2-8 1.22ms ± 0% 1.23ms ± 1% ~ (p=0.690 n=5+5) Scan/100000/Base2-8 55.1ms ± 0% 55.1ms ± 0% +0.10% (p=0.008 n=5+5) Scan/10/Base8-8 512ns ± 1% 534ns ± 1% +4.34% (p=0.008 n=5+5) Scan/100/Base8-8 2.90µs ± 1% 2.92µs ± 0% +0.67% (p=0.024 n=5+5) Scan/1000/Base8-8 31.0µs ± 0% 31.1µs ± 0% +0.27% (p=0.008 n=5+5) Scan/10000/Base8-8 741µs ± 0% 744µs ± 1% ~ (p=0.310 n=5+5) Scan/100000/Base8-8 50.5ms ± 0% 50.7ms ± 0% +0.23% (p=0.016 n=5+4) Scan/10/Base10-8 485ns ± 0% 510ns ± 1% +5.15% (p=0.008 n=5+5) Scan/100/Base10-8 2.68µs ± 0% 2.70µs ± 0% +0.84% (p=0.008 n=5+5) Scan/1000/Base10-8 28.7µs ± 0% 28.8µs ± 0% +0.34% (p=0.008 n=5+5) Scan/10000/Base10-8 717µs ± 0% 720µs ± 1% ~ (p=0.238 n=5+5) Scan/100000/Base10-8 50.3ms ± 0% 50.3ms ± 0% +0.02% (p=0.016 n=4+5) Scan/10/Base16-8 439ns ± 0% 461ns ± 1% +5.06% (p=0.008 n=5+5) Scan/100/Base16-8 2.48µs ± 0% 2.49µs ± 0% +0.59% (p=0.024 n=5+5) Scan/1000/Base16-8 27.2µs ± 0% 27.3µs ± 0% ~ (p=0.063 n=5+5) Scan/10000/Base16-8 722µs ± 0% 725µs ± 1% ~ (p=0.421 n=5+5) Scan/100000/Base16-8 52.7ms ± 0% 52.7ms ± 0% ~ (p=0.686 n=4+4) String/10/Base2-8 248ns ± 1% 248ns ± 1% ~ (p=0.802 n=5+5) String/100/Base2-8 1.51µs ± 0% 1.51µs ± 0% -0.54% (p=0.024 n=5+5) String/1000/Base2-8 13.6µs ± 0% 13.6µs ± 0% ~ (p=0.548 n=5+5) String/10000/Base2-8 135µs ± 1% 135µs ± 2% ~ (p=0.421 n=5+5) String/100000/Base2-8 1.32ms ± 1% 1.33ms ± 1% ~ (p=0.310 n=5+5) String/10/Base8-8 169ns ± 0% 170ns ± 0% ~ (p=0.079 n=5+5) String/100/Base8-8 635ns ± 1% 633ns ± 1% ~ (p=0.595 n=5+5) String/1000/Base8-8 5.33µs ± 0% 5.30µs ± 0% ~ (p=0.063 n=5+5) String/10000/Base8-8 50.7µs ± 1% 50.7µs ± 1% ~ (p=1.000 n=5+5) String/100000/Base8-8 499µs ± 1% 500µs ± 1% ~ (p=1.000 n=5+5) String/10/Base10-8 517ns ± 1% 512ns ± 1% -1.01% (p=0.032 n=5+5) String/100/Base10-8 1.97µs ± 0% 2.01µs ± 1% +2.13% (p=0.008 n=5+5) String/1000/Base10-8 12.6µs ± 1% 12.1µs ± 1% -4.16% (p=0.008 n=5+5) String/10000/Base10-8 57.9µs ± 1% 54.8µs ± 1% -5.46% (p=0.008 n=5+5) String/100000/Base10-8 25.6ms ± 0% 25.6ms ± 0% -0.12% (p=0.008 n=5+5) String/10/Base16-8 149ns ± 0% 149ns ± 1% ~ (p=1.000 n=5+5) String/100/Base16-8 514ns ± 0% 514ns ± 1% ~ (p=0.825 n=5+5) String/1000/Base16-8 4.01µs ± 0% 4.01µs ± 0% ~ (p=0.595 n=5+5) String/10000/Base16-8 37.7µs ± 0% 37.8µs ± 1% ~ (p=0.222 n=5+5) String/100000/Base16-8 373µs ± 1% 372µs ± 0% ~ (p=1.000 n=5+5) LeafSize/0-8 6.64ms ± 0% 6.66ms ± 0% +0.32% (p=0.008 n=5+5) LeafSize/1-8 74.0µs ± 1% 71.2µs ± 1% -3.75% (p=0.008 n=5+5) LeafSize/2-8 74.1µs ± 0% 70.7µs ± 1% -4.53% (p=0.008 n=5+5) LeafSize/3-8 379µs ± 0% 374µs ± 0% -1.25% (p=0.008 n=5+5) LeafSize/4-8 72.7µs ± 0% 69.2µs ± 0% -4.79% (p=0.008 n=5+5) LeafSize/5-8 471µs ± 0% 466µs ± 0% -1.05% (p=0.008 n=5+5) LeafSize/6-8 377µs ± 0% 373µs ± 0% -1.16% (p=0.008 n=5+5) LeafSize/7-8 245µs ± 0% 241µs ± 0% -1.65% (p=0.008 n=5+5) LeafSize/8-8 73.1µs ± 0% 69.4µs ± 0% -5.10% (p=0.008 n=5+5) LeafSize/9-8 538µs ± 0% 532µs ± 0% -1.01% (p=0.008 n=5+5) LeafSize/10-8 472µs ± 0% 467µs ± 0% -1.07% (p=0.008 n=5+5) LeafSize/11-8 460µs ± 0% 454µs ± 0% -1.22% (p=0.008 n=5+5) LeafSize/12-8 378µs ± 0% 373µs ± 0% -1.34% (p=0.008 n=5+5) LeafSize/13-8 344µs ± 0% 338µs ± 0% -1.61% (p=0.008 n=5+5) LeafSize/14-8 247µs ± 0% 243µs ± 0% -1.62% (p=0.008 n=5+5) LeafSize/15-8 169µs ± 0% 165µs ± 0% -2.71% (p=0.008 n=5+5) LeafSize/16-8 73.3µs ± 1% 69.5µs ± 0% -5.11% (p=0.008 n=5+5) LeafSize/32-8 82.7µs ± 0% 79.2µs ± 0% -4.24% (p=0.008 n=5+5) LeafSize/64-8 135µs ± 0% 132µs ± 0% -2.20% (p=0.008 n=5+5) ProbablyPrime/n=0-8 44.2ms ± 0% 43.9ms ± 0% -0.69% (p=0.008 n=5+5) ProbablyPrime/n=1-8 64.8ms ± 0% 64.4ms ± 0% -0.60% (p=0.008 n=5+5) ProbablyPrime/n=5-8 147ms ± 0% 147ms ± 0% -0.34% (p=0.008 n=5+5) ProbablyPrime/n=10-8 250ms ± 0% 249ms ± 0% -0.29% (p=0.008 n=5+5) ProbablyPrime/n=20-8 456ms ± 0% 455ms ± 0% -0.29% (p=0.008 n=5+5) ProbablyPrime/Lucas-8 23.6ms ± 0% 23.2ms ± 0% -1.44% (p=0.008 n=5+5) ProbablyPrime/MillerRabinBase2-8 20.6ms ± 0% 20.6ms ± 0% -0.31% (p=0.008 n=5+5) FloatSqrt/64-8 2.27µs ± 1% 2.11µs ± 1% -7.02% (p=0.008 n=5+5) FloatSqrt/128-8 4.93µs ± 1% 4.40µs ± 1% -10.73% (p=0.008 n=5+5) FloatSqrt/256-8 13.6µs ± 0% 6.6µs ± 1% -51.40% (p=0.008 n=5+5) FloatSqrt/1000-8 69.8µs ± 0% 31.2µs ± 0% -55.27% (p=0.008 n=5+5) FloatSqrt/10000-8 1.91ms ± 0% 0.59ms ± 0% -69.17% (p=0.008 n=5+5) FloatSqrt/100000-8 55.4ms ± 0% 17.8ms ± 0% -67.79% (p=0.008 n=5+5) FloatSqrt/1000000-8 4.56s ± 0% 1.52s ± 0% -66.59% (p=0.008 n=5+5) Change-Id: Icce52c69668f564490c69b908338b21a2288e116 Reviewed-on: https://go-review.googlesource.com/79355 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-12 14:42:11 +00:00
Giovanni Bajo	f7ac70a566	test: move rotate tests to top-level testsuite. Remove old tests from asm_test. Change-Id: Ib408ec7faa60068bddecf709b93ce308e0ef665a Reviewed-on: https://go-review.googlesource.com/100075 Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>	2018-03-11 10:08:18 +00:00
Daniel Martí	c15b7b2a54	cmd: re-generate all stringer files The tool has gotten better over time, so re-generating the files brings some advantages like fewer objects, dropping the use of fmt, and dropping unnecessary bounds checks. While at it, add the missing go:generate line for obj.AddrType. Change-Id: I120c9795ee8faddf5961ff0384b9dcaf58d831ff Reviewed-on: https://go-review.googlesource.com/100015 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-10 21:20:50 +00:00
Kunpei Sakai	8f406af8e8	net: avoid unnecessary type conversions CL generated mechanically with github.com/mdempsky/unconvert. Change-Id: I6c555da5972618dca4302ef8be8d93c765f95db3 Reviewed-on: https://go-review.googlesource.com/100035 Run-TryBot: Kunpei Sakai <namusyaka@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-10 20:59:45 +00:00
Mark Rushakoff	5e52471761	all: fix non-standard "DO NOT EDIT" comments for generated files I found files to change with this command: git grep 'DO NOT EDIT' \| grep -v 'Code generated .* DO NOT' There are more files that match that grep, but I do not intend on fixing them. Change-Id: I4b474f1c29ca3135560d414785b0dbe0d1a4e52c GitHub-Last-Rev: `65804b0263` GitHub-Pull-Request: golang/go#24334 Reviewed-on: https://go-review.googlesource.com/99955 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-10 17:50:11 +00:00
Giovanni Bajo	5c432fe0e3	cmd/compile: gofmt rewriteARM64.go Change-Id: I7424257e496f8f40c9601b62335b64d641dcd3b5 Reviewed-on: https://go-review.googlesource.com/99996 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc>	2018-03-10 13:03:55 +00:00
Daniel Martí	0c5cfec844	cmd/internal/test2json: support subtests containing colons The "updates" lines, such as RUN, do not contain a colon. However, test2json looked for one anyway, meaning that it would be thrown off if it encountered a line like: === RUN TestWithColons/[::1] In that case, it must not use the first colon it encounters to separate the action from the test name. Fixes #23920. Change-Id: I82eff23e24b83dae183c0cf9f85fc5f409f51c25 Reviewed-on: https://go-review.googlesource.com/98445 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-10 10:13:25 +00:00
Adam Woodbeck	32409a2dfc	text/scanner: add examples Added examples for use of Mode, Whitespace, and IsIdentRune properties. Fixes #23768 Change-Id: I2528e14fde63a4476f3c25510bf0c5b73f38ba5d Reviewed-on: https://go-review.googlesource.com/93199 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-03-10 02:01:58 +00:00
Henry Clifford	f69ad10377	mime/multipart: test for presence of filename instead of content-type Fixes #24041 Preserving the intended fix in https://go.googlesource.com/go/+/81ec7256072ed5e20b8827c583193258769aebc0 Change-Id: I600d3d7edc74ca072a066739e2ef3235877d808f GitHub-Last-Rev: `14973d7c2b` GitHub-Pull-Request: golang/go#24104 Reviewed-on: https://go-review.googlesource.com/96975 Reviewed-by: Andrew Bonventre <andybons@golang.org> Run-TryBot: Andrew Bonventre <andybons@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-10 00:33:10 +00:00
David Chase	0eacf8cbdf	cmd/compile: add DWARF reg defs & fix 32-bit location list bug Before DWARF location lists can be turned on, 3 bugs need fixing. This CL addresses two -- lack of register definitions for various architectures, and bugs on 32-bit platforms. The third bug comes later. Passes GO_GCFLAGS=-dwarflocationlists ./run.bash -no-rebuild (-no-rebuild because the map dependence causes trouble) Change-Id: I4223b48ade84763e4b048e4aeb81149f082c7bc7 Reviewed-on: https://go-review.googlesource.com/99255 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-09 23:17:18 +00:00
Robert Griesemer	99c30211b1	go/scanner: recognize //line and /line directives incl. columns This change updates go/scanner to recognize the extended line directives that are now also handled by cmd/compile: //line filename:line //line filename:line:column /line filename:line/ /line filename:line:column/ As before, //-style line directives must start in column 1. /-style line directives may be placed anywhere in the code. In both cases, the specified position applies to the character immediately following the comment; for line comments that is the first character on the next line (after the newline of the comment). The go/token API is extended by a new method File.AddLineColumnInfo(offset int, filename string, line, column int) which extends the existing File.AddLineInfo(offset int, filename string, line int) by adding a column parameter. Adjusted token.Position computation is changed to take into account column information if provided via a line directive: A (line-directive) relative position will have a non-zero column iff the line directive specified a column; if the position is on the same line as the line directive, the column is relative to the specified column (otherwise it is relative to the line beginning). See also #24183. Finally, Position.String() has been adjusted to not print a column value if the column is unknown (== 0). Fixes #24143. Change-Id: I5518c825ad94443365c049a95677407b46ba55a1 Reviewed-on: https://go-review.googlesource.com/97795 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-03-09 23:11:59 +00:00
Ben Shi	20046020c4	cmd/compile: fix an issue in MNEG of ARM64 There are two less optimized SSA rules in my previous CL https://go-review.googlesource.com/c/go/+/95075 . This CL fixes that issue and a test case gets about 10% performance improvement. name old time/op new time/op delta MNEG-4 263µs ± 3% 235µs ± 3% -10.53% (p=0.000 n=20+20) (https://github.com/benshi001/ugo1/blob/master/mneg_7_test.go) Change-Id: I30087097e281dd9d9d1c870d32e13b4ef4a96ad3 Reviewed-on: https://go-review.googlesource.com/99495 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-09 22:45:21 +00:00
Austin Clements	0def0f2e99	runtime: fix abort handling on arm64 The implementation of runtime.abort on arm64 currently branches to address 0, which results in a signal from PC 0, rather than from runtime.abort, so the runtime fails to recognize it as an abort. Fix runtime.abort on arm64 to read from address 0 like what other architectures do and recognize this in the signal handler. Should fix the linux/arm64 build. Change-Id: I960ab630daaeadc9190287604d4d8337b1ea3853 Reviewed-on: https://go-review.googlesource.com/99895 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-09 22:17:04 +00:00
Matthew Dempsky	e4de522c95	cmd/compile: fix Node.Etype overloading Add helper methods that validate n.Op and convert to/from the appropriate type. Notably, there was a lot of code in walk.go that thought setting Etype=1 on an OADDR node affected escape analysis. Passes toolstash-check. TBR=marvin Change-Id: Ieae7c67225c1459c9719f9e6a748a25b975cf758 Reviewed-on: https://go-review.googlesource.com/99535 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-09 21:44:35 +00:00
Ilya Tocar	91102bf723	runtime: use bytes.IndexByte in findnull bytes.IndexByte is heavily optimized. Use it in findnull. This is second attempt, similar to CL97523. In this version we never call IndexByte on region of memory, that crosses page boundary. A bit slower than CL97523, but still fast: name old time/op new time/op delta GoString-6 164ns ± 2% 118ns ± 0% -28.00% (p=0.000 n=10+6) findnull is also used in gostringnocopy, which is used in many hot spots in the runtime. Fixes #23830 Change-Id: Id843dd4f65a34309d92bdd8df229e484d26b0cb2 Reviewed-on: https://go-review.googlesource.com/98015 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-09 19:37:39 +00:00
Ian Lance Taylor	4902778607	runtime: set libcall values for Solaris system calls This lets SIGPROF signals get a useful traceback. Without it we just see sysvicallN calling asmcgocall. Updates #24142 Change-Id: I5dfe3add51f0c3a4cb1c98acb7738be6396214bc Reviewed-on: https://go-review.googlesource.com/99617 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-03-09 19:27:20 +00:00
Alberto Donizetti	8e427a3878	test/codegen: add README file for the codegen test harness This change adds a README file inside the test/codegen directory, explaining how to run the codegen tests and the syntax of the regexps comments used to match assembly instructions. Change-Id: Ica4eb3ffa9c6975371538cc8ae0ac3c1a3a03baf Reviewed-on: https://go-review.googlesource.com/99156 Reviewed-by: Keith Randall <khr@golang.org>	2018-03-09 18:38:53 +00:00
Daniel Martí	1c6144d069	encoding/gob: work around TestFuzzOneByte panic The index 248 results in the decoder calling reflect.MakeMapWithSize with a size of 14754407682 - just under 15GB - which ends up in a runtime out of memory panic after some recent runtime changes on machines with 8GB of memory. Until that is fixed in either runtime or gob, skip the troublesome index. Updates #24308. Change-Id: Ia450217271c983e7386ba2f3f88c9ba50aa346f4 Reviewed-on: https://go-review.googlesource.com/99655 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-09 18:29:21 +00:00
Josh Bleecher Snyder	031f71efdf	runtime: add TestSizeof Borrowed from cmd/compile, TestSizeof ensures that the size of important types doesn't change unexpectedly. It also helps reviewers see the impact of intended changes. Change-Id: If57955f0c3e66054de3f40c6bba585b88694c7be Reviewed-on: https://go-review.googlesource.com/99837 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-09 17:03:25 +00:00
Vladimir Kuzmin	b2d1cd2ad0	net: optimize IP.String for IPv4 This is optimization is only for IPv4. It allocates a result buffer and writes the IPv4 octets as dotted decimal into it before converting it to a string just once, reducing allocations. Benchmark shows performance improvement: name old time/op new time/op delta IPString/IPv4-8 284ns ± 4% 144ns ± 6% -49.35% (p=0.000 n=19+17) IPString/IPv6-8 1.34µs ± 5% 1.14µs ± 5% -14.37% (p=0.000 n=19+20) name old alloc/op new alloc/op delta IPString/IPv4-8 24.0B ± 0% 16.0B ± 0% -33.33% (p=0.000 n=20+20) IPString/IPv6-8 232B ± 0% 224B ± 0% -3.45% (p=0.000 n=20+20) name old allocs/op new allocs/op delta IPString/IPv4-8 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.000 n=20+20) IPString/IPv6-8 12.0 ± 0% 11.0 ± 0% -8.33% (p=0.000 n=20+20) Fixes #24306 Change-Id: I4e2d30d364e78183d55a42907d277744494b6df3 Reviewed-on: https://go-review.googlesource.com/99395 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-09 17:03:05 +00:00
Alberto Donizetti	5f541b11aa	test/codegen: port MULs merging tests to codegen And delete them from asm_go. Change-Id: I0057cbd90ca55fa51c596e32406e190f3866f93e Reviewed-on: https://go-review.googlesource.com/99815 Reviewed-by: Keith Randall <khr@golang.org>	2018-03-09 17:01:56 +00:00
Tobias Klauser	91f74069ef	runtime: fix comment for hwcap on linux/arm hwcap is set in archauxv, setup_auxv no longer exists. Change-Id: I0fc9393e0c1c45192e0eff4715e9bdd69fab2653 Reviewed-on: https://go-review.googlesource.com/99779 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-09 16:02:38 +00:00
Tobias Klauser	ad466d8b87	go/internal/gcimporter: simplify defer Directly use rc.Close instead of wrapping it with a closure. Change-Id: I3dc1c21ccbfe031c230b035126d5ea3bc62055c3 Reviewed-on: https://go-review.googlesource.com/99716 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-09 15:00:06 +00:00
Jean de Klerk	709317138f	cmd/go: briefly document test caching in go test -h output Fixes #23971 Change-Id: I073f278cc058aa15a23c0ea06292c02d50a3df21 Reviewed-on: https://go-review.googlesource.com/95582 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Rob Pike <r@golang.org>	2018-03-09 11:39:11 +00:00
Alberto Donizetti	cde34780b7	test/codegen: port math/bits.RotateLeft tests to codegen Only RotateLeft{64,32} were tested, and just for ppc64. This CL adds tests for RotateLeft{64,32,16,8} on arm64 and amd64/386, for the cases where the calls are actually instrinsified. RotateLeft tests (the last ones for math/bits functions) are deleted from asm_test. This CL also adds a space between the "//" and the arch name in the comments, to uniform this file to the style used in all the other files. Change-Id: Ifc2a27261d70bcc294b4ec64490d8367f62d2b89 Reviewed-on: https://go-review.googlesource.com/99596 Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-09 10:53:38 +00:00
Hana Kim	6b5a0b5c16	cmd/trace: set cname for span slices Define a set of color names available in trace viewer https://user-images.githubusercontent.com/4999471/37063995-5d0bad48-2169-11e8-92be-9cb363e21c38.png Change-Id: I312fcbc5430d7512b4c39ddc79a769259bad8c22 Reviewed-on: https://go-review.googlesource.com/99055 Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-09 01:58:41 +00:00
Hana Kim	476f71c958	cmd/trace: remove unrelated arrows in task-oriented traceview Also grey out instants that represent events occurred outside the task's span. Furthermore, if the unrelated instants represent user annotation events but not for the task of the interest, skip rendering completely. This helps users to focus on the task-related events better. UI screen shot: https://gist.github.com/hyangah/1df5d2c8f429fd933c481e9636b89b55#file-golang-org_cl_99035 Change-Id: I2b5aef41584c827f8c1e915d0d8e5c95fe2b4b65 Reviewed-on: https://go-review.googlesource.com/99035 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-09 01:34:12 +00:00
Robert Griesemer	53c416396f	go/printer: simplify handling of line directives Strangely enough, the existing implementation used adjusted (by line directives) source positions to determine layout and thus required position corrections when printing a line directive. Instead, just use the unadjusted, absolute source positions and then printing a line directive doesn't require any adjustments, only some care to make sure it remains in column 1 as before. The new code doesn't need to parse line directives anymore and simply ensures that comments with the //line prefix and starting in column 1 remain in that position. That is a slight change from the old behavior (which ignored incorrect line directives, e.g. because they had an invalid line number) but unlikely to show up in real code. This is prep work for handling of line directives that also specify columns (which now won't require much special handling anymore). For #24143. Change-Id: I07eb2e1b35b37337e632e3dbf5b70c783c615f8a Reviewed-on: https://go-review.googlesource.com/99621 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-03-09 01:01:25 +00:00
Joe Tsai	2cc15b18db	encoding/csv: disallow quote for use as Comma '"' has special semantic meaning that conflicts with using it as Comma. Change-Id: Ife25ba43ca25dba2ea184c1bb7579a230d376059 Reviewed-on: https://go-review.googlesource.com/99696 Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-09 00:33:43 +00:00
Austin Clements	5d22cebb12	runtime: explain and enforce that _panic values live on the stack It's a bit mysterious that _defer.sp is a uintptr that gets stack-adjusted explicitly while _panic.argp is an unsafe.Pointer that doesn't, but turns out to be critically important when a deferred function grows the stack before doing a recover. Add a comment explaining that this works because _panic values live on the stack. Enforce this by marking _panic go:notinheap. Change-Id: I9ca49e84ee1f86d881552c55dccd0662b530836b Reviewed-on: https://go-review.googlesource.com/99735 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-03-08 23:35:46 +00:00
Austin Clements	60a9e5d613	runtime: ensure abort actually crashes the process On all non-x86 arches, runtime.abort simply reads from nil. Unfortunately, if this happens on a user stack, the signal handler will dutifully turn this into a panicmem, which lets user defers run and which user code can even recover from. To fix this, add an explicit check to the signal handler that turns faults in abort into hard crashes directly in the signal handler. This has the added benefit of giving a register dump at the abort point. Change-Id: If26a7f13790745ee3867db7f53b72d8281176d70 Reviewed-on: https://go-review.googlesource.com/93661 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:55:55 +00:00
Austin Clements	c950a90d72	runtime: call abort instead of raw INT $3 or bad MOV Everything except for amd64, amd64p32, and 386 currently defines and uses an abort function. This CL makes these match. The next CL will recognize the abort function to make this more useful. Change-Id: I7c155871ea48919a9220417df0630005b444f488 Reviewed-on: https://go-review.googlesource.com/93660 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:55:54 +00:00
Austin Clements	7f1b2738bb	runtime: make throw safer to call Currently, throw may grow the stack, which means whenever we call it from a context where it's not safe to grow the stack, we first have to switch to the system stack. This is pretty easy to get wrong. Fix this by making throw switch to the system stack so it doesn't grow the stack and is hence safe to call without a system stack switch at the call site. The only thing this complicates is badsystemstack itself, which would now go into an infinite loop before printing anything (previously it would also go into an infinite loop, but would at least print the error first). Fix this by making badsystemstack do a direct write and then crash hard. Change-Id: Ic5b4a610df265e47962dcfa341cabac03c31c049 Reviewed-on: https://go-review.googlesource.com/93659 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:55:52 +00:00
Austin Clements	9d59234cbe	runtime: move unrecoverable panic handling to the system stack Currently parts of unrecoverable panic handling (notably, printing panic messages) can happen on the user stack. This may grow the stack, which is generally fine, but if we're handling a runtime panic, it's better to do as little as possible in case the runtime is in an inconsistent state. Hence, this commit rearranges the handling of unrecoverable panics so that it's done entirely on the system stack. This is mostly a matter of shuffling code a bit so everything can move into a systemstack block. The one slight subtlety is in the "panic during panic" case, where we now depend on startpanic_m's caller to print the stack rather than startpanic_m itself. To make this work, startpanic_m now returns a boolean indicating that the caller should avoid trying to print any panic messages and get right to the stack trace. Since the caller is already in a position to do this, this actually simplifies things a little. Change-Id: Id72febe8c0a9fb31d9369b600a1816d65a49bfed Reviewed-on: https://go-review.googlesource.com/93658 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:55:51 +00:00
Austin Clements	da022da900	cmd/compile: simplify OpSlicemask optimization The previous CL introduced isConstDelta. Use it to simplify the OpSlicemask optimization in the prove pass. This passes toolstash -cmp. Change-Id: If2aa762db4cdc0cd1c581a536340530a9831081b Reviewed-on: https://go-review.googlesource.com/87481 Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:25:29 +00:00
Austin Clements	6436270dad	cmd/compile: add fence-post implications to prove This adds four new deductions to the prove pass, all related to adding or subtracting one from a value. This is the first hint of actual arithmetic relations in the prove pass. The most effective of these is x-1 >= w && x > min ⇒ x > w This helps eliminate bounds checks in code like if x > 0 { // do something with s[x-1] } Altogether, these deductions prove an additional 260 branches in std and cmd. Furthermore, they will let us eliminate some tricky compiler-inserted panics in the runtime that are interfering with static analysis. Fixes #23354. Change-Id: I7088223e0e0cd6ff062a75c127eb4bb60e6dce02 Reviewed-on: https://go-review.googlesource.com/87480 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>	2018-03-08 22:25:28 +00:00
Austin Clements	941fc129e2	cmd/compile: derive unsigned limits from signed limits in prove This adds a few simple deductions to the prove pass' fact table to derive unsigned concrete limits from signed concrete limits where possible. This tweak lets the pass prove 70 additional branch conditions in std and cmd. This is based on a comment from the recently-deleted factsTable.get: "// TODO: also use signed data if lim.min >= 0". Change-Id: Ib4340249e7733070f004a0aa31254adf5df8a392 Reviewed-on: https://go-review.googlesource.com/87479 Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:25:27 +00:00
Austin Clements	669db2cef5	cmd/compile: make prove pass use unsatisfiability Currently the prove pass uses implication queries. For each block, it collects the set of branch conditions leading to that block, and queries this fact table for whether any of these facts imply the block's own branch condition (or its inverse). This works remarkably well considering it doesn't do any deduction on these facts, but it has various downsides: 1. It requires an implementation both of adding facts to the table and determining implications. These are very nearly duals of each other, but require separate implementations. Likewise, the process of asserting facts of dominating branch conditions is very nearly the dual of the process of querying implied branch conditions. 2. It leads to less effective use of derived facts. For example, the prove pass currently derives facts about the relations between len and cap, but can't make use of these unless a branch condition is in the exact form of a derived fact. If one of these derived facts contradicts another fact, it won't notice or make use of this. This CL changes the approach of the prove pass to instead use contradiction instead of implication. Rather than ever querying a branch condition, it simply adds branch conditions to the fact table. If this leads to a contradiction (specifically, it makes the fact set unsatisfiable), that branch is impossible and can be cut. As a result, 1. We can eliminate the code for determining implications (factsTable.get disappears entirely). Also, there is now a single implementation of visiting and asserting branch conditions, since we don't have to flip them around to treat them as facts in one place and queries in another. 2. Derived facts can be used effectively. It doesn't matter why the fact table is unsatisfiable; a contradiction in any of the facts is enough. 3. As an added benefit, it's now quite easy to avoid traversing beyond provably-unreachable blocks. In contrast, the current implementation always visits all blocks. The prove pass already has nearly all of the mechanism necessary to compute unsatisfiability, which means this both simplifies the code and makes it more powerful. The only complication is that the current implication procedure has a hack for dealing with the 0 <= Args[0] condition of OpIsInBounds and OpIsSliceInBounds. We replace this with asserting the appropriate fact when we process one of these conditions. This seems much cleaner anyway, and works because we can now take advantage of derived facts. This has no measurable effect on compiler performance. Effectiveness: There is exactly one condition in all of std and cmd that this fails to prove that the old implementation could: (int64(^uint(0)>>1) < x) in encoding/gob. This can never be true because x is an int, and it's basically coincidence that the old code gets this. (For example, it fails to prove the similar (x < ^int64(^uint(0)>>1)) condition that immediately precedes it, and even though the conditions are logically unrelated, it wouldn't get the second one if it hadn't first processed the first!) It does, however, prove a few dozen additional branches. These come from facts that are added to the fact table about the relations between len and cap. These were almost never queried directly before, but could lead to contradictions, which the unsat-based approach is able to use. There are exactly two branches in std and cmd that this implementation proves in the other direction. This sounds scary, but is okay because both occur in already-unreachable blocks, so it doesn't matter what we chose. Because the fact table logic is sound but incomplete, it fails to prove that the block isn't reachable, even though it is able to prove that both outgoing branches are impossible. We could turn these blocks into BlockExit blocks, but it doesn't seem worth the trouble of the extra proof effort for something that happens twice in all of std and cmd. Tests: This CL updates test/prove.go to change the expected messages because it can no longer give a "reason" why it proved or disproved a condition. It also adds a new test of a branch it couldn't prove before. It mostly guts test/sliceopt.go, removing everything related to slice bounds optimizations and moving a few relevant tests to test/prove.go. Much of this test is actually unreachable. The new prove pass figures this out and doesn't try to prove anything about the unreachable parts. The output on the unreachable parts is already suspect because anything can be proved at that point, so it's really just a regression test for an algorithm the compiler no longer uses. This is a step toward fixing #23354. That issue is quite easy to fix once we can use derived facts effectively. Change-Id: Ia48a1b9ee081310579fe474e4a61857424ff8ce8 Reviewed-on: https://go-review.googlesource.com/87478 Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:25:25 +00:00
Austin Clements	2e9cf5f66e	cmd/compile: simplify limit logic in prove This replaces the open-coded intersection of limits in the prove pass with a general limit intersection operation. This should get identical results except in one case where it's more precise: when handling an equality relation, if the value is outside the existing range, this will reduce the range to empty rather than resetting it. This will be important to a follow-up CL where we can take advantage of empty ranges. For #23354. Change-Id: I3d3d75924f61b1da1cb604b3a9d189b26fb3a14e Reviewed-on: https://go-review.googlesource.com/87477 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>	2018-03-08 22:25:24 +00:00
Austin Clements	44e20b64ef	cmd/compile: more String methods for prove types These aid in debugging. Change-Id: Ieb38c996765f780f6103f8c3292639d408c25123 Reviewed-on: https://go-review.googlesource.com/87476 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:25:23 +00:00
Austin Clements	491f409a32	cmd/compile: minor comment improvements/corrections Change-Id: Ie0934f1528d58d4971cdef726d3e2d23cf3935d3 Reviewed-on: https://go-review.googlesource.com/87475 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>	2018-03-08 22:25:21 +00:00
Matthew Dempsky	b55eedd173	Revert "cmd/compile: cleanup nodpc and nodfp" This reverts commit `dcac984b97`. Reason for revert: broke LR architectures (arm64, ppc64, s390x) Change-Id: I531d311c9053e81503c8c78d6cf044b318fc828b Reviewed-on: https://go-review.googlesource.com/99695 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-03-08 21:23:01 +00:00
Alberto Donizetti	010579c237	math/big: allocate less in Float.Sqrt The Newton sqrtInverse procedure we use to compute Float.Sqrt should not allocate a number of times proportional to the number of Newton iterations we need to reach the desired precision. At the beginning the function the target precision is known, so even if we do want to perform the early steps at low precisions (to save time), it's still possible to pre-allocate larger backing arrays, both for the temp variables in the loop and the variable that'll hold the final result. There's one complication. At the following line: u.Sub(three, u) the Sub method will allocate, because the receiver aliases one of the arguments, and the large backing array we initially allocated for u will be replaced by a smaller one allocated by Sub. We can work around this by introducing a second temp variable u2 that we use to hold the Sub call result. Overall, the sqrtInverse procedure still allocates a number of times proportional to the number of Newton steps, because unfortunately a few of the Mul calls in the Newton function allocate; but at least we allocate less in the function itself. FloatSqrt/256-4 1.97µs ± 1% 1.84µs ± 1% -6.61% (p=0.000 n=8+8) FloatSqrt/1000-4 4.80µs ± 3% 4.28µs ± 1% -10.78% (p=0.000 n=8+8) FloatSqrt/10000-4 40.0µs ± 1% 38.3µs ± 1% -4.15% (p=0.000 n=8+8) FloatSqrt/100000-4 955µs ± 1% 932µs ± 0% -2.49% (p=0.000 n=8+7) FloatSqrt/1000000-4 79.8ms ± 1% 79.4ms ± 1% ~ (p=0.105 n=8+8) name old alloc/op new alloc/op delta FloatSqrt/256-4 816B ± 0% 512B ± 0% -37.25% (p=0.000 n=8+8) FloatSqrt/1000-4 2.50kB ± 0% 1.47kB ± 0% -41.03% (p=0.000 n=8+8) FloatSqrt/10000-4 23.5kB ± 0% 18.2kB ± 0% -22.62% (p=0.000 n=8+8) FloatSqrt/100000-4 251kB ± 0% 173kB ± 0% -31.26% (p=0.000 n=8+8) FloatSqrt/1000000-4 4.61MB ± 0% 2.86MB ± 0% -37.90% (p=0.000 n=8+8) name old allocs/op new allocs/op delta FloatSqrt/256-4 12.0 ± 0% 8.0 ± 0% -33.33% (p=0.000 n=8+8) FloatSqrt/1000-4 19.0 ± 0% 9.0 ± 0% -52.63% (p=0.000 n=8+8) FloatSqrt/10000-4 35.0 ± 0% 14.0 ± 0% -60.00% (p=0.000 n=8+8) FloatSqrt/100000-4 55.0 ± 0% 23.0 ± 0% -58.18% (p=0.000 n=8+8) FloatSqrt/1000000-4 122 ± 0% 75 ± 0% -38.52% (p=0.000 n=8+8) Change-Id: I950dbf61a40267a6cca82ae72524c3024bcb149c Reviewed-on: https://go-review.googlesource.com/87659 Reviewed-by: Robert Griesemer <gri@golang.org>	2018-03-08 19:12:35 +00:00
isharipo	d2a5263a9c	math/big: speedup nat.setBytes for bigger slices Set up to _S (number of bytes in Uint) bytes at time by using BigEndian.Uint32 and BigEndian.Uint64. The performance improves for slices bigger than _S bytes. This is the case for 128/256bit arith that initializes it's objects from bytes. name old time/op new time/op delta NatSetBytes/8-4 29.8ns ± 1% 11.4ns ± 0% -61.63% (p=0.000 n=9+8) NatSetBytes/24-4 109ns ± 1% 56ns ± 0% -48.75% (p=0.000 n=9+8) NatSetBytes/128-4 420ns ± 2% 110ns ± 1% -73.83% (p=0.000 n=10+10) NatSetBytes/7-4 26.2ns ± 1% 21.3ns ± 2% -18.63% (p=0.000 n=8+9) NatSetBytes/23-4 106ns ± 1% 67ns ± 1% -36.93% (p=0.000 n=9+10) NatSetBytes/127-4 410ns ± 2% 121ns ± 0% -70.46% (p=0.000 n=9+8) Found this optimization opportunity by looking at ethereum_corevm community benchmark cpuprofile. name old time/op new time/op delta OpDiv256-4 715ns ± 1% 596ns ± 1% -16.57% (p=0.008 n=5+5) OpDiv128-4 373ns ± 1% 314ns ± 1% -15.83% (p=0.008 n=5+5) OpDiv64-4 301ns ± 0% 285ns ± 1% -5.12% (p=0.008 n=5+5) Change-Id: I8e5a680ae6284c8233d8d7431d51253a8a740b57 Reviewed-on: https://go-review.googlesource.com/98775 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> Reviewed-by: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-08 18:50:10 +00:00
Matthew Dempsky	dcac984b97	cmd/compile: cleanup nodpc and nodfp Instead of creating a new &nodfp expression for every recover() call, or a new nodpc variable for every function instrumented by the race detector, this CL introduces two new uintptr-typed pseudo-variables callerSP and callerPC. These pseudo-variables act just like calls to the runtime's getcallersp() and getcallerpc() functions. For consistency, change runtime.gorecover's builtin stub's parameter type from "*int32" to "uintptr". Passes toolstash-check, but toolstash-check -race fails because of register allocator changes. Change-Id: I985d644653de2dac8b7b03a28829ad04dfd4f358 Reviewed-on: https://go-review.googlesource.com/99416 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-08 18:22:29 +00:00
Matthew Dempsky	6a5cfa8b63	cmd/compile: remove two out-of-phase calls to walk All calls to walkstmt/walkexpr/etc should be rooted from funccompile, whereas transformclosure and fninit are called by main. Passes toolstash-check. Change-Id: Ic880e2d2d83af09618ce4daa8e7716f6b389e53e Reviewed-on: https://go-review.googlesource.com/99418 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-08 18:22:13 +00:00

... 30 31 32 33 34 ...

37280 Commits