From 0adf6dce8a8895fa7dd6f366ba336d9471b32631 Mon Sep 17 00:00:00 2001
From: Shenghou Ma
Date: Sat, 17 Oct 2015 18:21:44 -0400
Subject: [PATCH] runtime: disable prefetching on 386
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

It doesn't seem to help on modern processors and it makes Go impossible to run
on Pentium MMX (which is the documented minimum hardware requirement.)

Old is with prefetch, new is w/o. Both are compiled with GO386=sse2.
Benchmarking is done on Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz.

name                     old time/op    new time/op    delta
BinaryTree17-4           2.89s ± 2%     2.87s ± 0%     ~       (p=0.061 n=11+10)
Fannkuch11-4             3.65s ± 0%     3.65s ± 0%     ~       (p=0.365 n=11+11)
FmtFprintfEmpty-4        52.1ns ± 0%    52.1ns ± 0%    ~       (p=0.065 n=10+9)
FmtFprintfString-4       168ns ± 0%     167ns ± 0%     -0.48%  (p=0.000 n=8+10)
FmtFprintfInt-4          167ns ± 0%     167ns ± 1%     ~       (p=0.591 n=9+10)
FmtFprintfIntInt-4       295ns ± 0%     292ns ± 0%     -0.99%  (p=0.000 n=9+10)
FmtFprintfPrefixedInt-4  327ns ± 0%     326ns ± 0%     -0.24%  (p=0.007 n=10+10)
FmtFprintfFloat-4        431ns ± 0%     431ns ± 0%     -0.07%  (p=0.000 n=10+11)
FmtManyArgs-4            1.13µs ± 0%    1.13µs ± 0%    -0.37%  (p=0.009 n=11+11)
GobDecode-4              9.36ms ± 1%    9.33ms ± 0%    -0.31%  (p=0.006 n=11+10)
GobEncode-4              7.38ms ± 1%    7.38ms ± 1%    ~       (p=0.797 n=11+11)
Gzip-4                   394ms ± 0%     395ms ± 1%     ~       (p=0.519 n=11+11)
Gunzip-4                 65.4ms ± 0%    65.4ms ± 0%    ~       (p=0.739 n=10+10)
HTTPClientServer-4       52.4µs ± 1%    52.5µs ± 1%    ~       (p=0.748 n=11+11)
JSONEncode-4             19.0ms ± 0%    19.0ms ± 0%    ~       (p=0.780 n=9+10)
JSONDecode-4             59.6ms ± 0%    59.6ms ± 0%    ~       (p=0.720 n=9+10)
Mandelbrot200-4          4.09ms ± 0%    4.09ms ± 0%    ~       (p=0.295 n=11+9)
GoParse-4                3.45ms ± 1%    3.43ms ± 1%    -0.35%  (p=0.040 n=11+11)
RegexpMatchEasy0_32-4    101ns ± 1%     101ns ± 1%     ~       (p=1.000 n=11+11)
RegexpMatchEasy0_1K-4    796ns ± 0%     796ns ± 0%     ~       (p=0.954 n=10+8)
RegexpMatchEasy1_32-4    110ns ± 0%     110ns ± 1%     ~       (p=0.289 n=9+11)
RegexpMatchEasy1_1K-4    991ns ± 0%     991ns ± 0%     ~       (p=0.784 n=10+8)
RegexpMatchMedium_32-4   131ns ± 0%     130ns ± 0%     -0.42%  (p=0.004 n=11+9)
RegexpMatchMedium_1K-4   41.9µs ± 1%    41.6µs ± 0%    ~       (p=0.067 n=11+9)
RegexpMatchHard_32-4     2.34µs ± 0%    2.34µs ± 0%    ~       (p=0.208 n=11+11)
RegexpMatchHard_1K-4     70.9µs ± 0%    71.0µs ± 0%    ~       (p=0.968 n=9+10)
Revcomp-4                819ms ± 0%     818ms ± 0%     ~       (p=0.251 n=10+11)
Template-4               73.9ms ± 0%    73.8ms ± 0%    -0.25%  (p=0.013 n=10+11)
TimeParse-4              414ns ± 0%     414ns ± 0%     ~       (p=0.809 n=11+10)
TimeFormat-4             485ns ± 0%     485ns ± 0%     ~       (p=0.404 n=11+7)

name                     old speed      new speed      delta
GobDecode-4              82.0MB/s ± 1%  82.3MB/s ± 0%  +0.31%  (p=0.007 n=11+10)
GobEncode-4              104MB/s ± 1%   104MB/s ± 1%   ~       (p=0.797 n=11+11)
Gzip-4                   49.2MB/s ± 0%  49.1MB/s ± 1%  ~       (p=0.507 n=11+11)
Gunzip-4                 297MB/s ± 0%   297MB/s ± 0%   ~       (p=0.670 n=10+10)
JSONEncode-4             102MB/s ± 0%   102MB/s ± 0%   ~       (p=0.794 n=9+10)
JSONDecode-4             32.6MB/s ± 0%  32.6MB/s ± 0%  ~       (p=0.334 n=9+9)
GoParse-4                16.8MB/s ± 1%  16.9MB/s ± 1%  ~       (p=0.052 n=11+11)
RegexpMatchEasy0_32-4    314MB/s ± 0%   314MB/s ± 1%   ~       (p=0.618 n=11+11)
RegexpMatchEasy0_1K-4    1.29GB/s ± 0%  1.29GB/s ± 0%  ~       (p=0.315 n=10+10)
RegexpMatchEasy1_32-4    290MB/s ± 1%   290MB/s ± 1%   ~       (p=0.667 n=10+11)
RegexpMatchEasy1_1K-4    1.03GB/s ± 0%  1.03GB/s ± 0%  ~       (p=0.829 n=10+8)
RegexpMatchMedium_32-4   7.63MB/s ± 0%  7.65MB/s ± 0%  ~       (p=0.142 n=11+11)
RegexpMatchMedium_1K-4   24.4MB/s ± 1%  24.6MB/s ± 0%  ~       (p=0.063 n=11+9)
RegexpMatchHard_32-4     13.7MB/s ± 0%  13.7MB/s ± 0%  ~       (p=0.302 n=11+11)
RegexpMatchHard_1K-4     14.4MB/s ± 0%  14.4MB/s ± 0%  ~       (p=0.784 n=9+10)
Revcomp-4                310MB/s ± 0%   311MB/s ± 0%   ~       (p=0.243 n=10+11)
Template-4               26.2MB/s ± 0%  26.3MB/s ± 0%  +0.24%  (p=0.009 n=10+11)

Update #12970.

Change-Id: Id185080687a60c229a5cb2e5220e7ca1b53910e2
Reviewed-on: https://go-review.googlesource.com/15999
Reviewed-by: Austin Clements
Reviewed-by: Dmitry Vyukov
---
 src/runtime/asm_386.s | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/src/runtime/asm_386.s b/src/runtime/asm_386.s
index effa661acd..8db8aa9eef 100644
--- a/src/runtime/asm_386.s
+++ b/src/runtime/asm_386.s
@@ -1530,23 +1530,15 @@ TEXT runtime·goexit(SB),NOSPLIT,$0-0
 	// traceback from goexit1 must hit code range of goexit
 	BYTE	$0x90	// NOP
 
+// Prefetching doesn't seem to help.
 TEXT runtime·prefetcht0(SB),NOSPLIT,$0-4
-	MOVL	addr+0(FP), AX
-	PREFETCHT0	(AX)
 	RET
 
 TEXT runtime·prefetcht1(SB),NOSPLIT,$0-4
-	MOVL	addr+0(FP), AX
-	PREFETCHT1	(AX)
 	RET
 
-
 TEXT runtime·prefetcht2(SB),NOSPLIT,$0-4
-	MOVL	addr+0(FP), AX
-	PREFETCHT2	(AX)
 	RET
 
 TEXT runtime·prefetchnta(SB),NOSPLIT,$0-4
-	MOVL	addr+0(FP), AX
-	PREFETCHNTA	(AX)
 	RET