1
0
mirror of https://github.com/golang/go synced 2024-11-08 05:56:12 -07:00
Commit Graph

1159 Commits

Author SHA1 Message Date
Ben Shi
e351a16005 cmd/internal/obj/arm64: reject incorrect form of LDP/STP
"LDP (R0), (F0, F1)" and "STP (F1, F2), (R0)" are
silently accepted by the arm64 assembler without
any error message. And this CL fixes that bug.

fixes #26556.

Change-Id: Ib6fae81956deb39a4ffd95e9409acc8dad3ab2d2
Reviewed-on: https://go-review.googlesource.com/125637
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-07-30 15:31:06 +00:00
Ian Lance Taylor
274fde9a36 cmd/internal/buildid: close ELF file after reading note
Updates #26400

Change-Id: I1747d1f1018521cdfa4b3ed13412a944829967cf
Reviewed-on: https://go-review.googlesource.com/124235
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-07-16 23:37:14 +00:00
Than McIntosh
ec88f781c2 cmd/compile: call objabi.PathToPrefix when emitting abstract fn
When generating an abstract function DIE, call objabi.PathToPrefix on
the import path so as to be consistent with how the linker handles
import paths. This is intended to resolve another problem with DWARF
inline info generation in which there are multiple inconsistent
versions of an abstract function DIE for a function whose package path
is rewritten/canonicalized by objabi.PathToPrefix.

Fixes #26237

Change-Id: I4b64c090ae43a1ad87f47587a1a71f19bc5fc8e8
Reviewed-on: https://go-review.googlesource.com/123036
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-07-10 18:54:39 +00:00
Ian Lance Taylor
7254cfc37b cmd/go: revert "output coverage report even if there are no test files"
Original CL description:

    When using test -cover or -coverprofile the output for "no test files"
    is the same format as for "no tests to run".

Reverting because this CL changed cmd/go to build test binaries for
packages that have no tests, leading to extra work and confusion.

Updates #24570
Fixes #25789
Fixes #26157
Fixes #26242

Change-Id: Ibab1307d39dfaec0de9359d6d96706e3910c8efd
Reviewed-on: https://go-review.googlesource.com/122518
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2018-07-09 14:42:21 +00:00
Michael Munday
9fa988547a cmd/internal/obj/s390x: increase maximum number of loop iterations
The maximum number of 'spanz' iterations that the s390x assembler
performs to reach a fixed point for relative offsets was 10. This
turned out to be too aggressive for one example of auto-generated
fuzzing code. Increase the number of iterations by 10x to reduce
the likelihood that the limit will be hit again. This limit only
exists to help find bugs in the assembler.

master at tip does not fail with the example code in the issue, I
have therefore not submitted it as a test (it is also quite large).
I tested this change with the example code at the commit given and
it fixes the issue.

Fixes #25269.

Change-Id: I0e44948957a7faff51c7d27c0b7746ed6e2d47bb
Reviewed-on: https://go-review.googlesource.com/122235
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-07-05 07:21:50 +00:00
Tobias Klauser
2ee6bfbd7e cmd/internal/obj: follow convention for generated code comment
Follow the convertion (https://golang.org/s/generatedcode) for generated
code in stringer.go.

Change-Id: I7b5fbb04ba03e8ac77a9a0a402088669469de858
Reviewed-on: https://go-review.googlesource.com/122015
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-07-03 16:14:43 +00:00
Wei Xiao
0a7ac93c27 cmd/compile: improve atomic add intrinsics with ARMv8.1 new instruction
ARMv8.1 has added new instruction (LDADDAL) for atomic memory operations. This
CL improves existing atomic add intrinsics with the new instruction. Since the
new instruction is only guaranteed to be present after ARMv8.1, we guard its
usage with a conditional on CPU feature.

Performance result on ARMv8.1 machine:
name        old time/op  new time/op  delta
Xadd-224    1.05µs ± 6%  0.02µs ± 4%  -98.06%  (p=0.000 n=10+8)
Xadd64-224  1.05µs ± 3%  0.02µs ±13%  -98.10%  (p=0.000 n=9+10)
[Geo mean]  1.05µs       0.02µs       -98.08%

Performance result on ARMv8.0 machine:
name        old time/op  new time/op  delta
Xadd-46      538ns ± 1%   541ns ± 1%  +0.62%  (p=0.000 n=9+9)
Xadd64-46    505ns ± 1%   508ns ± 0%  +0.48%  (p=0.003 n=9+8)
[Geo mean]   521ns        524ns       +0.55%

Change-Id: If4b5d8d0e2d6f84fe1492a4f5de0789910ad0ee9
Reviewed-on: https://go-review.googlesource.com/81877
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-06-21 14:52:43 +00:00
Richard Musiol
e083dc6307 runtime, sycall/js: add support for callbacks from JavaScript
This commit adds support for JavaScript callbacks back into
WebAssembly. This is experimental API, just like the rest of the
syscall/js package. The time package now also uses this mechanism
to properly support timers without resorting to a busy loop.

JavaScript code can call into the same entry point multiple times.
The new RUN register is used to keep track of the program's
run state. Possible values are: starting, running, paused and exited.
If no goroutine is ready any more, the scheduler can put the
program into the "paused" state and the WebAssembly code will
stop running. When a callback occurs, the JavaScript code puts
the callback data into a queue and then calls into WebAssembly
to allow the Go code to continue running.

Updates #18892
Updates #25506

Change-Id: Ib8701cfa0536d10d69bd541c85b0e2a754eb54fb
Reviewed-on: https://go-review.googlesource.com/114197
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-06-14 21:50:53 +00:00
Lynn Boger
9e9ff565cd runtime/race: implement race detector for ppc64le
This adds the support to enable the race detector for ppc64le.

Added runtime/race_ppc64le.s to manage the calls from Go to the
LLVM tsan functions, mostly converting from the Go ABI to the
PPC64 ABI expected by Clang generated code.

Changed racewalk.go to call racefuncenterfp instead of racefuncenter
on ppc64le to allow the caller pc to be obtained in the asm code
before calling the tsan version.

Changed the set up code for racecallbackthunk so it doesn't use
the autogenerated save and restore of the link register since that
sequence uses registers inconsistent with the normal ppc64 ABI.

Made various changes to recognize that race is supported for
ppc64le.

Ensured that tls_g is updated and accessible from race_linux_ppc64le.s
so that the race ctx can be obtained and passed to tsan functions.

This enables the race tests for ppc64le in cmd/dist/test.go and
increases the timeout when running the benchmarks with the -race
option to avoid timing out.

Updates #24354, #23731

Change-Id: Ib97dc7ac313e6313c836dc7d2fb698f9d8fba3ef
Reviewed-on: https://go-review.googlesource.com/107935
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-06-11 17:45:36 +00:00
Dennis Kuhnert
ebb8a1f8e6 cmd/go: output coverage report even if there are no test files
When using test -cover or -coverprofile the output for "no test files"
is the same format as for "no tests to run".

Fixes #24570

Change-Id: If05609411676d42d94c1feac4bc839974fae2cc1
Reviewed-on: https://go-review.googlesource.com/115095
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-06-06 01:10:20 +00:00
Tim Cooper
161874da2a all: update comment URLs from HTTP to HTTPS, where possible
Each URL was manually verified to ensure it did not serve up incorrect
content.

Change-Id: I4dc846227af95a73ee9a3074d0c379ff0fa955df
Reviewed-on: https://go-review.googlesource.com/115798
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
2018-06-01 21:52:00 +00:00
Ben Shi
3d6e4ec0a8 cmd/internal/obj/arm64: fix two issues in the assembler
There are two issues in the arm64 assembler.

1. "CMPW $0x22220000, RSP" is encoded to 5b44a4d2ff031b6b, which
   is the combination of "MOVD $0x22220000, Rtmp" and
   "NEGSW Rtmp, ZR".
   The right encoding should be a combination of
   "MOVD $0x22220000, Rtmp" and "CMPW Rtmp, RSP".

2. "AND $0x22220000, R2, RSP" is encoded to 5b44a4d25f601b00,
   which is the combination of "MOVD $0x22220000, Rtmp" and
   an illegal instruction.
   The right behavior should be an error report of
   "illegal combination", since "AND Rtmp, RSP, RSP" is invalid
   in armv8.

This CL fixes the above 2 issues and adds more test cases.

fixes #25557

Change-Id: Ia510be26b58a229f5dfe8a5fa0b35569b2d566e7
Reviewed-on: https://go-review.googlesource.com/114796
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-06-01 15:24:42 +00:00
Tim Cooper
555eb70db2 all: regenerate stringer files
Change-Id: I34838320047792c4719837591e848b87ccb7f5ab
Reviewed-on: https://go-review.googlesource.com/115058
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-29 20:35:41 +00:00
isharipo
67fe8b5677 cmd/internal/obj/x86: add missing Yi8 for VEX ytabs
This change adds Yi8 forms for every ytab that had them before AVX-512 patch.
The rationale is backwards-compatibility.

EVEX forms remain strict and unchanged as they're not bound to any
backwards-compatibility issues.

Fixes #25510

Change-Id: Icd692266010ed64c9fe47cc837afc2edf2ad2d1d
Reviewed-on: https://go-review.googlesource.com/114136
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-05-23 15:34:19 +00:00
Martin Möhrmann
f045ddc624 internal/cpu: add experiment to disable CPU features with GODEBUGCPU
Needs the go compiler to be build with GOEXPERIMENT=debugcpu to be active.

The GODEBUGCPU environment variable can be used to disable usage of
specific processor features in the Go standard library.
This is useful for testing and benchmarking different code paths that
are guarded by internal/cpu variable checks.

Use of processor features can not be enabled through GODEBUGCPU.

To disable usage of AVX and SSE41 cpu features on GOARCH amd64 use:
GODEBUGCPU=avx=0,sse41=0

The special "all" option can be used to disable all options:
GODEBUGCPU=all=0

Updates #12805
Updates #15403

Change-Id: I699c2e6f74d98472b6fb4b1e5ffbf29b15697aab
Reviewed-on: https://go-review.googlesource.com/91737
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-22 18:49:31 +00:00
Austin Clements
c5ed10f3be runtime: support for debugger function calls
This adds a mechanism for debuggers to safely inject calls to Go
functions on amd64. Debuggers must participate in a protocol with the
runtime, and need to know how to lay out a call frame, but the runtime
support takes care of the details of handling live pointers in
registers, stack growth, and detecting the trickier conditions when it
is unsafe to inject a user function call.

Fixes #21678.
Updates derekparker/delve#119.

Change-Id: I56d8ca67700f1f77e19d89e7fc92ab337b228834
Reviewed-on: https://go-review.googlesource.com/109699
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 15:55:05 +00:00
Austin Clements
9f95c9db23 cmd/compile, cmd/internal/obj: record register maps in binary
This adds FUNCDATA and PCDATA that records the register maps much like
the existing live arguments maps and live locals maps. The register
map is indexed independently from the argument and locals maps since
changes in register liveness tend not to correlate with changes to
argument and local liveness.

This is the final CL toward adding safe-points everywhere. The
following CLs will optimize liveness analysis to bring down the cost.
The effect of this CL is:

name        old time/op       new time/op       delta
Template          195ms ± 2%        197ms ± 1%    ~     (p=0.136 n=9+9)
Unicode          98.4ms ± 2%       99.7ms ± 1%  +1.39%  (p=0.004 n=10+10)
GoTypes           685ms ± 1%        700ms ± 1%  +2.06%  (p=0.000 n=9+9)
Compiler          3.28s ± 1%        3.34s ± 0%  +1.71%  (p=0.000 n=9+8)
SSA               7.79s ± 1%        7.91s ± 1%  +1.55%  (p=0.000 n=10+9)
Flate             133ms ± 2%        133ms ± 2%    ~     (p=0.190 n=10+10)
GoParser          161ms ± 2%        164ms ± 3%  +1.83%  (p=0.015 n=10+10)
Reflect           450ms ± 1%        457ms ± 1%  +1.62%  (p=0.000 n=10+10)
Tar               183ms ± 2%        185ms ± 1%  +0.91%  (p=0.008 n=9+10)
XML               234ms ± 1%        238ms ± 1%  +1.60%  (p=0.000 n=9+9)
[Geo mean]        411ms             417ms       +1.40%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.47M ± 0%        1.51M ± 0%  +2.79%  (p=0.000 n=10+10)

Compared to just before "cmd/internal/obj: consolidate emitting entry
stack map", the cumulative effect of adding stack maps everywhere and
register maps is:

name        old time/op       new time/op       delta
Template          185ms ± 2%        197ms ± 1%   +6.42%  (p=0.000 n=10+9)
Unicode          96.3ms ± 3%       99.7ms ± 1%   +3.60%  (p=0.000 n=10+10)
GoTypes           658ms ± 0%        700ms ± 1%   +6.37%  (p=0.000 n=10+9)
Compiler          3.14s ± 1%        3.34s ± 0%   +6.53%  (p=0.000 n=9+8)
SSA               7.41s ± 2%        7.91s ± 1%   +6.71%  (p=0.000 n=9+9)
Flate             126ms ± 1%        133ms ± 2%   +6.15%  (p=0.000 n=10+10)
GoParser          153ms ± 1%        164ms ± 3%   +6.89%  (p=0.000 n=10+10)
Reflect           437ms ± 1%        457ms ± 1%   +4.59%  (p=0.000 n=10+10)
Tar               178ms ± 1%        185ms ± 1%   +4.18%  (p=0.000 n=10+10)
XML               223ms ± 1%        238ms ± 1%   +6.39%  (p=0.000 n=10+9)
[Geo mean]        394ms             417ms        +5.78%

name        old alloc/op      new alloc/op      delta
Template         34.5MB ± 0%       38.0MB ± 0%  +10.19%  (p=0.000 n=10+10)
Unicode          29.3MB ± 0%       30.3MB ± 0%   +3.56%  (p=0.000 n=8+9)
GoTypes           113MB ± 0%        125MB ± 0%  +10.89%  (p=0.000 n=10+10)
Compiler          510MB ± 0%        575MB ± 0%  +12.79%  (p=0.000 n=10+10)
SSA              1.46GB ± 0%       1.64GB ± 0%  +12.40%  (p=0.000 n=10+10)
Flate            23.9MB ± 0%       25.9MB ± 0%   +8.56%  (p=0.000 n=10+10)
GoParser         28.0MB ± 0%       30.8MB ± 0%  +10.08%  (p=0.000 n=10+10)
Reflect          77.6MB ± 0%       84.3MB ± 0%   +8.63%  (p=0.000 n=10+10)
Tar              34.1MB ± 0%       37.0MB ± 0%   +8.44%  (p=0.000 n=10+10)
XML              42.7MB ± 0%       47.2MB ± 0%  +10.75%  (p=0.000 n=10+10)
[Geo mean]       76.0MB            83.3MB        +9.60%

name        old allocs/op     new allocs/op     delta
Template           321k ± 0%         337k ± 0%   +4.98%  (p=0.000 n=10+10)
Unicode            337k ± 0%         340k ± 0%   +1.04%  (p=0.000 n=10+9)
GoTypes           1.13M ± 0%        1.18M ± 0%   +4.85%  (p=0.000 n=10+10)
Compiler          4.67M ± 0%        4.96M ± 0%   +6.25%  (p=0.000 n=10+10)
SSA               11.7M ± 0%        12.3M ± 0%   +5.69%  (p=0.000 n=10+10)
Flate              216k ± 0%         226k ± 0%   +4.52%  (p=0.000 n=10+9)
GoParser           271k ± 0%         283k ± 0%   +4.52%  (p=0.000 n=10+10)
Reflect            927k ± 0%         972k ± 0%   +4.78%  (p=0.000 n=10+10)
Tar                318k ± 0%         333k ± 0%   +4.56%  (p=0.000 n=10+10)
XML                376k ± 0%         395k ± 0%   +5.04%  (p=0.000 n=10+10)
[Geo mean]         730k              764k        +4.61%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.46M ± 0%        1.51M ± 0%   +3.66%  (p=0.000 n=10+10)

For #24543.

Change-Id: I91e003dc64151916b384274884bf02a2d6862547
Reviewed-on: https://go-review.googlesource.com/109353
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 15:55:03 +00:00
isharipo
5437cde96c cmd/asm: enable AVX512
- Uncomment tests for AVX512 encoder
- Permit instruction suffixes for x86
- Permit limited reg list [reg-reg] syntax for x86 for multi-source ops
- EVEX encoding support in obj/x86 (Z-cases, asmevex, etc.)
- optabs and ytabs generated by x86avxgen (https://golang.org/cl/107216)

Note: suffix formatting implemented with updated CConv function.
Now arch asm backend should register formatting function by
calling RegisterOpSuffix.

Updates #22779

Change-Id: I076a167ee49582700e058c56ad74e6696710c8c8
Reviewed-on: https://go-review.googlesource.com/113315
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-05-22 14:57:15 +00:00
Austin Clements
02495da6b6 cmd/internal/obj: consolidate emitting entry stack map
The obj package needs to emit the PCDATA to select the entry stack map
before calling morestack. Currently this is copied for every
architecture. Since we're about to change how this works, consolidate
all of these copies into a single helper function.

For #24543.

Change-Id: Ia92d94de78f8e23fd06dba747c43e03e5989f67b
Reviewed-on: https://go-review.googlesource.com/109346
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 14:43:35 +00:00
isharipo
d2c68bb65f cmd/internal/obj/x86: fix VPERMQ and VPERMPD ytab
Fixes invalid encoding of VPERMQ and VPERMPD that use
negative immediate argument.

Fixes #25418
Updates #25420

Change-Id: Idd8180c4c632a76b76f3a68efd5f930d94431994
Reviewed-on: https://go-review.googlesource.com/113615
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-05-17 16:20:25 +00:00
Tobias Klauser
ca3364836f cmd/internal/objfile, debug/macho: support disassembling arm64 Mach-O objects
Fixes #25423

Change-Id: I6bed0726b8f4c7d607a3df271b2ab1006e96fa75
Reviewed-on: https://go-review.googlesource.com/113356
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-16 15:32:50 +00:00
David Chase
7bac2a95f6 cmd/compile: plumb prologueEnd into DWARF
This marks the first instruction after the prologue for
consumption by debuggers, specifically Delve, who asked
for it.  gdb appears to ignore it, lldb appears to use it.

The bits for end-of-prologue and beginning-of-epilogue
are added to Pos (reducing maximum line number by 4x, to
1048575).  They're added in cmd/internal/obj/<ARCH>.go
(currently x86 only), so the compiler-proper need not
deal with them.

The linker currently does nothing with beginning-of-epilogue,
but the plumbing exists to make it easier in the future.

This also upgrades the line number table to DWARF version 3.

This CL includes a regression in the coverage for
testdata/i22558.gdb-dbg.nexts, this appears to be a gdb
artifact but the fix would be in the preceding CL in the
stack.

Change-Id: I3bda5f46a0ed232d137ad48f65a14835c742c506
Reviewed-on: https://go-review.googlesource.com/110416
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-14 14:12:31 +00:00
David Chase
c2c1822b12 cmd/compile: assign and preserve statement boundaries.
A new pass run after ssa building (before any other
optimization) identifies the "first" ssa node for each
statement. Other "noise" nodes are tagged as being never
appropriate for a statement boundary (e.g., VarKill, VarDef,
Phi).

Rewrite, deadcode, cse, and nilcheck are modified to move
the statement boundaries forward whenever possible if a
boundary-tagged ssa value is removed; never-boundary nodes
are ignored in this search (some operations involving
constants are also tagged as never-boundary and also ignored
because they are likely to be moved or removed during
optimization).

Code generation treats all nodes except those explicitly
marked as statement boundaries as "not statement" nodes,
and floats statement boundaries to the beginning of each
same-line run of instructions found within a basic block.

Line number html conversion was modified to make statement
boundary nodes a bit more obvious by prepending a "+".

The code in fuse.go that glued together the value slices
of two blocks produced a result that depended on the
former capacities (not lengths) of the two slices.  This
causes differences in the 386 bootstrap, and also can
sometimes put values into an order that does a worse job
of preserving statement boundaries when values are removed.

Portions of two delve tests that had caught problems were
incorporated into ssa/debug_test.go.  There are some
opportunities to do better with optimized code, but the
next-ing is not lying or overly jumpy.

Over 4 CLs, compilebench geomean measured binary size
increase of 3.5% and compile user time increase of 3.8%
(this is after optimization to reuse a sparse map instead
of creating multiple maps.)

This CL worsens the optimized-debugging experience with
Delve; we need to work with the delve team so that
they can use the is_stmt marks that we're emitting now.

The reference output changes from time to time depending
on other changes in the compiler, sometimes better,
sometimes worse.

This CL now includes a test ensuring that 99+% of the lines
in the Go command itself (a handy optimized binary) include
is_stmt markers.

Change-Id: I359c94e06843f1eb41f9da437bd614885aa9644a
Reviewed-on: https://go-review.googlesource.com/102435
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-14 14:09:49 +00:00
Ben Shi
bec2f51b07 cmd/internal/obj/arm: fix wrong encoding of MUL
The arm assembler incorrectly encodes the following instructions.
"MUL R2, R4" -> 0xe0040492 ("MUL R4, R2, R4")
"MUL R2, R4, R4" -> 0xe0040492 ("MUL R4, R2, R4")

The CL fixes that issue.

fixes #25347

Change-Id: I883716c7bc51c5f64837ae7d81342f94540a58cb
Reviewed-on: https://go-review.googlesource.com/112737
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-14 01:53:39 +00:00
quasilyte
e405c9cca3 cmd/internal/obj/x86: use named consts for movtab Z-cases
Use 0-terminated opbyte sequences for Zlit-like movtabs instead of E=0xff.

movCodeFullPtr is unused (load full ptr is unsupported), but it should
be removed in a separate CL (if removed at all).

Passes toolstash-check.

Change-Id: I28436718d93b017153de0e50e3bcec344ea4ee05
Reviewed-on: https://go-review.googlesource.com/107076
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-05-11 14:40:34 +00:00
Richard Musiol
bf23a4e61d cmd/internal/obj/wasm: avoid invalid offsets for Load/Store
Offsets for Load and Store instructions have type i32. Bad index
expression offsets can cause an offset to be larger than MaxUint32,
which is not allowed. One example for this is the test test/index0.go.

Generate valid code by adding a guard to the responsible rewrite rule.
Also emit a proper error when using such a bad index in assembly code.

Change-Id: Ie90adcbf3ae3861c26680eb81790f28692913ccf
Reviewed-on: https://go-review.googlesource.com/111955
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-05-10 12:05:17 +00:00
Lynn Boger
28edaf4584 cmd/compile,test: combine byte loads and stores on ppc64le
CL 74410 added rules to combine consecutive byte loads and
stores when the byte order was little endian for ppc64le. This
is the corresponding change for bytes that are in big endian order.
These rules are all intended for a little endian target arch.

This adds new testcases in test/codegen/memcombine.go

Fixes #22496
Updates #24242

Benchmark improvement for encoding/binary:
name                      old time/op    new time/op    delta
ReadSlice1000Int32s-16      11.0µs ± 0%     9.0µs ± 0%  -17.47%  (p=0.029 n=4+4)
ReadStruct-16               2.47µs ± 1%    2.48µs ± 0%   +0.67%  (p=0.114 n=4+4)
ReadInts-16                  642ns ± 1%     630ns ± 1%   -2.02%  (p=0.029 n=4+4)
WriteInts-16                 654ns ± 0%     653ns ± 1%   -0.08%  (p=0.629 n=4+4)
WriteSlice1000Int32s-16     8.75µs ± 0%    8.20µs ± 0%   -6.19%  (p=0.029 n=4+4)
PutUint16-16                1.16ns ± 0%    0.93ns ± 0%  -19.83%  (p=0.029 n=4+4)
PutUint32-16                1.16ns ± 0%    0.93ns ± 0%  -19.83%  (p=0.029 n=4+4)
PutUint64-16                1.85ns ± 0%    0.93ns ± 0%  -49.73%  (p=0.029 n=4+4)
LittleEndianPutUint16-16    1.03ns ± 0%    0.93ns ± 0%   -9.71%  (p=0.029 n=4+4)
LittleEndianPutUint32-16    0.93ns ± 0%    0.93ns ± 0%     ~     (all equal)
LittleEndianPutUint64-16    0.93ns ± 0%    0.93ns ± 0%     ~     (all equal)
PutUvarint32-16             43.0ns ± 0%    43.1ns ± 0%   +0.12%  (p=0.429 n=4+4)
PutUvarint64-16              174ns ± 0%     175ns ± 0%   +0.29%  (p=0.429 n=4+4)

Updates made to functions in gcm.go to enable their matching. An existing
testcase prevents these functions from being replaced by those in encoding/binary
due to import dependencies.

Change-Id: Idb3bd1e6e7b12d86cd828fb29cb095848a3e485a
Reviewed-on: https://go-review.googlesource.com/98136
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-08 13:15:39 +00:00
Austin Clements
44286b17c5 runtime: replace system goroutine whitelist with symbol test
Currently isSystemGoroutine has a hard-coded list of known entry
points into system goroutines. This list is annoying to maintain. For
example, it's missing the ensureSigM goroutine.

Replace it with a check that simply looks for any goroutine with
runtime function as its entry point, with a few exceptions. This also
matches the definition recently added to the trace viewer (CL 81315).

Change-Id: Iaed723d4a6e8c2ffb7c0c48fbac1688b00b30f01
Reviewed-on: https://go-review.googlesource.com/81655
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-07 21:38:40 +00:00
fanzha02
9d72e8c686 cmd/internal/obj/arm64: fix illegal 4-operand instructions accepted arm64 bug
Current assmbler accepts MUL* related instructions with 4 operands,
such as instruction "MUL R1, R2, R3, R4", which is illegal.

The fix adds an actual field informantion to Optab, which has value
of C_NONE, C_REG, etc, so assembler can use p.From3Type for checking
in oplook.

Add test cases.

Fixes #25059

Change-Id: I0656319383c460696b392197bf5960b987f8fc97
Reviewed-on: https://go-review.googlesource.com/109295
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
2018-05-07 15:12:35 +00:00
Josh Bleecher Snyder
c9c73978e2 cmd/compile: use slice extension idiom in LSym.Grow
name        old alloc/op      new alloc/op      delta
Template         35.0MB ± 0%       35.0MB ± 0%  -0.05%  (p=0.008 n=5+5)
Unicode          29.3MB ± 0%       29.3MB ± 0%    ~     (p=0.310 n=5+5)
GoTypes           115MB ± 0%        115MB ± 0%  -0.08%  (p=0.008 n=5+5)
Compiler          519MB ± 0%        519MB ± 0%  -0.08%  (p=0.008 n=5+5)
SSA              1.59GB ± 0%       1.59GB ± 0%  -0.05%  (p=0.008 n=5+5)
Flate            24.2MB ± 0%       24.2MB ± 0%  -0.06%  (p=0.008 n=5+5)
GoParser         28.2MB ± 0%       28.1MB ± 0%  -0.04%  (p=0.016 n=5+5)
Reflect          78.8MB ± 0%       78.7MB ± 0%  -0.10%  (p=0.008 n=5+5)
Tar              34.5MB ± 0%       34.4MB ± 0%  -0.07%  (p=0.008 n=5+5)
XML              43.3MB ± 0%       43.2MB ± 0%  -0.09%  (p=0.008 n=5+5)
[Geo mean]       77.5MB            77.4MB       -0.06%

name        old allocs/op     new allocs/op     delta
Template           330k ± 0%         329k ± 0%  -0.32%  (p=0.008 n=5+5)
Unicode            337k ± 0%         336k ± 0%  -0.10%  (p=0.008 n=5+5)
GoTypes           1.15M ± 0%        1.14M ± 0%  -0.34%  (p=0.008 n=5+5)
Compiler          4.78M ± 0%        4.77M ± 0%  -0.25%  (p=0.008 n=5+5)
SSA               12.9M ± 0%        12.9M ± 0%  -0.12%  (p=0.008 n=5+5)
Flate              221k ± 0%         220k ± 0%  -0.32%  (p=0.008 n=5+5)
GoParser           275k ± 0%         274k ± 0%  -0.34%  (p=0.008 n=5+5)
Reflect            944k ± 0%         940k ± 0%  -0.42%  (p=0.008 n=5+5)
Tar                323k ± 0%         322k ± 0%  -0.31%  (p=0.008 n=5+5)
XML                384k ± 0%         383k ± 0%  -0.26%  (p=0.008 n=5+5)
[Geo mean]         749k              747k       -0.28%


Updates #21266

Change-Id: I926ee3ba009c068239db70cdee8fdf85b5ee6bb4
Reviewed-on: https://go-review.googlesource.com/109816
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-06 14:28:47 +00:00
Richard Musiol
3b137dd2df cmd/compile: add wasm architecture
This commit adds the wasm architecture to the compile command.
A later commit will contain the corresponding linker changes.

Design doc: https://docs.google.com/document/d/131vjr4DH6JFnb-blm_uRdaC0_Nv3OUwjEY5qVCxCup4

The following files are generated:
- src/cmd/compile/internal/ssa/opGen.go
- src/cmd/compile/internal/ssa/rewriteWasm.go
- src/cmd/internal/obj/wasm/anames.go

Updates #18892

Change-Id: Ifb4a96a3e427aac2362a1c97967d5667450fba3b
Reviewed-on: https://go-review.googlesource.com/103295
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-05-04 17:56:12 +00:00
Brad Fitzpatrick
17fbb83693 cmd/go, cmd/compile: use Windows response files to avoid arg length limits
Fixes #18468

Change-Id: Ic88a8daf67db949e5b59f9aa466b37e7f7890713
Reviewed-on: https://go-review.googlesource.com/110395
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-05-04 01:01:50 +00:00
Matthew Dempsky
6f7fa68b91 cmd: re-generate SymKind.String for SDWARFMISC
SDWARFMISC was added in golang.org/cl/93664.

Change-Id: Ifab0a5effd8e64a2b7916004aa35d51030f23d15
Reviewed-on: https://go-review.googlesource.com/111261
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-03 21:52:05 +00:00
Ben Shi
fc48dcb15f cmd/internal/obj/arm64: add more atomic instructions
More atomic instructions were introduced in ARMv8.1. And this CL
adds support for them and corresponding test cases.

LDADD Rs, (Rb), Rt: (Rb) -> Rt, Rs+(Rb) -> (Rb)
LDAND Rs, (Rb), Rt: (Rb) -> Rt, Rs&(Rb) -> (Rb)
LDEOR Rs, (Rb), Rt: (Rb) -> Rt, Rs^(Rb) -> (Rb)
LDOR  Rs, (Rb), Rt: (Rb) -> Rt, Rs|(Rb) -> (Rb)

Change-Id: Ifb9df86583c4dc54fb96274852c3b93a197045e4
Reviewed-on: https://go-review.googlesource.com/110535
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-03 20:27:05 +00:00
Wei Xiao
20102594a0 cmd/compile: intrinsify runtime.getcallerpc on all link register architectures
Add a compiler intrinsic for getcallerpc on following architectures:
  arm
  mips mipsle mips64 mips64le
  ppc64 ppc64le
  s390x

Change-Id: I758f3d4742fc214b206bcd07d90408622c17dbef
Reviewed-on: https://go-review.googlesource.com/110835
Run-TryBot: Wei Xiao <Wei.Xiao@arm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-05-02 16:59:27 +00:00
Daniel Martí
9ecf899b29 cmd: remove some unnecessary gotos
Pick the low-hanging fruit, which are the gotos that don't go very far
and labels that aren't used often. All of them have easy replacements
with breaks and returns.

One slightly tricky rewrite is defaultlitreuse. We cannot use a defer
func to reset lineno, because one of its return paths does not reset
lineno, and thus broke toolstash -cmp.

Passes toolstash -cmp on std cmd.

Change-Id: Id1c0967868d69bb073addc7c5c3017ca91ff966f
Reviewed-on: https://go-review.googlesource.com/110063
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-05-01 10:46:08 +00:00
Balaram Makam
c789ce3f75 cmd/asm: add vector instructions for ChaCha20Poly1305 on ARM64
This change provides VZIP1, VZIP2, VTBL instruction for supporting
ChaCha20Poly1305 implementation later.

Change-Id: Ife7c87b8ab1a6495a444478eeb9d906ae4c5ffa9
Reviewed-on: https://go-review.googlesource.com/110015
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-30 20:07:37 +00:00
Austin Clements
1b44167d05 cmd/internal/obj/arm: fix/rationalize checkpool distance check
When deciding whether to flush the constant pool, the distance check
in checkpool can fail to account for padding inserted before the next
instruction by nacl.

For example, see this failure:
https://go-review.googlesource.com/c/go/+/109350/2#message-07085b591227824bb1d646a7192cbfa7e0b97066
Here, the pool should be flushed before a CALL instruction, but
checkpool only considers the CALL instruction to be 4 bytes and
doesn't account for the 8 extra bytes of alignment padding added
before it by asmoutnacl. As a result, it flushes the pool after the
CALL instruction, which is 4 bytes too late.

Furthermore, there's no explanation for the rather convoluted
expression used to decide if we need to emit the constant pool.

This CL modifies checkpool to take the PC following the tentative
instruction as an argument. The caller knows this already and this way
checkpool doesn't have to guess (and get it wrong in the presence of
padding). In the process, it rewrites the test to be structured and
commented.

Change-Id: I32a3d50ffb5a94d42be943e9bcd49036c7e9b95c
Reviewed-on: https://go-review.googlesource.com/110017
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-30 18:28:22 +00:00
Wei Xiao
bd8a88729c cmd/compile: intrinsify runtime.getcallerpc on arm64
Add a compiler intrinsic for getcallerpc on arm64 for better code generation.

Change-Id: I897e670a2b8ffa1a8c2fdc638f5b2c44bda26318
Reviewed-on: https://go-review.googlesource.com/109276
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-30 13:29:14 +00:00
bill_ofarrell
3c65bb5b90 cmd/asm: add s390x VMSLG instruction
This instruction was introduced on the z14 to accelerate "limbified"
multiplications for certain cryptographic algorithms. This change allows
it to be used in Go assembly.

Change-Id: Ic93dae7fec1756f662874c08a5abc435bce9dd9e
Reviewed-on: https://go-review.googlesource.com/109695
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-28 22:15:25 +00:00
fanzha02
d7f5c0360f cmd/internal/obj/arm64: reorder the assembler's optab entries
Current optab entries are unordered, because the new instructions
are added at the end of the optab. The patch reorders them by comments
in optab, such as arithmetic operations, logical operations and a
series of load/store etc.

The patch removes the VMOVS opcode because FMOVS already has the same
operation.

Change-Id: Iccdf89ecbb3875b9dfcb6e06be2cc19c7e5581a2
Reviewed-on: https://go-review.googlesource.com/109896
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-28 18:10:25 +00:00
Milan Knezevic
2959128dc5 cmd/compile: add softfloat support to mips64{,le}
mips64 softfloat support is based on mips implementation and introduces
new enviroment variable GOMIPS64.

GOMIPS64 is a GOARCH=mips64{,le} specific option, for a choice between
hard-float and soft-float. Valid values are 'hardfloat' (default) and
'softfloat'. It is passed to the assembler as
'GOMIPS64_{hardfloat,softfloat}'.

Change-Id: I7f73078627f7cb37c588a38fb5c997fe09c56134
Reviewed-on: https://go-review.googlesource.com/108475
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-27 14:50:17 +00:00
Josh Bleecher Snyder
62adf6fc2d cmd/internal/obj: convert unicode C to ASCII C
Hex before: d0 a1
Hex after: 43

Not sure where that came from.

Change-Id: I189e7e21f8faf480ba72846b956a149976f720f8
Reviewed-on: https://go-review.googlesource.com/109777
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-27 14:36:31 +00:00
Josh Bleecher Snyder
a76249c38d cmd/compile: increase initial allocation of LSym.R
Not a big win, but cheap.

name        old alloc/op      new alloc/op      delta
Template         34.4MB ± 0%       34.4MB ± 0%  -0.20%  (p=0.000 n=15+15)
Unicode          29.2MB ± 0%       29.3MB ± 0%  +0.17%  (p=0.000 n=15+15)
GoTypes           113MB ± 0%        113MB ± 0%  -0.22%  (p=0.000 n=15+15)
Compiler          509MB ± 0%        508MB ± 0%  -0.11%  (p=0.000 n=15+14)
SSA              1.46GB ± 0%       1.46GB ± 0%  -0.08%  (p=0.000 n=14+15)
Flate            23.8MB ± 0%       23.7MB ± 0%  -0.22%  (p=0.000 n=15+15)
GoParser         27.9MB ± 0%       27.8MB ± 0%  -0.21%  (p=0.000 n=14+15)
Reflect          77.2MB ± 0%       77.0MB ± 0%  -0.27%  (p=0.000 n=14+15)
Tar              34.0MB ± 0%       33.9MB ± 0%  -0.21%  (p=0.000 n=13+15)
XML              42.6MB ± 0%       42.5MB ± 0%  -0.15%  (p=0.000 n=15+15)
[Geo mean]       75.8MB            75.7MB       -0.15%

name        old allocs/op     new allocs/op     delta
Template           322k ± 0%         320k ± 0%  -0.60%  (p=0.000 n=15+15)
Unicode            337k ± 0%         336k ± 0%  -0.23%  (p=0.000 n=12+15)
GoTypes           1.13M ± 0%        1.12M ± 0%  -0.58%  (p=0.000 n=15+14)
Compiler          4.67M ± 0%        4.65M ± 0%  -0.38%  (p=0.000 n=14+15)
SSA               11.7M ± 0%        11.6M ± 0%  -0.25%  (p=0.000 n=15+15)
Flate              216k ± 0%         214k ± 0%  -0.67%  (p=0.000 n=15+15)
GoParser           271k ± 0%         270k ± 0%  -0.57%  (p=0.000 n=15+15)
Reflect            927k ± 0%         920k ± 0%  -0.72%  (p=0.000 n=13+14)
Tar                318k ± 0%         316k ± 0%  -0.57%  (p=0.000 n=15+15)
XML                376k ± 0%         375k ± 0%  -0.46%  (p=0.000 n=14+14)
[Geo mean]         731k              727k       -0.50%

Change-Id: I1417c5881e866fb3efe62a3d0fbe1134275da31a
Reviewed-on: https://go-review.googlesource.com/109755
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-27 04:49:14 +00:00
Balaram Makam
bf6530a25b cmd/asm: add VSRI instruction on ARM64
This change provides VSRI instruction for ChaCha20Poly1305 implementation.

Change-Id: Ifee727b6f3982b629b44a67cac5bbe87ca59027b
Reviewed-on: https://go-review.googlesource.com/109342
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-26 15:03:21 +00:00
Balaram Makam
f524268c40 cmd/compile: optimize ARM64 code with CMN/TST
Use CMN/TST to simplify comparisons. This can reduce the
register pressure by removing single def/use registers for example:
ADDW R0, R1, R8 -> CMNW R1, R0 ; CMN is an alias of ADDS.
CBZW R8, label  -> BEQ  label  ; single def/use of R8 removed.

Little change in performance of go1 benchmark on Amberwing:
name                   old time/op    new time/op    delta
RegexpMatchEasy0_32       247ns ± 0%     246ns ± 0%  -0.40%  (p=0.008 n=5+5)
RegexpMatchEasy0_1K       581ns ± 0%     580ns ± 0%    ~     (p=0.079 n=4+5)
RegexpMatchEasy1_32       244ns ± 0%     243ns ± 0%  -0.41%  (p=0.008 n=5+5)
RegexpMatchEasy1_1K       804ns ± 0%     806ns ± 0%  +0.25%  (p=0.016 n=5+4)
RegexpMatchMedium_32      313ns ± 0%     311ns ± 0%  -0.64%  (p=0.008 n=5+5)
RegexpMatchMedium_1K     52.2µs ± 0%    51.9µs ± 0%  -0.51%  (p=0.008 n=5+5)
RegexpMatchHard_32       2.76µs ± 3%    2.74µs ± 0%    ~     (p=0.683 n=5+5)
RegexpMatchHard_1K       78.8µs ± 0%    78.9µs ± 0%  +0.04%  (p=0.008 n=5+5)
FmtFprintfEmpty          58.6ns ± 0%    57.7ns ± 0%  -1.54%  (p=0.008 n=5+5)
FmtFprintfString          118ns ± 0%     115ns ± 0%  -2.54%  (p=0.008 n=5+5)
FmtFprintfInt             119ns ± 0%     119ns ± 0%    ~     (all equal)
FmtFprintfIntInt          192ns ± 0%     192ns ± 0%    ~     (all equal)
FmtFprintfPrefixedInt     224ns ± 0%     205ns ± 0%  -8.48%  (p=0.008 n=5+5)
FmtFprintfFloat           336ns ± 0%     333ns ± 1%    ~     (p=0.683 n=5+5)
FmtManyArgs               779ns ± 1%     760ns ± 1%  -2.41%  (p=0.008 n=5+5)
Gzip                      437ms ± 0%     436ms ± 0%  -0.27%  (p=0.008 n=5+5)
HTTPClientServer         90.1µs ± 1%    91.1µs ± 0%  +1.19%  (p=0.008 n=5+5)
JSONEncode               20.1ms ± 0%    20.2ms ± 1%    ~     (p=0.690 n=5+5)
JSONDecode               94.5ms ± 1%    94.1ms ± 1%    ~     (p=0.095 n=5+5)
Mandelbrot200            5.37ms ± 0%    5.37ms ± 0%    ~     (p=0.421 n=5+5)
TimeParse                 450ns ± 0%     446ns ± 0%  -0.89%  (p=0.000 n=5+4)
TimeFormat                483ns ± 1%     473ns ± 0%  -2.19%  (p=0.008 n=5+5)
Template                 90.6ms ± 0%    89.7ms ± 0%  -0.93%  (p=0.008 n=5+5)
GoParse                  5.97ms ± 0%    6.01ms ± 0%  +0.65%  (p=0.008 n=5+5)
BinaryTree17              11.8s ± 0%     11.7s ± 0%  -0.28%  (p=0.016 n=5+5)
Revcomp                   669ms ± 0%     669ms ± 0%    ~     (p=0.222 n=5+5)
Fannkuch11                3.28s ± 0%     3.34s ± 0%  +1.72%  (p=0.016 n=4+5)
[Geo mean]               46.6µs         46.3µs       -0.74%

name                   old speed      new speed      delta
RegexpMatchEasy0_32     129MB/s ± 0%   130MB/s ± 0%  +0.32%  (p=0.016 n=5+4)
RegexpMatchEasy0_1K    1.76GB/s ± 0%  1.76GB/s ± 0%  +0.13%  (p=0.016 n=4+5)
RegexpMatchEasy1_32     131MB/s ± 0%   132MB/s ± 0%  +0.32%  (p=0.008 n=5+5)
RegexpMatchEasy1_1K    1.27GB/s ± 0%  1.27GB/s ± 0%  -0.24%  (p=0.016 n=5+4)
RegexpMatchMedium_32   3.19MB/s ± 0%  3.21MB/s ± 0%  +0.63%  (p=0.008 n=5+5)
RegexpMatchMedium_1K   19.6MB/s ± 0%  19.7MB/s ± 0%  +0.51%  (p=0.029 n=4+4)
RegexpMatchHard_32     11.6MB/s ± 2%  11.7MB/s ± 0%    ~     (p=1.000 n=5+5)
RegexpMatchHard_1K     13.0MB/s ± 0%  13.0MB/s ± 0%    ~     (p=0.079 n=4+5)
Gzip                   44.4MB/s ± 0%  44.5MB/s ± 0%  +0.27%  (p=0.008 n=5+5)
JSONEncode             96.4MB/s ± 0%  96.2MB/s ± 1%    ~     (p=0.579 n=5+5)
JSONDecode             20.5MB/s ± 1%  20.6MB/s ± 1%    ~     (p=0.111 n=5+5)
Template               21.4MB/s ± 0%  21.6MB/s ± 0%  +0.94%  (p=0.008 n=5+5)
GoParse                9.70MB/s ± 0%  9.63MB/s ± 0%  -0.68%  (p=0.016 n=4+5)
Revcomp                 380MB/s ± 0%   380MB/s ± 0%    ~     (p=0.222 n=5+5)
[Geo mean]             55.3MB/s       55.4MB/s       +0.23%

Change-Id: I2e5338138991d9bc984e67b51212aa5d1b0f2a6b
Reviewed-on: https://go-review.googlesource.com/97335
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
2018-04-26 14:13:12 +00:00
Carlos Eduardo Seo
ebb67d993a cmd/compile, cmd/internal/obj/ppc64: make math.Round an intrinsic on ppc64x
This change implements math.Round as an intrinsic on ppc64x so it can be
done using a single instruction.

benchmark                   old ns/op     new ns/op     delta
BenchmarkRound-16           2.60          0.69          -73.46%

Change-Id: I9408363e96201abdfc73ced7bcd5f0c29db006a8
Reviewed-on: https://go-review.googlesource.com/109395
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2018-04-26 14:12:09 +00:00
quasilyte
70c5839fe0 cmd/internal/obj/x86: forbid mem args for MOV_DR and MOV_CR
Memory arguments for debug/control register moves are a
minefield for programmer: not useful, but can lead to errors.

See referenced issue for detailed explanation.

Fixes #24981

Change-Id: I918e81cd4a8b1dfcfc9023cdfc3de45abe29e749
Reviewed-on: https://go-review.googlesource.com/107075
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-24 16:09:50 +00:00
Matthew Dempsky
ca2f85fd3f cmd/compile: add indexed export format
This CL introduces a new indexed data format for package export
data. This improves on the previous (sequential) binary format by
allowing the compiler to selectively (and lazily) load only the data
that's actually needed for compilation.

In large Go projects, the package export data can become very large
due to transitive type declaration dependencies and inline
function/method bodies. By lazily loading these declarations and
bodies as needed, we avoid wasting time and memory processing
unnecessary and/or redundant data.

In the benchmarks below, "old" is -iexport=false and "new" is
-iexport=true. The suffixes indicate the compiler concurrency (-c) and
inlining (-l) settings used for the build (using -gcflags=all=-foo).
Benchmarks were run on an HP Z620.

Juju is "go build -a github.com/juju/juju/cmd/...":

name          old real-time/op  new real-time/op  delta
Juju/c=1/l=0        44.0s ± 1%        38.7s ± 9%  -11.97%  (p=0.001 n=7+7)
Juju/c=1/l=4        53.7s ± 3%        45.3s ± 4%  -15.53%  (p=0.001 n=7+7)
Juju/c=4/l=0        39.7s ± 8%        32.0s ± 4%  -19.38%  (p=0.001 n=7+7)
Juju/c=4/l=4        46.3s ± 4%        38.0s ± 4%  -18.06%  (p=0.001 n=7+7)

name          old user-time/op  new user-time/op  delta
Juju/c=1/l=0         371s ± 1%         300s ± 0%  -19.07%  (p=0.001 n=7+6)
Juju/c=1/l=4         482s ± 0%         374s ± 1%  -22.37%  (p=0.001 n=7+7)
Juju/c=4/l=0         410s ± 1%         340s ± 1%  -17.19%  (p=0.001 n=7+7)
Juju/c=4/l=4         532s ± 1%         424s ± 1%  -20.26%  (p=0.001 n=7+7)

name          old sys-time/op   new sys-time/op   delta
Juju/c=1/l=0        33.4s ± 1%        28.4s ± 2%  -15.02%  (p=0.001 n=7+7)
Juju/c=1/l=4        40.7s ± 2%        32.8s ± 3%  -19.51%  (p=0.001 n=7+7)
Juju/c=4/l=0        39.8s ± 2%        34.4s ± 2%  -13.74%  (p=0.001 n=7+7)
Juju/c=4/l=4        48.4s ± 2%        40.4s ± 2%  -16.50%  (p=0.001 n=7+7)

Kubelet is "go build -a k8s.io/kubernetes/cmd/kubelet":

name             old real-time/op  new real-time/op  delta
Kubelet/c=1/l=0        42.0s ± 1%        34.8s ± 1%  -17.27%  (p=0.008 n=5+5)
Kubelet/c=1/l=4        55.4s ± 3%        45.4s ± 3%  -18.06%  (p=0.002 n=6+6)
Kubelet/c=4/l=0        37.4s ± 3%        29.9s ± 1%  -20.25%  (p=0.004 n=6+5)
Kubelet/c=4/l=4        48.1s ± 2%        39.0s ± 5%  -18.93%  (p=0.002 n=6+6)

name             old user-time/op  new user-time/op  delta
Kubelet/c=1/l=0         291s ± 1%         233s ± 1%  -19.96%  (p=0.002 n=6+6)
Kubelet/c=1/l=4         385s ± 1%         298s ± 1%  -22.51%  (p=0.002 n=6+6)
Kubelet/c=4/l=0         325s ± 0%         268s ± 1%  -17.48%  (p=0.004 n=5+6)
Kubelet/c=4/l=4         429s ± 1%         343s ± 1%  -20.08%  (p=0.002 n=6+6)

name             old sys-time/op   new sys-time/op   delta
Kubelet/c=1/l=0        25.1s ± 2%        20.9s ± 4%  -16.69%  (p=0.002 n=6+6)
Kubelet/c=1/l=4        31.2s ± 3%        24.4s ± 0%  -21.67%  (p=0.010 n=6+4)
Kubelet/c=4/l=0        30.2s ± 2%        25.6s ± 1%  -15.34%  (p=0.002 n=6+6)
Kubelet/c=4/l=4        37.3s ± 1%        30.9s ± 2%  -17.11%  (p=0.002 n=6+6)

Change-Id: Ie43eb3bbe1392cbb61c86792a17a57b33b9561f0
Reviewed-on: https://go-review.googlesource.com/106796
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-04-24 01:05:27 +00:00
isharipo
8e3dd8ab88 cmd/internal/obj/x86: faster Assemble for non-NaCl hosts
Make span6 function (used as LinkArch.Assemble) faster
by avoiding redundant re-assemble rounds on hosts
that are not NaCl.

NaCl is excluded because it needs Prog.Isize to fix alignment.

For make.bash, there are around 50% of functions that can
be encoded in a single trip. With this change, those function
will be assembled with 1 round instead of 2.

compilebench results:

    name        old time/op       new time/op       delta
    Template          305ms ± 2%        299ms ± 2%  -1.99%  (p=0.001 n=10+10)
    Unicode           139ms ± 3%        138ms ± 4%    ~     (p=0.222 n=9+9)
    GoTypes           1.05s ± 1%        1.04s ± 1%  -1.34%  (p=0.000 n=10+9)
    Compiler          4.78s ± 1%        4.71s ± 1%  -1.45%  (p=0.000 n=9+9)
    SSA               12.2s ± 1%        12.0s ± 1%  -1.90%  (p=0.000 n=9+10)
    Flate             204ms ± 3%        202ms ± 3%    ~     (p=0.052 n=10+10)
    GoParser          248ms ± 1%        244ms ± 2%  -1.79%  (p=0.000 n=10+9)
    Reflect           671ms ± 1%        664ms ± 1%  -0.96%  (p=0.001 n=9+9)
    Tar               287ms ± 2%        285ms ± 3%    ~     (p=0.393 n=10+10)
    XML               362ms ± 1%        353ms ± 2%  -2.60%  (p=0.000 n=10+9)
    StdCmd            29.2s ± 1%        29.0s ± 1%  -0.63%  (p=0.021 n=10+8)
    [Geo mean]        888ms             875ms       -1.40%

    name        old user-time/op  new user-time/op  delta
    Template          393ms ± 5%        373ms ± 8%  -5.12%  (p=0.013 n=9+10)
    Unicode           185ms ± 6%        184ms ± 5%    ~     (p=0.825 n=10+10)
    GoTypes           1.33s ± 1%        1.31s ± 3%  -1.60%  (p=0.004 n=10+10)
    Compiler          5.98s ± 3%        5.92s ± 1%    ~     (p=0.050 n=10+10)
    SSA               15.5s ± 2%        15.3s ± 0%    ~     (p=0.156 n=10+9)
    Flate             255ms ± 5%        252ms ± 5%    ~     (p=0.362 n=10+10)
    GoParser          309ms ± 1%        304ms ± 3%  -1.79%  (p=0.021 n=7+10)
    Reflect           839ms ± 2%        833ms ± 1%    ~     (p=0.160 n=10+9)
    Tar               363ms ± 3%        358ms ± 4%    ~     (p=0.194 n=8+10)
    XML               446ms ± 3%        442ms ± 3%    ~     (p=0.503 n=10+10)
    [Geo mean]        791ms             779ms       -1.55%

Passes toolstash-check.

Change-Id: Ibcdb09f2c28907932581b7566f46d34be292594b
Reviewed-on: https://go-review.googlesource.com/108895
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-23 20:13:32 +00:00