1
0
mirror of https://github.com/golang/go synced 2024-11-08 18:46:16 -07:00
Commit Graph

70 Commits

Author SHA1 Message Date
Keith Randall
63e964e174 cmd/compile: provide types for all order-allocated temporaries
Ensure that we correctly type the stack temps for regular closures,
method function closures, and slice literals.

Then we don't need to override the dummy types later.
Furthermore, this allows order to reuse temporaries of these types.

OARRAYLIT doesn't need a temporary as far as I can tell, so I
removed that case from order.

Change-Id: Ic58520fa50c90639393ff78f33d3c831d5c4acb9
Reviewed-on: https://go-review.googlesource.com/c/140306
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-15 16:07:52 +00:00
Keith Randall
389e942745 cmd/compile: reuse temporaries in order pass
Instead of allocating a new temporary each time one
is needed, keep a list of temporaries which are free
(have already been VARKILLed on every path) and use
one of them.

Should save a lot of stack space. In a function like this:

func main() {
     fmt.Printf("%d %d\n", 2, 3)
     fmt.Printf("%d %d\n", 4, 5)
     fmt.Printf("%d %d\n", 6, 7)
}

The three [2]interface{} arrays used to hold the ... args
all use the same autotmp, instead of 3 different autotmps
as happened previous to this CL.

Change-Id: I2d728e226f81e05ae68ca8247af62014a1b032d3
Reviewed-on: https://go-review.googlesource.com/c/140301
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-14 05:21:00 +00:00
Keith Randall
0e9f8a21f8 runtime,cmd/compile: pass strings and slices to convT2{E,I} by value
When we pass these types by reference, we usually have to allocate
temporaries on the stack, initialize them, then pass their address
to the conversion functions. It's simpler to pass these types
directly by value.

This particularly applies to conversions needed for fmt.Printf
(to interface{} for constructing a [...]interface{}).

func f(a, b, c string) {
     fmt.Printf("%s %s\n", a, b)
     fmt.Printf("%s %s\n", b, c)
}

This function's stack frame shrinks from 200 to 136 bytes, and
its code shrinks from 535 to 453 bytes.

The go binary shrinks 0.3%.

Update #24286

Aside: for this function f, we don't really need to allocate
temporaries for the convT2E function. We could use the address
of a, b, and c directly. That might get similar (or maybe better?)
improvements. I investigated a bit, but it seemed complicated
to do it safely. This change was much easier.

Change-Id: I78cbe51b501fb41e1e324ce4203f0de56a1db82d
Reviewed-on: https://go-review.googlesource.com/c/135377
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-10-14 03:46:51 +00:00
Keith Randall
9a8372f8bd cmd/compile,runtime: remove ambiguously live logic
The previous CL introduced stack objects. This CL removes the old
ambiguously live liveness analysis. After this CL we're relying
on stack objects exclusively.

Update a bunch of liveness tests to reflect the new world.

Fixes #22350

Change-Id: I739b26e015882231011ce6bc1a7f426049e59f31
Reviewed-on: https://go-review.googlesource.com/c/134156
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-03 19:54:16 +00:00
Kazuhiro Sera
ad644d2e86 all: fix typos detected by github.com/client9/misspell
Change-Id: Iadb3c5de8ae9ea45855013997ed70f7929a88661
GitHub-Last-Rev: ae85bcf82b
GitHub-Pull-Request: golang/go#26920
Reviewed-on: https://go-review.googlesource.com/128955
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-08-23 15:54:07 +00:00
Austin Clements
a367f44c18 cmd/compile: enable stack maps everywhere except unsafe points
This modifies issafepoint in liveness analysis to report almost every
operation as a safe point. There are four things we don't mark as
safe-points:

1. Runtime code (other than at calls).

2. go:nosplit functions (other than at calls).

3. Instructions between the load of the write barrier-enabled flag and
   the write.

4. Instructions leading up to a uintptr -> unsafe.Pointer conversion.

We'll optimize this in later CLs:

name        old time/op       new time/op       delta
Template          185ms ± 2%        190ms ± 2%   +2.95%  (p=0.000 n=10+10)
Unicode          96.3ms ± 3%       96.4ms ± 1%     ~     (p=0.905 n=10+9)
GoTypes           658ms ± 0%        669ms ± 1%   +1.72%  (p=0.000 n=10+9)
Compiler          3.14s ± 1%        3.18s ± 1%   +1.56%  (p=0.000 n=9+10)
SSA               7.41s ± 2%        7.59s ± 1%   +2.48%  (p=0.000 n=9+10)
Flate             126ms ± 1%        128ms ± 1%   +2.08%  (p=0.000 n=10+10)
GoParser          153ms ± 1%        157ms ± 2%   +2.38%  (p=0.000 n=10+10)
Reflect           437ms ± 1%        442ms ± 1%   +0.98%  (p=0.001 n=10+10)
Tar               178ms ± 1%        179ms ± 1%   +0.67%  (p=0.035 n=10+9)
XML               223ms ± 1%        229ms ± 1%   +2.58%  (p=0.000 n=10+10)
[Geo mean]        394ms             401ms        +1.75%

No effect on binary size because we're not yet emitting these extra
safe points.

For #24543.

Change-Id: I16a1eebb9183cad7cef9d53c0fd21a973cad6859
Reviewed-on: https://go-review.googlesource.com/109348
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-05-22 14:43:37 +00:00
Matthew Dempsky
004260afde cmd/compile: open code select{send,recv,default}
Registration now looks like:

        var cases [4]runtime.scases
        var order [8]uint16
	cases[0].kind = caseSend
	cases[0].c = c1
	cases[0].elem = &v1
	if raceenabled || msanenabled {
		selectsetpc(&cases[0])
	}
	cases[1].kind = caseRecv
	cases[1].c = c2
	cases[1].elem = &v2
	if raceenabled || msanenabled {
		selectsetpc(&cases[1])
	}
	...

Change-Id: Ib9bcf426a4797fe4bfd8152ca9e6e08e39a70b48
Reviewed-on: https://go-review.googlesource.com/37934
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-01 03:17:44 +00:00
Matthew Dempsky
3aa53b3135 runtime: eliminate runtime.hselect
Now the registration phase looks like:

    var cases [4]runtime.scases
    var order [8]uint16
    selectsend(&cases[0], c1, &v1)
    selectrecv(&cases[1], c2, &v2, nil)
    selectrecv(&cases[2], c3, &v3, &ok)
    selectdefault(&cases[3])
    chosen := selectgo(&cases[0], &order[0], 4)

Primarily, this is just preparation for having the compiler open-code
selectsend, selectrecv, and selectdefault.

As a minor benefit, order can now be layed out separately on the stack
in the pointer-free segment, so it won't take up space in the
function's stack pointer maps.

Change-Id: I5552ba594201efd31fcb40084da20b42ea569a45
Reviewed-on: https://go-review.googlesource.com/37933
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-01 03:17:31 +00:00
Keith Randall
2413b54888 cmd/compile: mark the first word of an interface as a uintptr
The first word of an interface is a pointer, but for the purposes
of GC we don't need to treat it as such.
 1. If it is a non-empty interface, the pointer points to an itab
    which is always in persistentalloc space.
 2. If it is an empty interface, the pointer points to a _type.
   a. If it is a compile-time-allocated type, it points into
      the read-only data section.
   b. If it is a reflect-allocated type, it points into the Go heap.
      Reflect is responsible for keeping a reference to
      the underlying type so it won't be GCd.

If we ever have a moving GC, we need to change this for 2b (as
well as scan itabs to update their itab._type fields).

Write barriers on the first word of interfaces have already been removed.

Change-Id: I643e91d7ac4de980ac2717436eff94097c65d959
Reviewed-on: https://go-review.googlesource.com/97518
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-02-27 22:58:32 +00:00
Martin Möhrmann
fbfc2031a6 cmd/compile: specialize map creation for small hint sizes
Handle make(map[any]any) and make(map[any]any, hint) where
hint <= BUCKETSIZE special to allow for faster map initialization
and to improve binary size by using runtime calls with fewer arguments.

Given hint is smaller or equal to BUCKETSIZE in which case
overLoadFactor(hint, 0)  is false and no buckets would be allocated by makemap:
* If hmap needs to be allocated on the stack then only hmap's hash0
  field needs to be initialized and no call to makemap is needed.
* If hmap needs to be allocated on the heap then a new special
  makehmap function will allocate hmap and intialize hmap's
  hash0 field.

Reduces size of the godoc by ~36kb.

AMD64
name         old time/op    new time/op    delta
NewEmptyMap    16.6ns ± 2%     5.5ns ± 2%  -66.72%  (p=0.000 n=10+10)
NewSmallMap    64.8ns ± 1%    56.5ns ± 1%  -12.75%  (p=0.000 n=9+10)

Updates #6853

Change-Id: I624e90da6775afaa061178e95db8aca674f44e9b
Reviewed-on: https://go-review.googlesource.com/61190
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-11-02 17:03:45 +00:00
Martin Möhrmann
743117a85e cmd/compile: simplify slice/array range loops for some element sizes
In range loops over slices and arrays besides a variable to track the
index an extra variable containing the address of the current element
is used. To compute a pointer to the next element the elements size is
added to the address.

On 386 and amd64 an element of size 1, 2, 4 or 8 bytes can by copied
from an array using a MOV instruction with suitable addressing mode
that uses the start address of the array, the index of the element and
element size as scaling factor. Thereby, for arrays and slices with
suitable element size we can avoid keeping and incrementing an extra
variable to compute the next elements address.

Shrinks cmd/go by 4 kilobytes.

AMD64:
name                   old time/op    new time/op    delta
BinaryTree17              2.66s ± 7%     2.54s ± 0%  -4.53%  (p=0.000 n=10+8)
Fannkuch11                3.02s ± 1%     3.02s ± 1%    ~     (p=0.579 n=10+10)
FmtFprintfEmpty          45.6ns ± 1%    42.2ns ± 1%  -7.46%  (p=0.000 n=10+10)
FmtFprintfString         69.8ns ± 1%    70.4ns ± 1%  +0.84%  (p=0.041 n=10+10)
FmtFprintfInt            80.1ns ± 1%    79.0ns ± 1%  -1.35%  (p=0.000 n=10+10)
FmtFprintfIntInt          127ns ± 1%     125ns ± 1%  -1.00%  (p=0.007 n=10+9)
FmtFprintfPrefixedInt     158ns ± 2%     152ns ± 1%  -4.11%  (p=0.000 n=10+10)
FmtFprintfFloat           218ns ± 1%     214ns ± 1%  -1.61%  (p=0.000 n=10+10)
FmtManyArgs               508ns ± 1%     504ns ± 1%  -0.93%  (p=0.001 n=9+10)
GobDecode                6.76ms ± 1%    6.78ms ± 1%    ~     (p=0.353 n=10+10)
GobEncode                5.84ms ± 1%    5.77ms ± 1%  -1.31%  (p=0.000 n=10+9)
Gzip                      223ms ± 1%     218ms ± 1%  -2.39%  (p=0.000 n=10+10)
Gunzip                   40.3ms ± 1%    40.4ms ± 3%    ~     (p=0.796 n=10+10)
HTTPClientServer         73.5µs ± 0%    73.3µs ± 0%  -0.28%  (p=0.000 n=10+9)
JSONEncode               12.7ms ± 1%    12.6ms ± 8%    ~     (p=0.173 n=8+10)
JSONDecode               57.5ms ± 1%    56.1ms ± 2%  -2.40%  (p=0.000 n=10+10)
Mandelbrot200            3.80ms ± 1%    3.86ms ± 6%    ~     (p=0.579 n=10+10)
GoParse                  3.25ms ± 1%    3.23ms ± 1%    ~     (p=0.052 n=10+10)
RegexpMatchEasy0_32      74.4ns ± 1%    76.9ns ± 1%  +3.39%  (p=0.000 n=10+10)
RegexpMatchEasy0_1K       243ns ± 2%     248ns ± 1%  +1.86%  (p=0.000 n=10+8)
RegexpMatchEasy1_32      71.0ns ± 2%    72.8ns ± 1%  +2.55%  (p=0.000 n=10+10)
RegexpMatchEasy1_1K       370ns ± 1%     383ns ± 0%  +3.39%  (p=0.000 n=10+9)
RegexpMatchMedium_32      107ns ± 0%     113ns ± 1%  +5.33%  (p=0.000 n=6+10)
RegexpMatchMedium_1K     35.0µs ± 1%    36.0µs ± 1%  +3.13%  (p=0.000 n=10+10)
RegexpMatchHard_32       1.65µs ± 1%    1.69µs ± 1%  +2.23%  (p=0.000 n=10+9)
RegexpMatchHard_1K       49.8µs ± 1%    50.6µs ± 1%  +1.59%  (p=0.000 n=10+10)
Revcomp                   398ms ± 1%     396ms ± 1%  -0.51%  (p=0.043 n=10+10)
Template                 63.4ms ± 1%    60.8ms ± 0%  -4.11%  (p=0.000 n=10+9)
TimeParse                 318ns ± 1%     322ns ± 1%  +1.10%  (p=0.005 n=10+10)
TimeFormat                323ns ± 1%     336ns ± 1%  +4.15%  (p=0.000 n=10+10)

Updates: #15809.

Change-Id: I55915aaf6d26768e12247f8a8edf14e7630726d1
Reviewed-on: https://go-review.googlesource.com/38061
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2017-10-13 14:52:04 +00:00
Michael Munday
744ebfde04 cmd/compile: eliminate stores to unread auto variables
This is a crude compiler pass to eliminate stores to auto variables
that are only ever written to.

Eliminates an unnecessary store to x from the following code:

func f() int {
	var x := 1
	return *(&x)
}

Fixes #19765.

Change-Id: If2c63a8ae67b8c590b6e0cc98a9610939a3eeffa
Reviewed-on: https://go-review.googlesource.com/38746
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-08-24 16:53:56 +00:00
Hugues Bruant
5d6b7fcaa1 runtime: add mapdelete_fast*
Add benchmarks for map delete with int32/int64/string key

Benchmark results on darwin/amd64

name                 old time/op  new time/op  delta
MapDelete/Int32/1-8   151ns ± 8%    99ns ± 3%  -34.39%  (p=0.008 n=5+5)
MapDelete/Int32/2-8   128ns ± 2%   111ns ±15%  -13.40%  (p=0.040 n=5+5)
MapDelete/Int32/4-8   128ns ± 5%   114ns ± 2%  -10.82%  (p=0.008 n=5+5)
MapDelete/Int64/1-8   144ns ± 0%   104ns ± 3%  -27.53%  (p=0.016 n=4+5)
MapDelete/Int64/2-8   153ns ± 1%   126ns ± 3%  -17.17%  (p=0.008 n=5+5)
MapDelete/Int64/4-8   178ns ± 3%   136ns ± 2%  -23.60%  (p=0.008 n=5+5)
MapDelete/Str/1-8     187ns ± 3%   171ns ± 3%   -8.54%  (p=0.008 n=5+5)
MapDelete/Str/2-8     221ns ± 3%   206ns ± 4%   -7.18%  (p=0.016 n=5+4)
MapDelete/Str/4-8     256ns ± 5%   232ns ± 2%   -9.36%  (p=0.016 n=4+5)

name                     old time/op    new time/op    delta
BinaryTree17-8              2.78s ± 7%     2.70s ± 1%    ~     (p=0.151 n=5+5)
Fannkuch11-8                3.21s ± 2%     3.19s ± 1%    ~     (p=0.310 n=5+5)
FmtFprintfEmpty-8          49.1ns ± 3%    50.2ns ± 2%    ~     (p=0.095 n=5+5)
FmtFprintfString-8         78.6ns ± 4%    80.2ns ± 5%    ~     (p=0.460 n=5+5)
FmtFprintfInt-8            79.7ns ± 1%    81.0ns ± 3%    ~     (p=0.103 n=5+5)
FmtFprintfIntInt-8          117ns ± 2%     119ns ± 0%    ~     (p=0.079 n=5+4)
FmtFprintfPrefixedInt-8     153ns ± 1%     146ns ± 3%  -4.19%  (p=0.024 n=5+5)
FmtFprintfFloat-8           239ns ± 1%     237ns ± 1%    ~     (p=0.246 n=5+5)
FmtManyArgs-8               506ns ± 2%     509ns ± 2%    ~     (p=0.238 n=5+5)
GobDecode-8                7.06ms ± 4%    6.86ms ± 1%    ~     (p=0.222 n=5+5)
GobEncode-8                6.01ms ± 5%    5.87ms ± 2%    ~     (p=0.222 n=5+5)
Gzip-8                      246ms ± 4%     236ms ± 1%  -4.12%  (p=0.008 n=5+5)
Gunzip-8                   37.7ms ± 4%    37.3ms ± 1%    ~     (p=0.841 n=5+5)
HTTPClientServer-8         64.9µs ± 1%    64.4µs ± 0%  -0.80%  (p=0.032 n=5+4)
JSONEncode-8               16.0ms ± 2%    16.2ms ±11%    ~     (p=0.548 n=5+5)
JSONDecode-8               53.2ms ± 2%    53.1ms ± 4%    ~     (p=1.000 n=5+5)
Mandelbrot200-8            4.33ms ± 2%    4.32ms ± 2%    ~     (p=0.841 n=5+5)
GoParse-8                  3.24ms ± 2%    3.27ms ± 4%    ~     (p=0.690 n=5+5)
RegexpMatchEasy0_32-8      86.2ns ± 1%    85.2ns ± 3%    ~     (p=0.286 n=5+5)
RegexpMatchEasy0_1K-8       198ns ± 2%     199ns ± 1%    ~     (p=0.310 n=5+5)
RegexpMatchEasy1_32-8      82.6ns ± 2%    81.8ns ± 1%    ~     (p=0.294 n=5+5)
RegexpMatchEasy1_1K-8       359ns ± 2%     354ns ± 1%  -1.39%  (p=0.048 n=5+5)
RegexpMatchMedium_32-8      123ns ± 2%     123ns ± 1%    ~     (p=0.905 n=5+5)
RegexpMatchMedium_1K-8     38.2µs ± 2%    38.6µs ± 8%    ~     (p=0.690 n=5+5)
RegexpMatchHard_32-8       1.92µs ± 2%    1.91µs ± 5%    ~     (p=0.460 n=5+5)
RegexpMatchHard_1K-8       57.6µs ± 1%    57.0µs ± 2%    ~     (p=0.310 n=5+5)
Revcomp-8                   483ms ± 7%     441ms ± 1%  -8.79%  (p=0.016 n=5+4)
Template-8                 58.0ms ± 1%    58.2ms ± 7%    ~     (p=0.310 n=5+5)
TimeParse-8                 324ns ± 6%     312ns ± 2%    ~     (p=0.087 n=5+5)
TimeFormat-8                330ns ± 1%     329ns ± 1%    ~     (p=0.968 n=5+5)

name                     old speed      new speed      delta
GobDecode-8               109MB/s ± 4%   112MB/s ± 1%    ~     (p=0.222 n=5+5)
GobEncode-8               128MB/s ± 5%   131MB/s ± 2%    ~     (p=0.222 n=5+5)
Gzip-8                   78.9MB/s ± 4%  82.3MB/s ± 1%  +4.25%  (p=0.008 n=5+5)
Gunzip-8                  514MB/s ± 4%   521MB/s ± 1%    ~     (p=0.841 n=5+5)
JSONEncode-8              121MB/s ± 2%   120MB/s ±10%    ~     (p=0.548 n=5+5)
JSONDecode-8             36.5MB/s ± 2%  36.6MB/s ± 4%    ~     (p=1.000 n=5+5)
GoParse-8                17.9MB/s ± 2%  17.7MB/s ± 4%    ~     (p=0.730 n=5+5)
RegexpMatchEasy0_32-8     371MB/s ± 1%   375MB/s ± 3%    ~     (p=0.310 n=5+5)
RegexpMatchEasy0_1K-8    5.15GB/s ± 1%  5.13GB/s ± 1%    ~     (p=0.548 n=5+5)
RegexpMatchEasy1_32-8     387MB/s ± 2%   391MB/s ± 1%    ~     (p=0.310 n=5+5)
RegexpMatchEasy1_1K-8    2.85GB/s ± 2%  2.89GB/s ± 1%    ~     (p=0.056 n=5+5)
RegexpMatchMedium_32-8   8.07MB/s ± 2%  8.06MB/s ± 1%    ~     (p=0.730 n=5+5)
RegexpMatchMedium_1K-8   26.8MB/s ± 2%  26.6MB/s ± 7%    ~     (p=0.690 n=5+5)
RegexpMatchHard_32-8     16.7MB/s ± 2%  16.7MB/s ± 5%    ~     (p=0.421 n=5+5)
RegexpMatchHard_1K-8     17.8MB/s ± 1%  18.0MB/s ± 2%    ~     (p=0.310 n=5+5)
Revcomp-8                 527MB/s ± 6%   577MB/s ± 1%  +9.44%  (p=0.016 n=5+4)
Template-8               33.5MB/s ± 1%  33.4MB/s ± 7%    ~     (p=0.310 n=5+5)

Updates #19495

Change-Id: Ib9ece1690813d9b4788455db43d30891e2138df5
Reviewed-on: https://go-review.googlesource.com/38172
Reviewed-by: Hugues Bruant <hugues.bruant@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-21 06:07:24 +00:00
Hugues Bruant
ec091b6af2 runtime: add mapassign_fast*
Add benchmarks for map assignment with int32/int64/string key

Benchmark results on darwin/amd64

name                  old time/op  new time/op  delta
MapAssignInt32_255-8  24.7ns ± 3%  17.4ns ± 2%  -29.75%  (p=0.000 n=10+10)
MapAssignInt32_64k-8  45.5ns ± 4%  37.6ns ± 4%  -17.18%  (p=0.000 n=10+10)
MapAssignInt64_255-8  26.0ns ± 3%  17.9ns ± 4%  -31.03%  (p=0.000 n=10+10)
MapAssignInt64_64k-8  46.9ns ± 5%  38.7ns ± 2%  -17.53%  (p=0.000 n=9+10)
MapAssignStr_255-8    47.8ns ± 3%  24.8ns ± 4%  -48.01%  (p=0.000 n=10+10)
MapAssignStr_64k-8    83.0ns ± 3%  51.9ns ± 3%  -37.45%  (p=0.000 n=10+9)

name                     old time/op    new time/op    delta
BinaryTree17-8              3.11s ±19%     2.78s ± 3%    ~     (p=0.095 n=5+5)
Fannkuch11-8                3.26s ± 1%     3.21s ± 2%    ~     (p=0.056 n=5+5)
FmtFprintfEmpty-8          50.3ns ± 1%    50.8ns ± 2%    ~     (p=0.246 n=5+5)
FmtFprintfString-8         82.7ns ± 4%    80.1ns ± 5%    ~     (p=0.238 n=5+5)
FmtFprintfInt-8            82.6ns ± 2%    81.9ns ± 3%    ~     (p=0.508 n=5+5)
FmtFprintfIntInt-8          124ns ± 4%     121ns ± 3%    ~     (p=0.111 n=5+5)
FmtFprintfPrefixedInt-8     158ns ± 6%     160ns ± 2%    ~     (p=0.341 n=5+5)
FmtFprintfFloat-8           249ns ± 2%     245ns ± 2%    ~     (p=0.095 n=5+5)
FmtManyArgs-8               513ns ± 2%     519ns ± 3%    ~     (p=0.151 n=5+5)
GobDecode-8                7.48ms ±12%    7.11ms ± 2%    ~     (p=0.222 n=5+5)
GobEncode-8                6.25ms ± 1%    6.03ms ± 2%  -3.56%  (p=0.008 n=5+5)
Gzip-8                      252ms ± 4%     252ms ± 4%    ~     (p=1.000 n=5+5)
Gunzip-8                   38.4ms ± 3%    38.6ms ± 2%    ~     (p=0.690 n=5+5)
HTTPClientServer-8         76.9µs ±41%    66.4µs ± 6%    ~     (p=0.310 n=5+5)
JSONEncode-8               16.5ms ± 3%    16.7ms ± 3%    ~     (p=0.421 n=5+5)
JSONDecode-8               54.6ms ± 1%    54.3ms ± 2%    ~     (p=0.548 n=5+5)
Mandelbrot200-8            4.45ms ± 3%    4.47ms ± 1%    ~     (p=0.841 n=5+5)
GoParse-8                  3.43ms ± 1%    3.32ms ± 2%  -3.28%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-8      88.2ns ± 3%    89.4ns ± 2%    ~     (p=0.333 n=5+5)
RegexpMatchEasy0_1K-8       205ns ± 1%     206ns ± 1%    ~     (p=0.905 n=5+5)
RegexpMatchEasy1_32-8      85.1ns ± 1%    85.5ns ± 5%    ~     (p=0.690 n=5+5)
RegexpMatchEasy1_1K-8       365ns ± 1%     371ns ± 9%    ~     (p=1.000 n=5+5)
RegexpMatchMedium_32-8      129ns ± 2%     128ns ± 3%    ~     (p=0.730 n=5+5)
RegexpMatchMedium_1K-8     39.8µs ± 0%    39.7µs ± 4%    ~     (p=0.730 n=4+5)
RegexpMatchHard_32-8       1.99µs ± 3%    2.05µs ±16%    ~     (p=0.794 n=5+5)
RegexpMatchHard_1K-8       59.3µs ± 1%    60.3µs ± 7%    ~     (p=1.000 n=5+5)
Revcomp-8                   1.36s ±63%     0.52s ± 5%    ~     (p=0.095 n=5+5)
Template-8                 62.6ms ±14%    60.5ms ± 5%    ~     (p=0.690 n=5+5)
TimeParse-8                 330ns ± 2%     324ns ± 2%    ~     (p=0.087 n=5+5)
TimeFormat-8                350ns ± 3%     340ns ± 1%  -2.86%  (p=0.008 n=5+5)

name                     old speed      new speed      delta
GobDecode-8               103MB/s ±11%   108MB/s ± 2%    ~     (p=0.222 n=5+5)
GobEncode-8               123MB/s ± 1%   127MB/s ± 2%  +3.71%  (p=0.008 n=5+5)
Gzip-8                   77.1MB/s ± 4%  76.9MB/s ± 3%    ~     (p=1.000 n=5+5)
Gunzip-8                  505MB/s ± 3%   503MB/s ± 2%    ~     (p=0.690 n=5+5)
JSONEncode-8              118MB/s ± 3%   116MB/s ± 3%    ~     (p=0.421 n=5+5)
JSONDecode-8             35.5MB/s ± 1%  35.8MB/s ± 2%    ~     (p=0.397 n=5+5)
GoParse-8                16.9MB/s ± 1%  17.4MB/s ± 2%  +3.45%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-8     363MB/s ± 3%   358MB/s ± 2%    ~     (p=0.421 n=5+5)
RegexpMatchEasy0_1K-8    4.98GB/s ± 1%  4.97GB/s ± 1%    ~     (p=0.548 n=5+5)
RegexpMatchEasy1_32-8     376MB/s ± 1%   375MB/s ± 5%    ~     (p=0.690 n=5+5)
RegexpMatchEasy1_1K-8    2.80GB/s ± 1%  2.76GB/s ± 9%    ~     (p=0.841 n=5+5)
RegexpMatchMedium_32-8   7.73MB/s ± 1%  7.76MB/s ± 3%    ~     (p=0.730 n=5+5)
RegexpMatchMedium_1K-8   25.8MB/s ± 0%  25.8MB/s ± 4%    ~     (p=0.651 n=4+5)
RegexpMatchHard_32-8     16.1MB/s ± 3%  15.7MB/s ±14%    ~     (p=0.794 n=5+5)
RegexpMatchHard_1K-8     17.3MB/s ± 1%  17.0MB/s ± 7%    ~     (p=0.984 n=5+5)
Revcomp-8                 273MB/s ±83%   488MB/s ± 5%    ~     (p=0.095 n=5+5)
Template-8               31.1MB/s ±13%  32.1MB/s ± 5%    ~     (p=0.690 n=5+5)

Updates #19495

Change-Id: I116e9a2a4594769318b22d736464de8a98499909
Reviewed-on: https://go-review.googlesource.com/38091
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-13 23:43:16 +00:00
Matthew Dempsky
c310c688ff cmd/compile, runtime: simplify multiway select implementation
This commit reworks multiway select statements to use normal control
flow primitives instead of the previous setjmp/longjmp-like behavior.
This simplifies liveness analysis and should prevent issues around
"returns twice" function calls within SSA passes.

test/live.go is updated because liveness analysis's CFG is more
representative of actual control flow. The case bodies are the only
real successors of the selectgo call, but previously the selectsend,
selectrecv, etc. calls were included in the successors list too.

Updates #19331.

Change-Id: I7f879b103a4b85e62fc36a270d812f54c0aa3e83
Reviewed-on: https://go-review.googlesource.com/37661
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-07 20:14:17 +00:00
Josh Bleecher Snyder
504bc3ed24 cmd/compile, runtime: specialize convT2x, don't alloc for zero vals
Prior to this CL, all runtime conversions
from a concrete value to an interface went
through one of two runtime calls: convT2E or convT2I.
However, in practice, basic types are very common.
Specializing convT2x for those basic types allows
for a more efficient implementation for those types.
For basic scalars and strings, allocation and copying
can use the same methods as normal code.
For pointer-free types, allocation can occur without
zeroing, and copying can take place without GC calls.
For slices, copying is cheaper and simpler.

This CL adds twelve runtime routines:

convT2E16, convT2I16
convT2E32, convT2I32
convT2E64, convT2I64
convT2Estring, convT2Istring
convT2Eslice, convT2Islice
convT2Enoptr, convT2Inoptr

While compiling make.bash, 93% of all convT2x calls
are now to one of these specialized convT2x call.

Within specialized convT2x routines, it is cheap to check
for a zero value, in a way that it is not in general.
When we detect a zero value there, we return a pointer
to zeroVal, rather than allocating.

name                         old time/op  new time/op  delta
ConvT2Ezero/zero/16-8        17.9ns ± 2%   3.0ns ± 3%  -83.20%  (p=0.000 n=56+56)
ConvT2Ezero/zero/32-8        17.8ns ± 2%   3.0ns ± 3%  -83.15%  (p=0.000 n=59+60)
ConvT2Ezero/zero/64-8        20.1ns ± 1%   3.0ns ± 2%  -84.98%  (p=0.000 n=57+57)
ConvT2Ezero/zero/str-8       32.6ns ± 1%   3.0ns ± 4%  -90.70%  (p=0.000 n=59+60)
ConvT2Ezero/zero/slice-8     36.7ns ± 2%   3.0ns ± 2%  -91.78%  (p=0.000 n=59+59)
ConvT2Ezero/zero/big-8       91.9ns ± 2%  85.9ns ± 2%   -6.52%  (p=0.000 n=57+57)
ConvT2Ezero/nonzero/16-8     17.7ns ± 2%  12.7ns ± 3%  -28.38%  (p=0.000 n=55+60)
ConvT2Ezero/nonzero/32-8     17.8ns ± 1%  12.7ns ± 1%  -28.44%  (p=0.000 n=54+57)
ConvT2Ezero/nonzero/64-8     20.0ns ± 1%  15.0ns ± 1%  -24.90%  (p=0.000 n=56+58)
ConvT2Ezero/nonzero/str-8    32.6ns ± 1%  25.7ns ± 1%  -21.17%  (p=0.000 n=58+55)
ConvT2Ezero/nonzero/slice-8  36.8ns ± 2%  30.4ns ± 1%  -17.32%  (p=0.000 n=60+52)
ConvT2Ezero/nonzero/big-8    92.1ns ± 2%  85.9ns ± 2%   -6.70%  (p=0.000 n=57+59)

Benchmarks on a real program (the compiler):

name       old time/op      new time/op      delta
Template        227ms ± 5%       221ms ± 2%  -2.48%  (p=0.000 n=30+26)
Unicode         102ms ± 5%       100ms ± 3%  -1.30%  (p=0.009 n=30+26)
GoTypes         656ms ± 5%       659ms ± 4%    ~     (p=0.208 n=30+30)
Compiler        2.82s ± 2%       2.82s ± 1%    ~     (p=0.614 n=29+27)
Flate           128ms ± 2%       128ms ± 5%    ~     (p=0.783 n=27+28)
GoParser        158ms ± 3%       158ms ± 3%    ~     (p=0.261 n=28+30)
Reflect         408ms ± 7%       401ms ± 3%    ~     (p=0.075 n=30+30)
Tar             123ms ± 6%       121ms ± 8%    ~     (p=0.287 n=29+30)
XML             220ms ± 2%       220ms ± 4%    ~     (p=0.805 n=29+29)

name       old user-ns/op   new user-ns/op   delta
Template   281user-ms ± 4%  279user-ms ± 3%  -0.87%  (p=0.044 n=28+28)
Unicode    142user-ms ± 4%  141user-ms ± 3%  -1.04%  (p=0.015 n=30+27)
GoTypes    884user-ms ± 3%  886user-ms ± 2%    ~     (p=0.532 n=30+30)
Compiler   3.94user-s ± 3%  3.92user-s ± 1%    ~     (p=0.185 n=30+28)
Flate      165user-ms ± 2%  165user-ms ± 4%    ~     (p=0.780 n=27+29)
GoParser   209user-ms ± 2%  208user-ms ± 3%    ~     (p=0.453 n=28+30)
Reflect    533user-ms ± 6%  526user-ms ± 3%    ~     (p=0.057 n=30+30)
Tar        156user-ms ± 6%  154user-ms ± 6%    ~     (p=0.133 n=29+30)
XML        288user-ms ± 4%  288user-ms ± 4%    ~     (p=0.633 n=30+30)

name       old alloc/op     new alloc/op     delta
Template       41.0MB ± 0%      40.9MB ± 0%  -0.11%  (p=0.000 n=29+29)
Unicode        32.6MB ± 0%      32.6MB ± 0%    ~     (p=0.572 n=29+30)
GoTypes         122MB ± 0%       122MB ± 0%  -0.10%  (p=0.000 n=30+30)
Compiler        482MB ± 0%       481MB ± 0%  -0.07%  (p=0.000 n=30+29)
Flate          26.6MB ± 0%      26.6MB ± 0%    ~     (p=0.096 n=30+30)
GoParser       32.7MB ± 0%      32.6MB ± 0%  -0.06%  (p=0.011 n=28+28)
Reflect        84.2MB ± 0%      84.1MB ± 0%  -0.17%  (p=0.000 n=29+30)
Tar            27.7MB ± 0%      27.7MB ± 0%  -0.05%  (p=0.032 n=27+28)
XML            44.7MB ± 0%      44.7MB ± 0%    ~     (p=0.131 n=28+30)

name       old allocs/op    new allocs/op    delta
Template         373k ± 1%        370k ± 1%  -0.76%  (p=0.000 n=30+30)
Unicode          325k ± 1%        325k ± 1%    ~     (p=0.383 n=29+30)
GoTypes         1.16M ± 0%       1.15M ± 0%  -0.75%  (p=0.000 n=29+30)
Compiler        4.15M ± 0%       4.13M ± 0%  -0.59%  (p=0.000 n=30+29)
Flate            238k ± 1%        237k ± 1%  -0.62%  (p=0.000 n=30+30)
GoParser         304k ± 1%        302k ± 1%  -0.64%  (p=0.000 n=30+28)
Reflect         1.00M ± 0%       0.99M ± 0%  -1.10%  (p=0.000 n=29+30)
Tar              245k ± 1%        244k ± 1%  -0.59%  (p=0.000 n=27+29)
XML              391k ± 1%        389k ± 1%  -0.59%  (p=0.000 n=29+30)

Change-Id: Id7f456d690567c2b0a96b0d6d64de8784b6e305f
Reviewed-on: https://go-review.googlesource.com/36476
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-02-28 19:23:33 +00:00
Robert Griesemer
3fd3171c2c cmd/compile/internal/syntax: removed gcCompat code needed to pass orig. tests
The gcCompat mode was introduced to match the new parser's node position
setup exactly with the positions used by the original parser. Some of the
gcCompat adjustments were required to satisfy syntax error test cases,
and the rest were required to make toolstash cmp pass.

This change removes the former gcCompat adjustments and instead adjusts
the respective test cases as necessary. In some cases this makes the error
lines consistent with the ones reported by gccgo.

Where it has changed, the position associated with a given syntactic construct
is the position (line/col number) of the left-most token belonging to the
construct.

Change-Id: I5b60c00c5999a895c4d6d6e9b383c6405ccf725c
Reviewed-on: https://go-review.googlesource.com/36695
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-02-10 01:22:30 +00:00
Keith Randall
8179b9b462 cmd/compile: make sure output params are live if there is a defer
If there is a defer, and that defer recovers, then the caller
can see all of the output parameters.  That means that we must
mark all the output parameters live at any point which might panic.

If there is no defer then this is not necessary.  This is implemented.

We could also detect whether there is a recover in any of the defers.
If not, we would need to mark only output params that the defer
actually references (and the closure mechanism already does that).
This is not implemented.

Fixes #18860.

Change-Id: If984fe6686eddce9408bf25e725dd17fc16b8578
Reviewed-on: https://go-review.googlesource.com/36030
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-02-03 15:21:47 +00:00
Josh Bleecher Snyder
c682d3239e cmd/compile: convert constants to interfaces without allocating
The order pass is responsible for ensuring that
values passed to runtime functions, including
convT2E/convT2I, are addressable.

Prior to this CL, this was always accomplished
by creating a temp, which frequently escaped to
the heap, causing allocations, perhaps most
notably in code like:

fmt.Println(1, 2, 3) // allocates three times

None of the runtime routines modify the contents
of the pointers they receive, so in the case of
constants, instead of creating a temp value,
we can create a static value.

(Marking the static value as read-only provides
protection against accidental attempts by the runtime
to modify the constant data.)

This improves code generation for code like:

panic("abc")
c <- 2 // c is a chan int

which can now simply refer to "abc" and 2,
rather than going by way of a temporary.

It also allows us to optimize convT2E/convT2I,
by recognizing static readonly values
and directly constructing the interface.

This CL adds ~0.5% to binary size, despite
decreasing the size of many functions,
because it also adds many static symbols.

This binary size regression could be recovered in
future (but currently unplanned) work.

There is a lot of content-duplication in these
symbols; this statement generates six new symbols,
three containing an int 1 and three containing
a pointer to the string "a":

fmt.Println(1, 1, 1, "a", "a", "a")

These symbols could be made content-addressable.

Furthermore, these symbols are small, so the
alignment and naming overhead is large.
As with the go.strings section, these symbols
could be hidden and have their alignment reduced.

The changes to test/live.go make it impossible
(at least with current optimization techniques)
to place the values being passed to the runtime
in static symbols, preserving autotmp creation.

Fixes #18704

Benchmarks from fmt and go-kit's logging package:

github.com/go-kit/kit/log

name                      old time/op    new time/op    delta
JSONLoggerSimple-8          1.91µs ± 2%    2.11µs ±22%     ~     (p=1.000 n=9+10)
JSONLoggerContextual-8      2.60µs ± 6%    2.43µs ± 2%   -6.29%  (p=0.000 n=9+10)
Discard-8                    101ns ± 2%      34ns ±14%  -66.33%  (p=0.000 n=10+9)
OneWith-8                    161ns ± 1%     102ns ±16%  -36.78%  (p=0.000 n=10+10)
TwoWith-8                    175ns ± 3%     106ns ± 7%  -39.36%  (p=0.000 n=10+9)
TenWith-8                    293ns ± 3%     227ns ±15%  -22.44%  (p=0.000 n=9+10)
LogfmtLoggerSimple-8         704ns ± 2%     608ns ± 2%  -13.65%  (p=0.000 n=10+9)
LogfmtLoggerContextual-8     962ns ± 1%     860ns ±17%  -10.57%  (p=0.003 n=9+10)
NopLoggerSimple-8            188ns ± 1%     120ns ± 1%  -36.39%  (p=0.000 n=9+10)
NopLoggerContextual-8        379ns ± 1%     243ns ± 0%  -35.77%  (p=0.000 n=9+10)
ValueBindingTimestamp-8      577ns ± 1%     499ns ± 1%  -13.51%  (p=0.000 n=10+10)
ValueBindingCaller-8         898ns ± 2%     844ns ± 2%   -6.00%  (p=0.000 n=10+10)

name                      old alloc/op   new alloc/op   delta
JSONLoggerSimple-8            904B ± 0%      872B ± 0%   -3.54%  (p=0.000 n=10+10)
JSONLoggerContextual-8      1.20kB ± 0%    1.14kB ± 0%   -5.33%  (p=0.000 n=10+10)
Discard-8                    64.0B ± 0%     32.0B ± 0%  -50.00%  (p=0.000 n=10+10)
OneWith-8                    96.0B ± 0%     64.0B ± 0%  -33.33%  (p=0.000 n=10+10)
TwoWith-8                     160B ± 0%      128B ± 0%  -20.00%  (p=0.000 n=10+10)
TenWith-8                     672B ± 0%      640B ± 0%   -4.76%  (p=0.000 n=10+10)
LogfmtLoggerSimple-8          128B ± 0%       96B ± 0%  -25.00%  (p=0.000 n=10+10)
LogfmtLoggerContextual-8      304B ± 0%      240B ± 0%  -21.05%  (p=0.000 n=10+10)
NopLoggerSimple-8             128B ± 0%       96B ± 0%  -25.00%  (p=0.000 n=10+10)
NopLoggerContextual-8         304B ± 0%      240B ± 0%  -21.05%  (p=0.000 n=10+10)
ValueBindingTimestamp-8       159B ± 0%      127B ± 0%  -20.13%  (p=0.000 n=10+10)
ValueBindingCaller-8          112B ± 0%       80B ± 0%  -28.57%  (p=0.000 n=10+10)

name                      old allocs/op  new allocs/op  delta
JSONLoggerSimple-8            19.0 ± 0%      17.0 ± 0%  -10.53%  (p=0.000 n=10+10)
JSONLoggerContextual-8        25.0 ± 0%      21.0 ± 0%  -16.00%  (p=0.000 n=10+10)
Discard-8                     3.00 ± 0%      1.00 ± 0%  -66.67%  (p=0.000 n=10+10)
OneWith-8                     3.00 ± 0%      1.00 ± 0%  -66.67%  (p=0.000 n=10+10)
TwoWith-8                     3.00 ± 0%      1.00 ± 0%  -66.67%  (p=0.000 n=10+10)
TenWith-8                     3.00 ± 0%      1.00 ± 0%  -66.67%  (p=0.000 n=10+10)
LogfmtLoggerSimple-8          4.00 ± 0%      2.00 ± 0%  -50.00%  (p=0.000 n=10+10)
LogfmtLoggerContextual-8      7.00 ± 0%      3.00 ± 0%  -57.14%  (p=0.000 n=10+10)
NopLoggerSimple-8             4.00 ± 0%      2.00 ± 0%  -50.00%  (p=0.000 n=10+10)
NopLoggerContextual-8         7.00 ± 0%      3.00 ± 0%  -57.14%  (p=0.000 n=10+10)
ValueBindingTimestamp-8       5.00 ± 0%      3.00 ± 0%  -40.00%  (p=0.000 n=10+10)
ValueBindingCaller-8          4.00 ± 0%      2.00 ± 0%  -50.00%  (p=0.000 n=10+10)

fmt

name                             old time/op    new time/op    delta
SprintfPadding-8                   88.9ns ± 3%    79.1ns ± 1%   -11.09%  (p=0.000 n=10+7)
SprintfEmpty-8                     12.6ns ± 3%    12.8ns ± 3%      ~     (p=0.136 n=10+10)
SprintfString-8                    38.7ns ± 5%    26.9ns ± 6%   -30.65%  (p=0.000 n=10+10)
SprintfTruncateString-8            56.7ns ± 2%    47.0ns ± 3%   -17.05%  (p=0.000 n=10+10)
SprintfQuoteString-8                164ns ± 2%     153ns ± 2%    -7.01%  (p=0.000 n=10+10)
SprintfInt-8                       38.9ns ±15%    26.5ns ± 2%   -31.93%  (p=0.000 n=10+9)
SprintfIntInt-8                    60.3ns ± 9%    38.2ns ± 1%   -36.67%  (p=0.000 n=10+8)
SprintfPrefixedInt-8               58.6ns ±13%    51.2ns ±11%   -12.66%  (p=0.001 n=10+10)
SprintfFloat-8                     71.4ns ± 3%    64.2ns ± 3%   -10.08%  (p=0.000 n=8+10)
SprintfComplex-8                    175ns ± 3%     159ns ± 2%    -9.03%  (p=0.000 n=10+10)
SprintfBoolean-8                   33.5ns ± 4%    25.7ns ± 5%   -23.28%  (p=0.000 n=10+10)
SprintfHexString-8                 65.3ns ± 3%    51.7ns ± 5%   -20.86%  (p=0.000 n=10+9)
SprintfHexBytes-8                  67.2ns ± 5%    67.9ns ± 4%      ~     (p=0.383 n=10+10)
SprintfBytes-8                      129ns ± 7%     124ns ± 7%      ~     (p=0.074 n=9+10)
SprintfStringer-8                   127ns ± 4%     126ns ± 8%      ~     (p=0.506 n=9+10)
SprintfStructure-8                  357ns ± 3%     359ns ± 3%      ~     (p=0.469 n=10+10)
ManyArgs-8                          203ns ± 6%     126ns ± 3%   -37.94%  (p=0.000 n=10+10)
FprintInt-8                         119ns ±10%      74ns ± 3%   -37.54%  (p=0.000 n=10+10)
FprintfBytes-8                      122ns ± 4%     120ns ± 3%      ~     (p=0.124 n=10+10)
FprintIntNoAlloc-8                 78.2ns ± 5%    74.1ns ± 3%    -5.28%  (p=0.000 n=10+10)
ScanInts-8                          349µs ± 1%     349µs ± 0%      ~     (p=0.606 n=9+8)
ScanRecursiveInt-8                 43.8ms ± 7%    40.1ms ± 2%    -8.42%  (p=0.000 n=10+10)
ScanRecursiveIntReaderWrapper-8    43.5ms ± 4%    40.4ms ± 2%    -7.16%  (p=0.000 n=10+9)

name                             old alloc/op   new alloc/op   delta
SprintfPadding-8                    24.0B ± 0%     16.0B ± 0%   -33.33%  (p=0.000 n=10+10)
SprintfEmpty-8                      0.00B          0.00B           ~     (all equal)
SprintfString-8                     21.0B ± 0%      5.0B ± 0%   -76.19%  (p=0.000 n=10+10)
SprintfTruncateString-8             32.0B ± 0%     16.0B ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfQuoteString-8                48.0B ± 0%     32.0B ± 0%   -33.33%  (p=0.000 n=10+10)
SprintfInt-8                        16.0B ± 0%      1.0B ± 0%   -93.75%  (p=0.000 n=10+10)
SprintfIntInt-8                     24.0B ± 0%      3.0B ± 0%   -87.50%  (p=0.000 n=10+10)
SprintfPrefixedInt-8                72.0B ± 0%     64.0B ± 0%   -11.11%  (p=0.000 n=10+10)
SprintfFloat-8                      16.0B ± 0%      8.0B ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfComplex-8                    48.0B ± 0%     32.0B ± 0%   -33.33%  (p=0.000 n=10+10)
SprintfBoolean-8                    8.00B ± 0%     4.00B ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfHexString-8                  96.0B ± 0%     80.0B ± 0%   -16.67%  (p=0.000 n=10+10)
SprintfHexBytes-8                    112B ± 0%      112B ± 0%      ~     (all equal)
SprintfBytes-8                      96.0B ± 0%     96.0B ± 0%      ~     (all equal)
SprintfStringer-8                   32.0B ± 0%     32.0B ± 0%      ~     (all equal)
SprintfStructure-8                   256B ± 0%      256B ± 0%      ~     (all equal)
ManyArgs-8                          80.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
FprintInt-8                         8.00B ± 0%     0.00B       -100.00%  (p=0.000 n=10+10)
FprintfBytes-8                      32.0B ± 0%     32.0B ± 0%      ~     (all equal)
FprintIntNoAlloc-8                  0.00B          0.00B           ~     (all equal)
ScanInts-8                         15.2kB ± 0%    15.2kB ± 0%      ~     (p=0.248 n=9+10)
ScanRecursiveInt-8                 21.6kB ± 0%    21.6kB ± 0%      ~     (all equal)
ScanRecursiveIntReaderWrapper-8    21.7kB ± 0%    21.7kB ± 0%      ~     (all equal)

name                             old allocs/op  new allocs/op  delta
SprintfPadding-8                     2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfEmpty-8                       0.00           0.00           ~     (all equal)
SprintfString-8                      2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfTruncateString-8              2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfQuoteString-8                 2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfInt-8                         2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfIntInt-8                      3.00 ± 0%      1.00 ± 0%   -66.67%  (p=0.000 n=10+10)
SprintfPrefixedInt-8                 2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfFloat-8                       2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfComplex-8                     2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfBoolean-8                     2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfHexString-8                   2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfHexBytes-8                    2.00 ± 0%      2.00 ± 0%      ~     (all equal)
SprintfBytes-8                       2.00 ± 0%      2.00 ± 0%      ~     (all equal)
SprintfStringer-8                    4.00 ± 0%      4.00 ± 0%      ~     (all equal)
SprintfStructure-8                   7.00 ± 0%      7.00 ± 0%      ~     (all equal)
ManyArgs-8                           8.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
FprintInt-8                          1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
FprintfBytes-8                       1.00 ± 0%      1.00 ± 0%      ~     (all equal)
FprintIntNoAlloc-8                   0.00           0.00           ~     (all equal)
ScanInts-8                          1.60k ± 0%     1.60k ± 0%      ~     (all equal)
ScanRecursiveInt-8                  1.71k ± 0%     1.71k ± 0%      ~     (all equal)
ScanRecursiveIntReaderWrapper-8     1.71k ± 0%     1.71k ± 0%      ~     (all equal)

Change-Id: I7ba72a25fea4140a0ba40a9f443103ed87cc69b5
Reviewed-on: https://go-review.googlesource.com/35554
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-02-02 21:02:23 +00:00
David Chase
7f1ff65c39 cmd/compile: insert scheduling checks on loop backedges
Loop breaking with a counter.  Benchmarked (see comments),
eyeball checked for sanity on popular loops.  This code
ought to handle loops in general, and properly inserts phi
functions in cases where the earlier version might not have.

Includes test, plus modifications to test/run.go to deal with
timeout and killing looping test.  Tests broken by the addition
of extra code (branch frequency and live vars) for added
checks turn the check insertion off.

If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule
checks on every backedge of every reducible loop.  Alternately,
specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will
enable it for a single compilation, but because the core Go
libraries contain some loops that may run long, this is less
likely to have the desired effect.

This is intended as a tool to help in the study and diagnosis
of GC and other latency problems, now that goal STW GC latency
is on the order of 100 microseconds or less.

Updates #17831.
Updates #10958.

Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639
Reviewed-on: https://go-review.googlesource.com/33910
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-01-09 21:01:29 +00:00
David Chase
9c066bab64 cmd/compile: mark temps with new AutoTemp flag, and use it.
This is an extension of
https://go-review.googlesource.com/c/31662/
to mark all the temporaries, not just the ssa-generated ones.

Before-and-after ls -l `go tool -n compile` shows a 3%
reduction in size (or rather, a prior 3% inflation for
failing to filter temps out properly.)

Replaced name-dependent "is it a temp?" tests with calls to
*Node.IsAutoTmp(), which depends on AutoTemp.  Also replace
calls to istemp(n) with n.IsAutoTmp(), to reduce duplication
and clean up function name space.  Generated temporaries
now come with a "." prefix to avoid (apparently harmless)
clashes with legal Go variable names.

Fixes #17644.
Fixes #17240.

Change-Id: If1417f29c79a7275d7303ddf859b51472890fd43
Reviewed-on: https://go-review.googlesource.com/32255
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-10-31 19:38:50 +00:00
Keith Randall
442de98c14 cmd/compile,runtime: redo how map assignments work
To compile:
  m[k] = v
instead of:
  mapassign(maptype, m, &k, &v), do
do:
  *mapassign(maptype, m, &k) = v

mapassign returns a pointer to the value slot in the map.  It is just
like mapaccess except that it will allocate a new slot if k is not
already present in the map.

This makes map accesses faster but potentially larger (codewise).

It is faster because the write into the map is done when the compiler
knows the concrete type, so it can be done with a few store
instructions instead of calling typedmemmove.  We also potentially
avoid stack temporaries to hold v.

The code can be larger when the map has pointers in its value type,
since there is a write barrier call in addition to the mapassign call.
That makes the code at the callsite a bit bigger (go binary is 0.3%
bigger).

This CL is in preparation for doing operations like m[k] += v with
only a single runtime call.  That will roughly double the speed of
such operations.

Update #17133
Update #5147

Change-Id: Ia435f032090a2ed905dac9234e693972fe8c2dc5
Reviewed-on: https://go-review.googlesource.com/30815
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-12 20:41:23 +00:00
Than McIntosh
6c5e377d23 cmd/compile: relax liveness restrictions on ambiguously live
Update gc liveness to remove special conservative treatment
of ambiguously live vars, since there is no longer a need to
protect against GCDEBUG=gcdead.

Change-Id: Id6e2d03218f7d67911e8436d283005a124e6957f
Reviewed-on: https://go-review.googlesource.com/24896
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
2016-10-03 18:07:32 +00:00
Cherry Zhang
d586aae1f4 test: errorcheck auto-generated functions
Add an "errorcheckwithauto" action which performs error check
including lines with auto-generated functions (excluded by
default). Comment "// ERRORAUTO" matches these lines.

Add testcase for CL 29570 (as an example).

Updates #16016, #17186.

Change-Id: Iaba3727336cd602f3dda6b9e5f97dafe0848e632
Reviewed-on: https://go-review.googlesource.com/29652
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-22 20:10:30 +00:00
Keith Randall
ca4089ad62 cmd/compile: args no longer live until end-of-function
We're dropping this behavior in favor of runtime.KeepAlive.
Implement runtime.KeepAlive as an intrinsic.

Update #15843

Change-Id: Ib60225bd30d6770ece1c3c7d1339a06aa25b1cbc
Reviewed-on: https://go-review.googlesource.com/28310
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-09-19 16:54:35 +00:00
Keith Randall
6129f37367 cmd/compile: inline convT2{I,E} when result doesn't escape
No point in calling a function when we can build the interface
using a known type (or itab) and the address of a local.

Get rid of third arg (preallocated stack space) to convT2{I,E}.

Makes go binary smaller by 0.2%

benchmark                   old ns/op     new ns/op     delta
BenchmarkEfaceInteger-8     16.7          10.1          -39.52%

Update #17118
Update #15375

Change-Id: I9724a1f802bfa1e3957bf1856b55558278e198a2
Reviewed-on: https://go-review.googlesource.com/29373
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-09-19 02:37:08 +00:00
Keith Randall
c199c76cb4 cmd/compile: turn live variable test off for ppc
ppc64 has an extraneous variable live in some situations.
We need a better tighten pass to get rid of this extra variable.
I'm working on it, but fix the test in the meantime.

Fixes build for ppc64.

Change-Id: I1efb9ccb234a64f2a1c228abd2b3195f67fbeb41
Reviewed-on: https://go-review.googlesource.com/29353
Reviewed-by: David Chase <drchase@google.com>
2016-09-16 19:41:42 +00:00
Keith Randall
b265d51789 test,cmd/compile: remove _ssa file suffix
Everything is SSA now.

Update #16357

Change-Id: I436dbe367b863ee81a3695a7d653ba4bfc5b0f6c
Reviewed-on: https://go-review.googlesource.com/29232
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-15 20:47:01 +00:00
Keith Randall
f13701bf2f test: make SSA tests unconditional
Delete legacy backend tests, make SSA tests unconditional.

Next CL will remove _ssa from the file names.

Update #16357

Change-Id: I2a7f5dcbc69455f63b5e6e6b2725df26ab86c8dd
Reviewed-on: https://go-review.googlesource.com/29231
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-15 20:21:26 +00:00
Michael Munday
6ec993adc3 cmd/compile: add SSA backend for s390x and enable by default
The new SSA backend modifies the ABI slightly: R0 is now a usable
general purpose register.

Fixes #16677.

Change-Id: I367435ce921e0c7e79e021c80cf8ef5d1d1466cf
Reviewed-on: https://go-review.googlesource.com/28978
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-13 19:39:38 +00:00
Keith Randall
83c73a85db cmd/compile: ignore contentEscapes for marking nodes as escaping
Redo of CL 28575 with fixed test.
We're in a pre-KeepAlive world for a bit yet, the old tests
were in a client which was in a post-KeepAlive world.

Change-Id: I114fd630339d761ab3306d1d99718d3cb973678d
Reviewed-on: https://go-review.googlesource.com/28582
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-07 06:59:22 +00:00
Brad Fitzpatrick
bdb3b790c6 Revert of cmd/compile: ignore contentEscapes for marking nodes as escaping
Reason for revert: broke the build due to cherrypick;
relies on an unsubmitted parent CL.

Original issue's description:
> cmd/compile: ignore contentEscapes for marking nodes as escaping
> 
> We can still stack allocate and VarKill nodes which don't
> escape but their content does.
> 
> Fixes #16996
> 
> Change-Id: If8aa0fcf2c327b4cb880a3d5af8d213289e6f6bf
> Reviewed-on: https://go-review.googlesource.com/28575
> Run-TryBot: Keith Randall <khr@golang.org>
> TryBot-Result: Gobot Gobot <gobot@golang.org>
> Reviewed-by: David Chase <drchase@google.com>
> 

Change-Id: Ie1a325209de14d70af6acb2d78269b7a0450da7a
Reviewed-on: https://go-review.googlesource.com/28578
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-07 03:23:44 +00:00
Keith Randall
923a74ce77 cmd/compile: ignore contentEscapes for marking nodes as escaping
We can still stack allocate and VarKill nodes which don't
escape but their content does.

Fixes #16996

Change-Id: If8aa0fcf2c327b4cb880a3d5af8d213289e6f6bf
Reviewed-on: https://go-review.googlesource.com/28575
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-09-07 02:07:03 +00:00
Cherry Zhang
7f27f1dfdd cmd/compile: add MIPS64 optimizations, SSA on by default
Add the following optimizations:
- fold constants
- fold address into load/store
- simplify extensions and conditional branches
- remove nil checks

Turn on SSA on MIPS64 by default, and toggle the tests.

Fixes #16359.

Change-Id: I7f1e38c2509e22e42cd024e712990ebbe47176bd
Reviewed-on: https://go-review.googlesource.com/27870
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-26 19:45:06 +00:00
David Chase
5b9ff11c3d cmd/compile: ppc64le working, not optimized enough
This time with the cherry-pick from the proper patch of
the old CL.

Stack size increased.
Corrected NaN-comparison glitches.
Marked g register as clobbered by calls.
Fixed shared libraries.

live_ssa.go still disabled because of differences.
Presumably turning on more optimization will fix
both the stack size and the live_ssa.go glitches.

Enhanced debugging output for shared libs test.

Rebased onto master.

Updates #16010.

Change-Id: I40864faf1ef32c118fb141b7ef8e854498e6b2c4
Reviewed-on: https://go-review.googlesource.com/27159
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-08-18 16:34:47 +00:00
Cherry Zhang
d99cee79b9 [dev.ssa] cmd/compile, etc.: more ARM64 optimizations, and enable SSA by default
Add more ARM64 optimizations:
- use hardware zero register when it is possible.
- use shifted ops.
  The assembler supports shifted ops but not documented, nor knows
  how to print it. This CL adds them.
- enable fast division.
  This was disabled because it makes the old backend generate slower
  code. But with SSA it generates faster code.

Turn on SSA by default, also adjust tests.

Change-Id: I7794479954c83bb65008dcb457bc1e21d7496da6
Reviewed-on: https://go-review.googlesource.com/26950
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-08-15 03:37:34 +00:00
Keith Randall
c069bc4996 [dev.ssa] cmd/compile: implement GO386=387
Last part of the 386 SSA port.

Modify the x86 backend to simulate SSE registers and
instructions with 387 registers and instructions.
The simulation isn't terribly performant, but it works,
and the old implementation wasn't very performant either.
Leaving to people who care about 387 to optimize if they want.

Turn on SSA backend for 386 by default.

Fixes #16358

Change-Id: I678fb59132620b2c47e993c1c10c4c21135f70c0
Reviewed-on: https://go-review.googlesource.com/25271
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-08-10 17:41:01 +00:00
Keith Randall
69a755b602 [dev.ssa] cmd/compile: port SSA backend to amd64p32
It's not a new backend, just a PtrSize==4 modification
of the existing AMD64 backend.

Change-Id: Icc63521a5cf4ebb379f7430ef3f070894c09afda
Reviewed-on: https://go-review.googlesource.com/25586
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-08-09 15:48:26 +00:00
Cherry Zhang
6b6de15d32 [dev.ssa] cmd/compile: support NaCl in SSA for ARM
NaCl code runs in sandbox and there are restrictions for its
instruction uses
(https://developer.chrome.com/native-client/reference/sandbox_internals/arm-32-bit-sandbox).

Like the legacy backend, on NaCl,
- don't use R9, which is used as NaCl's "thread pointer".
- don't use Duff's device.
- don't use indexed load/stores.
- the assembler rewrites DIV/MOD to runtime calls, which on NaCl
  clobbers R12, so R12 is marked as clobbered for DIV/MOD.
- other restrictions are satisfied by the assembler.

Enable SSA specific tests on nacl/arm, and disable non-SSA ones.

Updates #15365.

Change-Id: I9262693ec6756b89ca29d3ae4e52a96fe5403b02
Reviewed-on: https://go-review.googlesource.com/24859
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-07-16 03:13:45 +00:00
Cherry Zhang
42181ad852 [dev.ssa] cmd/compile: enable SSA on ARM by default
As Josh mentioned in CL 24716, there has been requests for using SSA
for ARM. SSA can still be disabled by setting -ssa=0 for cmd/compile,
or partially enabled with GOSSAFUNC, GOSSAPKG, and GOSSAHASH.

Not enable SSA by default on NaCl, which is not supported yet.

Enable SSA-specific tests on ARM: live_ssa.go and nilptr3_ssa.go;
disable non-SSA tests: live.go, nilptr3.go, and slicepot.go.

Updates #15365.

Change-Id: Ic2ca8d166aeca8517b9d262a55e92f2130683a16
Reviewed-on: https://go-review.googlesource.com/23953
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
2016-07-06 15:05:50 +00:00
Emmanuel Odeke
53fd522c0d all: make copyright headers consistent with one space after period
Follows suit with https://go-review.googlesource.com/#/c/20111.

Generated by running
$ grep -R 'Go Authors.  All' * | cut -d":" -f1 | while read F;do perl -pi -e 's/Go
Authors.  All/Go Authors. All/g' $F;done

The code in cmd/internal/unvendor wasn't changed.

Fixes #15213

Change-Id: I4f235cee0a62ec435f9e8540a1ec08ae03b1a75f
Reviewed-on: https://go-review.googlesource.com/21819
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-02 13:43:18 +00:00
David Chase
729abfa35c [dev.ssa] cmd/compile: default compile+test with SSA
Some tests disabled, some bifurcated into _ssa and not,
with appropriate logging added to compiler.

"tests/live.go" in particular needs attention.

SSA-specific testing removed, since it's all SSA now.

Added "-run_skips" option to tests/run.go to simplify
checking whether a test still fails (or how it fails)
on a skipped platform.

The compiler now compiles with SSA by default.
If you don't want SSA, specify GOSSAHASH=n (or N) as
an environment variable.  Function names ending in "_ssa"
are always SSA-compiled.

GOSSAFUNC=fname retains its "SSA for fname, log to ssa.html"
GOSSAPKG=pkg only has an effect when GOSSAHASH=n
GOSSAHASH=10101 etc retains its name-hash-matching behavior
for purposes of debugging.

See #13068

Change-Id: I8217bfeb34173533eaeb391b5f6935483c7d6b43
Reviewed-on: https://go-review.googlesource.com/16299
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
2015-10-30 20:35:20 +00:00
Dmitry Vyukov
7647741246 test: add -update_errors flag to run script
The flag updates error annotations in test files from actual compiler output.
This is useful when doing compiler changes that add/remove/change lots of errors,
or when adding lots of new tests.
Also I noticed at least 2 cases where annotation were sub-optimal:
1. The annotation was "leaking param p" when the actual error is
"leaking param p to result ~r1".
2. The annotation was "leaking param m" when the actual errors
are "leaking param m" and "leaking param mv1".

For now it works only for errorcheck mode.

Also, apply the update to escape and liveness tests.
Some files have gccgo-specific errors of the form "gc error|gccgo error",
so it is risky to run update on all files. Gccgo-specific error
does not necessary contain '|', it can be just truncated.

Change-Id: Iaaae767f859dcb8321a8cb4970b2b70969e8a345
Reviewed-on: https://go-review.googlesource.com/5310
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-04-10 11:33:42 +00:00
Josh Bleecher Snyder
77a2113925 cmd/gc: evaluate concrete == interface without allocating
Consider an interface value i of type I and concrete value c of type C.

Prior to this CL, i==c was evaluated as
	I(c) == i

Evaluating I(c) can allocate.

This CL changes the evaluation of i==c to
	x, ok := i.(C); ok && x == c

The new generated code is shorter and does not allocate directly.

If C is small, as it is in every instance in the stdlib,
the new code also uses less stack space
and makes one runtime call instead of two.

If C is very large, the original implementation is used.
The cutoff for "very large" is 1<<16,
following the stack vs heap cutoff used elsewhere.

This kind of comparison occurs in 38 places in the stdlib,
mostly in the net and os packages.

benchmark                     old ns/op     new ns/op     delta
BenchmarkEqEfaceConcrete      29.5          7.92          -73.15%
BenchmarkEqIfaceConcrete      32.1          7.90          -75.39%
BenchmarkNeEfaceConcrete      29.9          7.90          -73.58%
BenchmarkNeIfaceConcrete      35.9          7.90          -77.99%

Fixes #9370.

Change-Id: I7c4555950bcd6406ee5c613be1f2128da2c9a2b7
Reviewed-on: https://go-review.googlesource.com/2096
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
2015-02-12 22:23:38 +00:00
Dmitry Vyukov
b3be360f16 cmd/gc: allocate non-escaping maps on stack
Extend escape analysis to make(map[k]v).
If it does not escape, allocate temp buffer for hmap and one bucket on stack.

There are 75 cases of non-escaping maps in std lib.

benchmark                                    old allocs     new allocs     delta
BenchmarkConcurrentStmtQuery                 16161          15161          -6.19%
BenchmarkConcurrentTxQuery                   17658          16658          -5.66%
BenchmarkConcurrentTxStmtQuery               16157          15156          -6.20%
BenchmarkConcurrentRandom                    13637          13114          -3.84%
BenchmarkManyConcurrentQueries               22             20             -9.09%
BenchmarkDecodeComplex128Slice               250            188            -24.80%
BenchmarkDecodeFloat64Slice                  250            188            -24.80%
BenchmarkDecodeInt32Slice                    250            188            -24.80%
BenchmarkDecodeStringSlice                   2250           2188           -2.76%
BenchmarkNewEmptyMap                         1              0              -100.00%
BenchmarkNewSmallMap                         2              0              -100.00%

benchmark                old ns/op     new ns/op     delta
BenchmarkNewEmptyMap     124           55.7          -55.08%
BenchmarkNewSmallMap     317           148           -53.31%

benchmark                old allocs     new allocs     delta
BenchmarkNewEmptyMap     1              0              -100.00%
BenchmarkNewSmallMap     2              0              -100.00%

benchmark                old bytes     new bytes     delta
BenchmarkNewEmptyMap     48            0             -100.00%
BenchmarkNewSmallMap     192           0             -100.00%

Fixes #5449

Change-Id: I24fa66f949d2f138885d9e66a0d160240dc9e8fa
Reviewed-on: https://go-review.googlesource.com/3508
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
2015-02-12 09:53:52 +00:00
Russ Cox
9ef4e56108 [dev.garbage] all: merge dev.power64 (7667e41f3ced) into dev.garbage
Now the only difference between dev.cc and dev.garbage
is the runtime conversion on the one side and the
garbage collection on the other. They both have the
same set of changes from default and dev.power64.

LGTM=austin
R=austin
CC=golang-codereviews
https://golang.org/cl/172570043
2014-11-14 12:09:42 -05:00
Russ Cox
75d3f62b3c [dev.garbage] cmd/gc, runtime: add locks around print statements
Now each C printf, Go print, or Go println is guaranteed
not to be interleaved with other calls of those functions.
This should help when debugging concurrent failures.

LGTM=rlh
R=rlh
CC=golang-codereviews
https://golang.org/cl/169120043
2014-11-05 14:42:54 -05:00
Austin Clements
a5e1e1599c [dev.power64] test: "fix" live.go test on power64x
On power64x, this one line in live.go reports that t is live
because of missing optimization passes.  This isn't what this
test is trying to test, so shuffle bad40 so that it still
accomplishes the intent of the test without also depending on
optimization.

LGTM=rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/167110043
2014-11-03 17:25:36 -05:00
Russ Cox
454d1b0e8b cmd/gc: fix call order in array literal of slice literal of make chan
Fixes #8761.

LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, r
https://golang.org/cl/144530045
2014-09-30 12:48:47 -04:00
Russ Cox
fcb4cabba4 cmd/gc: emit write barriers
A write *p = x that needs a write barrier (not all do)
now turns into runtime.writebarrierptr(p, x)
or one of the other variants.

The write barrier implementations are trivial.
The goal here is to emit the calls in the correct places
and to incur the cost of those function calls in the Go 1.4 cycle.

Performance on the Go 1 benchmark suite below.
Remember, the goal is to slow things down (and be correct).

We will look into optimizations in separate CLs, as part of
the process of comparing Go 1.3 against tip in order to make
sure Go 1.4 runs at least as fast as Go 1.3.

benchmark                          old ns/op      new ns/op      delta
BenchmarkBinaryTree17              3118336716     3452876110     +10.73%
BenchmarkFannkuch11                3184497677     3211552284     +0.85%
BenchmarkFmtFprintfEmpty           89.9           107            +19.02%
BenchmarkFmtFprintfString          236            287            +21.61%
BenchmarkFmtFprintfInt             246            278            +13.01%
BenchmarkFmtFprintfIntInt          395            458            +15.95%
BenchmarkFmtFprintfPrefixedInt     343            378            +10.20%
BenchmarkFmtFprintfFloat           477            525            +10.06%
BenchmarkFmtManyArgs               1446           1707           +18.05%
BenchmarkGobDecode                 14398047       14685958       +2.00%
BenchmarkGobEncode                 12557718       12947104       +3.10%
BenchmarkGzip                      453462345      472413285      +4.18%
BenchmarkGunzip                    114226016      115127398      +0.79%
BenchmarkHTTPClientServer          114689         112122         -2.24%
BenchmarkJSONEncode                24914536       26135942       +4.90%
BenchmarkJSONDecode                86832877       103620289      +19.33%
BenchmarkMandelbrot200             4833452        4898780        +1.35%
BenchmarkGoParse                   4317976        4835474        +11.98%
BenchmarkRegexpMatchEasy0_32       150            166            +10.67%
BenchmarkRegexpMatchEasy0_1K       393            402            +2.29%
BenchmarkRegexpMatchEasy1_32       125            142            +13.60%
BenchmarkRegexpMatchEasy1_1K       1010           1236           +22.38%
BenchmarkRegexpMatchMedium_32      232            301            +29.74%
BenchmarkRegexpMatchMedium_1K      76963          102721         +33.47%
BenchmarkRegexpMatchHard_32        3833           5463           +42.53%
BenchmarkRegexpMatchHard_1K        119668         161614         +35.05%
BenchmarkRevcomp                   763449047      706768534      -7.42%
BenchmarkTemplate                  124954724      134834549      +7.91%
BenchmarkTimeParse                 517            511            -1.16%
BenchmarkTimeFormat                501            514            +2.59%

benchmark                         old MB/s     new MB/s     speedup
BenchmarkGobDecode                53.31        52.26        0.98x
BenchmarkGobEncode                61.12        59.28        0.97x
BenchmarkGzip                     42.79        41.08        0.96x
BenchmarkGunzip                   169.88       168.55       0.99x
BenchmarkJSONEncode               77.89        74.25        0.95x
BenchmarkJSONDecode               22.35        18.73        0.84x
BenchmarkGoParse                  13.41        11.98        0.89x
BenchmarkRegexpMatchEasy0_32      213.30       191.72       0.90x
BenchmarkRegexpMatchEasy0_1K      2603.92      2542.74      0.98x
BenchmarkRegexpMatchEasy1_32      254.00       224.93       0.89x
BenchmarkRegexpMatchEasy1_1K      1013.53      827.98       0.82x
BenchmarkRegexpMatchMedium_32     4.30         3.31         0.77x
BenchmarkRegexpMatchMedium_1K     13.30        9.97         0.75x
BenchmarkRegexpMatchHard_32       8.35         5.86         0.70x
BenchmarkRegexpMatchHard_1K       8.56         6.34         0.74x
BenchmarkRevcomp                  332.92       359.62       1.08x
BenchmarkTemplate                 15.53        14.39        0.93x

LGTM=rlh
R=rlh
CC=dvyukov, golang-codereviews, iant, khr, r
https://golang.org/cl/136380043
2014-09-11 12:17:45 -04:00