mirror of https://github.com/golang/go synced 2024-11-18 11:34:45 -07:00
23500 Commits

Author SHA1 Message Date
Rick Hudson
c4fe503119 runtime: reduce thrashing of gs between ps
One important use case is a pipeline computation that passes values
from one Goroutine to the next and then exits or is placed in a
wait state. If GOMAXPROCS > 1 a Goroutine running on P1 will enable
another Goroutine and then immediately make P1 available to execute
it. We need to prevent other Ps from stealing the G that P1 is about
to execute. Otherwise the Gs can thrash between Ps causing unneeded
synchronization and slowing down throughput.

Fix this by changing the stealing logic so that when a P attempts to
steal the only G on some other P's run queue, it will pause
momentarily to allow the victim P to schedule the G.

As part of optimizing stealing we also use a per-P victim queue
to move stolen gs. This eliminates the zeroing of a stack-local victim
queue, which turned out to be expensive.

This CL is a necessary but not sufficient prerequisite to changing
the default value of GOMAXPROCS to something > 1 which is another
CL/discussion.

For highly serialized programs, such as GoroutineRing below this can
make a large difference. For larger and more parallel programs such
as the x/benchmarks there is no noticeable detriment.

~/work/code/src/rsc.io/benchstat/benchstat old.txt new.txt
name                old mean              new mean              delta
GoroutineRing       30.2µs × (0.98,1.01)  30.1µs × (0.97,1.04)     ~    (p=0.941)
GoroutineRing-2      113µs × (0.91,1.07)    30µs × (0.98,1.03)  -73.17% (p=0.004)
GoroutineRing-4      144µs × (0.98,1.02)    32µs × (0.98,1.01)  -77.69% (p=0.000)
GoroutineRingBuf    32.7µs × (0.97,1.03)  32.5µs × (0.97,1.02)     ~    (p=0.795)
GoroutineRingBuf-2   120µs × (0.92,1.08)    33µs × (1.00,1.00)  -72.48% (p=0.004)
GoroutineRingBuf-4   138µs × (0.92,1.06)    33µs × (1.00,1.00)  -76.21% (p=0.003)

The bench benchmarks show little impact.
                 old        new
garbage      7032879    7011696
httpold        25509      25301
splayold     1022073    1019499
jsonold     28230624   28081433
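
The GoroutineRing benchmark itself is not shown in this message. The following is only a rough sketch of that kind of highly serialized workload, assuming a ring of goroutines that hand a single token from one to the next. The name ringPass and its parameters are hypothetical and are not taken from the runtime tests.

```go
package main

import "fmt"

// ringPass links n goroutines (n >= 2) into a ring with channels of the
// given buffer size and passes one token around the ring `laps` times.
// It returns the number of hops the token made. Only one goroutine is
// runnable at any moment, which is the serialized pattern described above.
func ringPass(n, laps, buf int) int {
	chans := make([]chan int, n)
	for i := range chans {
		chans[i] = make(chan int, buf)
	}
	done := make(chan int)
	for i := 0; i < n; i++ {
		in, out := chans[i], chans[(i+1)%n]
		last := i == n-1
		go func() {
			for hops := range in {
				hops++
				if last && hops >= n*laps {
					done <- hops
					return
				}
				out <- hops
			}
		}()
	}
	chans[0] <- 0 // inject the token
	total := <-done
	// The token is gone, so every remaining goroutine is blocked on a
	// receive; closing the channels lets them exit.
	for _, c := range chans {
		close(c)
	}
	return total
}

func main() {
	fmt.Println(ringPass(4, 3, 0)) // prints 12
}
```

Each hop makes exactly one goroutine runnable and then parks the sender, so with GOMAXPROCS > 1 an idle P is tempted to steal the only runnable G.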

Change-Id: I228c48fed8d85c9bbef16a7edc53ab7898506f50
Reviewed-on: https://go-review.googlesource.com/9872
Reviewed-by: Austin Clements <austin@google.com>
2015-05-13 12:55:24 +00:00
Mikio Hara
3b38626f7d net: don't run IP stack required tests on IP stack unimplemented kernels
Fixes #10787.

Change-Id: I35c96808a713dafb1f0fea301fa3f3528fe6a5bf
Reviewed-on: https://go-review.googlesource.com/9948
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
2015-05-13 05:16:19 +00:00
Mikio Hara
6f7961da28 net, internal/syscall/unix: add SocketConn, SocketPacketConn
The FileConn and FilePacketConn APIs accept user-configured socket
descriptors to make them work together with the runtime-integrated network
poller, but there's a limitation: the APIs reject protocol sockets that
are not supported by the standard library. It's very hard for the net and
syscall packages to look after all platform- and feature-specific sockets.

This change allows various platform- and feature-specific socket descriptors
to use the runtime-integrated network poller through the SocketConn and
SocketPacketConn APIs, which bridge between the net and syscall packages and
platforms.

New exposed APIs:
pkg net, func SocketConn(*os.File, SocketAddr) (Conn, error)
pkg net, func SocketPacketConn(*os.File, SocketAddr) (PacketConn, error)
pkg net, type SocketAddr interface { Addr, Raw }
pkg net, type SocketAddr interface, Addr([]uint8) Addr
pkg net, type SocketAddr interface, Raw(Addr) []uint8

Fixes #10565.

Change-Id: Iec57499b3d84bb5cb0bcf3f664330c535eec11e3
Reviewed-on: https://go-review.googlesource.com/9275
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-05-13 01:04:23 +00:00
Ian Lance Taylor
08ba7dbdfd syscall: mkerrors.sh: don't define _FILE_OFFSET_BITS if __LP64__
If __LP64__ is defined then the type "long" is 64-bits, and there is
no need to explicitly request _FILE_OFFSET_BITS == 64.  This changes
the definitions of F_GETLK, F_SETLK, and F_SETLKW on PPC to the values
that the kernel requires.  The values used in C when _FILE_OFFSET_BITS
== 64 are corrected by the glibc fcntl function before making the
system call.

With this change, regenerate ppc64le files on Ubuntu trusty.

Change-Id: I8dddbd8a6bae877efff818f5c5dd06291ade3238
Reviewed-on: https://go-review.googlesource.com/9962
Reviewed-by: Minux Ma <minux@golang.org>
2015-05-13 00:40:49 +00:00
Hyang-Ah (Hana) Kim
a4f4a46c28 misc/cgo/testcshared: fix test for android.
On android the generated header files are located in
pkg/$(go env GOOS)_$(go env GOARCH)_testcshared.
The test was broken since https://go-review.googlesource.com/9798.

The installation path differs based on codegenArgs
(around src/cmd/go/build.go line 389), and the codegenArgs
is platform dependent.

Change-Id: I01ae9cb957fb7676e399f3b8c067f24c5bd20b9d
Reviewed-on: https://go-review.googlesource.com/9980
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-05-12 23:46:33 +00:00
Shenghou Ma
c06b856555 testing: fix typo
Fixes #10794.

Change-Id: Id91485394ddbadc28c800e1d0c3ec281ba6cd098
Reviewed-on: https://go-review.googlesource.com/9990
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-05-12 23:39:00 +00:00
Ian Lance Taylor
1828d03ad5 syscall: mksysnum_linux.pl: run syscall numbers through GCC
This will skip system call numbers that are ifdef'ed out in unistd.h,
as occurs on PPC.

Change-Id: I88e640e4621c7a8cc266433f34a7b4be71543ec9
Reviewed-on: https://go-review.googlesource.com/9966
Reviewed-by: Minux Ma <minux@golang.org>
2015-05-12 22:01:25 +00:00
Austin Clements
350fd548b3 runtime: don't run runq tests on the system stack
Running these tests on the system stack is problematic because they
allocate Ps, which are large enough to overflow the system stack if
they are stack-allocated. It used to be necessary to run these tests
on the system stack because they were written in C, but since this is
no longer the case, we can fix this problem by simply not running the
tests on the system stack.

This also means we no longer need the hack in one of these tests that
forces the allocated Ps to escape to the heap, so eliminate that as
well.

Change-Id: I9064f5f8fd7f7b446ff39a22a70b172cfcb2dc57
Reviewed-on: https://go-review.googlesource.com/9923
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
2015-05-12 19:58:08 +00:00
Russ Cox
5ed4bb6db1 cmd/5g: fix build
The line in cgen.go was lost during the ginscmp CL.
The ggen.go change is not strictly necessary, but
it makes the 5g -S output for x[0] match what it said
before the ginscmp CL.

Change-Id: I5890a9ec1ac69a38509416eda5aea13b8b12b94a
Reviewed-on: https://go-review.googlesource.com/9929
Reviewed-by: Russ Cox <rsc@golang.org>
2015-05-12 19:55:50 +00:00
Andrew Williams
9b379d7e04 syscall: relocate linux death signal code
Fix a bug in Linux SysProcAttr handling: setting both Pdeathsig and
Credential caused Pdeathsig to be ignored. This is because the kernel
clears the death signal field when performing a setuid/setgid
system call.

Avoid this by moving Pdeathsig handling after Credential handling.

Fixes #9686

Change-Id: Id01896ad4e979b8c448e0061f00aa8762ca0ac94
Reviewed-on: https://go-review.googlesource.com/3290
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-05-12 19:34:46 +00:00
Russ Cox
8552047a32 cmd/internal/gc: optimize append + write barrier
The code generated for x = append(x, v) is roughly:

	t := x
	if len(t)+1 > cap(t) {
		t = grow(t)
	}
	t[len(t)] = v
	len(t)++
	x = t

We used to generate this code as Go pseudocode during walk.
Generate it instead as actual instructions during gen.

Doing so lets us apply a few optimizations. The most important
is that when, as in the above example, the source slice and the
destination slice are the same, the code can instead do:

	t := x
	if len(t)+1 > cap(t) {
		t = grow(t)
		x = {base(t), len(t)+1, cap(t)}
	} else {
		len(x)++
	}
	t[len(t)] = v

That is, in the fast path that does not reallocate the array,
only the updated length needs to be written back to x,
not the array pointer and not the capacity. This is more like
what you'd write by hand in C. It's faster in general, since
the fast path elides two of the three stores, but it's especially
faster when the form of x is such that the base pointer write
would turn into a write barrier. No write, no barrier.
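
The difference between the two shapes can be modeled in ordinary Go. This is only a toy sketch: the header type, the grow policy, and the store counting are all assumptions for illustration, and it is not the code the compiler emits.

```go
package main

import "fmt"

// header is a toy model of a slice header (array pointer, len, cap).
type header struct {
	base []int // stands in for the array pointer
	len  int
	cap  int
}

// grow returns a copy of t with a larger backing array (toy policy).
func grow(t header) header {
	nc := 2*t.cap + 1
	nb := make([]int, nc)
	copy(nb, t.base[:t.len])
	return header{nb, t.len, nc}
}

// appendOld always writes all three header words back to x.
// It returns the number of header words stored.
func appendOld(x *header, v int) int {
	t := *x
	if t.len+1 > t.cap {
		t = grow(t)
	}
	t.base[t.len] = v
	t.len++
	*x = t
	return 3
}

// appendNew writes only the length back when no reallocation is needed.
func appendNew(x *header, v int) int {
	t := *x
	stores := 1
	if t.len+1 > t.cap {
		t = grow(t)
		*x = header{t.base, t.len + 1, t.cap}
		stores = 3
	} else {
		x.len++ // fast path: base pointer and cap are untouched
	}
	t.base[t.len] = v
	return stores
}

func main() {
	var a, b header
	oldStores, newStores := 0, 0
	for v := 1; v <= 8; v++ {
		oldStores += appendOld(&a, v)
		newStores += appendNew(&b, v)
	}
	fmt.Println(a.base[:a.len], oldStores)
	fmt.Println(b.base[:b.len], newStores)
}
```

In this model the fast path never stores the base pointer, which corresponds to the case where the real code avoids a write barrier.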

name                   old mean              new mean              delta
BinaryTree17            5.68s × (0.97,1.04)   5.81s × (0.98,1.03)   +2.35% (p=0.023)
Fannkuch11              4.41s × (0.98,1.03)   4.35s × (1.00,1.00)     ~    (p=0.090)
FmtFprintfEmpty        92.7ns × (0.91,1.16)  86.0ns × (0.94,1.11)   -7.31% (p=0.038)
FmtFprintfString        281ns × (0.96,1.08)   276ns × (0.98,1.04)     ~    (p=0.219)
FmtFprintfInt           288ns × (0.97,1.06)   274ns × (0.98,1.06)   -4.94% (p=0.002)
FmtFprintfIntInt        493ns × (0.97,1.04)   506ns × (0.99,1.01)   +2.65% (p=0.009)
FmtFprintfPrefixedInt   423ns × (0.97,1.04)   391ns × (0.99,1.01)   -7.52% (p=0.000)
FmtFprintfFloat         598ns × (0.99,1.01)   566ns × (0.99,1.01)   -5.27% (p=0.000)
FmtManyArgs            1.89µs × (0.98,1.05)  1.91µs × (0.99,1.01)     ~    (p=0.231)
GobDecode              14.8ms × (0.98,1.03)  15.3ms × (0.99,1.02)   +3.01% (p=0.000)
GobEncode              12.3ms × (0.98,1.01)  11.5ms × (0.97,1.03)   -5.93% (p=0.000)
Gzip                    656ms × (0.99,1.05)   645ms × (0.99,1.01)     ~    (p=0.055)
Gunzip                  142ms × (1.00,1.00)   142ms × (1.00,1.00)   -0.32% (p=0.034)
HTTPClientServer       91.2µs × (0.97,1.04)  90.5µs × (0.97,1.04)     ~    (p=0.468)
JSONEncode             32.6ms × (0.97,1.08)  32.0ms × (0.98,1.03)     ~    (p=0.190)
JSONDecode              114ms × (0.97,1.05)   114ms × (0.99,1.01)     ~    (p=0.887)
Mandelbrot200          6.11ms × (0.98,1.04)  6.04ms × (1.00,1.01)     ~    (p=0.167)
GoParse                6.66ms × (0.97,1.04)  6.47ms × (0.97,1.05)   -2.81% (p=0.014)
RegexpMatchEasy0_32     159ns × (0.99,1.00)   171ns × (0.93,1.07)   +7.19% (p=0.002)
RegexpMatchEasy0_1K     538ns × (1.00,1.01)   550ns × (0.98,1.01)   +2.30% (p=0.000)
RegexpMatchEasy1_32     138ns × (1.00,1.00)   135ns × (0.99,1.02)   -1.60% (p=0.000)
RegexpMatchEasy1_1K     869ns × (0.99,1.01)   879ns × (1.00,1.01)   +1.08% (p=0.000)
RegexpMatchMedium_32    252ns × (0.99,1.01)   243ns × (1.00,1.00)   -3.71% (p=0.000)
RegexpMatchMedium_1K   72.7µs × (1.00,1.00)  70.3µs × (1.00,1.00)   -3.34% (p=0.000)
RegexpMatchHard_32     3.85µs × (1.00,1.00)  3.82µs × (1.00,1.01)   -0.81% (p=0.000)
RegexpMatchHard_1K      118µs × (1.00,1.00)   117µs × (1.00,1.00)   -0.56% (p=0.000)
Revcomp                 920ms × (0.97,1.07)   917ms × (0.97,1.04)     ~    (p=0.808)
Template                129ms × (0.98,1.03)   114ms × (0.99,1.01)  -12.06% (p=0.000)
TimeParse               619ns × (0.99,1.01)   622ns × (0.99,1.01)     ~    (p=0.062)
TimeFormat              661ns × (0.98,1.04)   665ns × (0.99,1.01)     ~    (p=0.524)

See next CL for combination with a similar optimization for slice.
The benchmarks that are slower in this CL are still faster overall
with the combination of the two.

Change-Id: I2a7421658091b2488c64741b4db15ab6c3b4cb7e
Reviewed-on: https://go-review.googlesource.com/9812
Reviewed-by: David Chase <drchase@google.com>
2015-05-12 17:55:09 +00:00
Russ Cox
f8d14fc3a0 cmd/internal/gc: add backend ginscmp function to emit a comparison
This lets us abstract away which arguments can be constants and so on
and lets the back ends reverse the order of arguments if that helps.

Change-Id: I283ec1d694f2dd84eba22e5eb4aad78a2d2d9eb0
Reviewed-on: https://go-review.googlesource.com/9810
Reviewed-by: David Chase <drchase@google.com>
2015-05-12 17:54:57 +00:00
Rob Pike
6439010e52 encoding/gob: add "too big" check when writing a message
Messages that are too big are rejected when read, so they should
be rejected when written too.

Fixes #10518.

Change-Id: I96678fbe2d94f51b957fe26faef33cd8df3823dd
Reviewed-on: https://go-review.googlesource.com/9965
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-05-12 17:52:39 +00:00
David du Colombier
7de86a1b1c runtime: terminate exit status buffer on Plan 9
The status buffer built by the exit function
was not nil-terminated.

Fixes #10789.

Change-Id: I2d34ac50a19d138176c4b47393497ba7070d5b61
Reviewed-on: https://go-review.googlesource.com/9953
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
2015-05-12 16:35:58 +00:00
David du Colombier
f85a05581e runtime: fix signal handling on Plan 9
Once added to the signal queue, the pointer passed to the
signal handler might no longer be valid. Instead of passing
the pointer to the note string, we copy the value of the
note string to a static array in the signal queue.

Fixes #10784.

Change-Id: Iddd6837b58a14dfaa16b069308ae28a7b8e0965b
Reviewed-on: https://go-review.googlesource.com/9950
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-05-12 16:35:46 +00:00
Russ Cox
18d98bc9cb cmd/internal/gc: avoid turning 'x = f()' into 'tmp = f(); x = tmp' for simple x
This slows down more things than I expected, but it also speeds things up,
and it reduces stack frame sizes and the load on the optimizer, so it's still
likely a net win.

name                                    old mean                new mean        delta
BenchmarkBinaryTree17              13.2s × (0.98,1.03)     13.2s × (0.98,1.02)  ~ (p=0.795)
BenchmarkFannkuch11                4.41s × (1.00,1.00)     4.45s × (0.99,1.01)  +0.88% (p=0.000)
BenchmarkFmtFprintfEmpty          86.4ns × (0.99,1.01)    90.1ns × (0.95,1.05)  +4.31% (p=0.000)
BenchmarkFmtFprintfString          318ns × (0.96,1.07)     337ns × (0.98,1.03)  +6.05% (p=0.000)
BenchmarkFmtFprintfInt             332ns × (0.97,1.04)     320ns × (0.97,1.02)  -3.42% (p=0.000)
BenchmarkFmtFprintfIntInt          562ns × (0.96,1.04)     574ns × (0.96,1.06)  +2.00% (p=0.013)
BenchmarkFmtFprintfPrefixedInt     442ns × (0.96,1.06)     450ns × (0.97,1.05)  +1.73% (p=0.039)
BenchmarkFmtFprintfFloat           640ns × (0.99,1.02)     659ns × (0.99,1.03)  +3.01% (p=0.000)
BenchmarkFmtManyArgs              2.19µs × (0.97,1.06)    2.21µs × (0.98,1.02)  ~ (p=0.104)
BenchmarkGobDecode                20.0ms × (0.98,1.03)    19.7ms × (0.97,1.04)  -1.35% (p=0.035)
BenchmarkGobEncode                17.8ms × (0.96,1.04)    18.0ms × (0.96,1.06)  ~ (p=0.131)
BenchmarkGzip                      653ms × (0.99,1.02)     652ms × (0.99,1.01)  ~ (p=0.572)
BenchmarkGunzip                    143ms × (0.99,1.02)     142ms × (1.00,1.01)  -0.52% (p=0.005)
BenchmarkHTTPClientServer          110µs × (0.98,1.03)     108µs × (0.99,1.02)  -1.90% (p=0.000)
BenchmarkJSONEncode               40.0ms × (0.98,1.05)    41.5ms × (0.97,1.06)  +3.89% (p=0.000)
BenchmarkJSONDecode                118ms × (0.99,1.01)     118ms × (0.98,1.01)  +0.69% (p=0.010)
BenchmarkMandelbrot200            6.03ms × (1.00,1.01)    6.03ms × (1.00,1.01)  ~ (p=0.924)
BenchmarkGoParse                  8.43ms × (0.92,1.11)    8.56ms × (0.93,1.05)  ~ (p=0.242)
BenchmarkRegexpMatchEasy0_32       180ns × (0.91,1.07)     163ns × (1.00,1.00)  -9.33% (p=0.000)
BenchmarkRegexpMatchEasy0_1K       550ns × (0.98,1.02)     558ns × (0.99,1.01)  +1.44% (p=0.000)
BenchmarkRegexpMatchEasy1_32       152ns × (0.94,1.05)     139ns × (0.98,1.02)  -8.51% (p=0.000)
BenchmarkRegexpMatchEasy1_1K       909ns × (0.98,1.06)     868ns × (0.99,1.02)  -4.52% (p=0.000)
BenchmarkRegexpMatchMedium_32      262ns × (0.97,1.03)     253ns × (0.99,1.02)  -3.31% (p=0.000)
BenchmarkRegexpMatchMedium_1K     73.8µs × (0.98,1.04)    72.7µs × (1.00,1.01)  -1.61% (p=0.001)
BenchmarkRegexpMatchHard_32       3.87µs × (0.99,1.02)    3.87µs × (1.00,1.01)  ~ (p=0.791)
BenchmarkRegexpMatchHard_1K        118µs × (0.98,1.04)     117µs × (0.99,1.02)  ~ (p=0.110)
BenchmarkRevcomp                   1.00s × (0.94,1.10)     0.99s × (0.94,1.09)  ~ (p=0.433)
BenchmarkTemplate                  140ms × (0.97,1.04)     140ms × (0.99,1.01)  ~ (p=0.303)
BenchmarkTimeParse                 622ns × (0.99,1.02)     625ns × (0.99,1.01)  +0.51% (p=0.001)
BenchmarkTimeFormat                731ns × (0.98,1.04)     719ns × (0.99,1.01)  -1.66% (p=0.000)

Change-Id: Ibc3edb59a178adafda50156f46a341f69a17d83f
Reviewed-on: https://go-review.googlesource.com/9721
Reviewed-by: David Chase <drchase@google.com>
2015-05-12 16:26:47 +00:00
Russ Cox
3f209abb29 cmd/internal/gc: detect bad append(f()) during type check
Today's earlier fix can stay, but it's a band-aid over the real problem,
which is that bad code was slipping through the type checker
into the back end (and luckily causing a type error there).

I discovered this because my new append does not use the same
temporaries and failed the test as written.

Fixes #9521.

Change-Id: I7e33e2ea15743406e15c6f3fdf73e1edecda69bd
Reviewed-on: https://go-review.googlesource.com/9921
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-05-12 16:26:35 +00:00
Jens Frederich
29dc4b40f8 cmd/go: "go get" don't ignore git default branch
Any Git branch can be the default branch, not only master. Removing the
hardwired 'checkout master' and using 'checkout {tag}' instead is the best
choice. It works with and without a master branch. Furthermore, it
resolves the GitHub default branch issue: changing the GitHub default
branch is effectively changing HEAD.

Fixes #9032

Change-Id: I19a1221bcefe0806e7556c124c6da7ac0c2160b5
Reviewed-on: https://go-review.googlesource.com/5312
Reviewed-by: Russ Cox <rsc@golang.org>
2015-05-12 16:12:46 +00:00
Patrick Mezard
51021cc83f time: fix registry zone info lookup on Windows
registry.ReadSubKeyNames requires the QUERY access right in addition to
ENUMERATE_SUB_KEYS.

This was making TestLocalZoneAbbr fail on Windows 7 in the Paris/Madrid
timezone. It succeeded on Windows 8 because the timezone name changed from
"Paris/Madrid" to "Romance Standard Time", the latter being matched by
an abbrs entry.

Change-Id: I791287ba9d1b3556246fa4e9e1604a1fbba1f5e6
Reviewed-on: https://go-review.googlesource.com/9809
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-05-12 14:28:40 +00:00
Alex Brainman
71bf182028 net: relax error checking in TestAcceptIgnoreSomeErrors
TestAcceptIgnoreSomeErrors was created to test that the network
accept function ignores some errors. But the conditions created
by the test also affect network reads. Change the test to
ignore these read errors when acceptable.

Fixes #10785

Change-Id: I3da85cb55bd3e78c1980ad949e53e82391f9b41e
Reviewed-on: https://go-review.googlesource.com/9942
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-05-12 04:02:25 +00:00
Shenghou Ma
7bbd4f780b syscall: fix running mkall.sh on linux/{ppc64,ppc64le}
Change-Id: I58c6e914d0e977d5748c87d277e30c933ed86f99
Reviewed-on: https://go-review.googlesource.com/9924
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-05-12 01:54:37 +00:00
Michael Hudson-Doyle
77fc03f4cd cmd/internal/ld, runtime: abort on shared library ABI mismatch
This:

1) Defines the ABI hash of a package (as the SHA1 of the __.PKGDEF)
2) Defines the ABI hash of a shared library (sort the packages by import
   path, concatenate the hashes of the packages and SHA1 that)
3) When building a shared library, compute the above value and define a
   global symbol that points to a go string that has the hash as its value.
4) When linking against a shared library, read the ABI hash from the
   library and put both the value seen at link time and a reference
   to the global symbol into the moduledata.
5) During runtime initialization, check that the hash seen at link time
   still matches the hash the global symbol points to.

Change-Id: Iaa54c783790e6dde3057a2feadc35473d49614a5
Reviewed-on: https://go-review.googlesource.com/8773
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
2015-05-12 01:30:40 +00:00
Michael Hudson-Doyle
be0cb9224b runtime: fix addmoduledata to follow the platform ABI
addmoduledata is called from a .init_array function and needs to follow the
platform ABI. It contains accesses to global data which are rewritten to use
R15 by the assembler, and as R15 is callee-save we need to save it.

Change-Id: I03893efb1576aed4f102f2465421f256f3bb0f30
Reviewed-on: https://go-review.googlesource.com/9941
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-05-12 00:50:32 +00:00
Mikio Hara
64b1aa12b3 net: drop unnecessary cast
Change-Id: I9b058472f5b4943db6e6f1c1243411ce61624c18
Reviewed-on: https://go-review.googlesource.com/9916
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-05-11 22:03:56 +00:00
Rahul Chaudhry
754e98cb82 cmd/dist: de-dup iOS detection
Change-Id: I89778988baec1cf4a35d9342c7dbe8c4c08ff3cd
Reviewed-on: https://go-review.googlesource.com/9893
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2015-05-11 20:42:57 +00:00
Alan Donovan
b1d144e158 go/constant: rename go/constants
Change-Id: I4b1ce33253890de9bc64fee9b476fe52eec87fc0
Reviewed-on: https://go-review.googlesource.com/9920
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
2015-05-11 19:37:41 +00:00
Alexandre Cesaro
2b03610842 mime: Export RFC 2047 code
Fixes #4943
Fixes #4687
Fixes #7079

Change-Id: Ia96f07d650a3af935cd75fd7e3253f4af2977429
Reviewed-on: https://go-review.googlesource.com/7890
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
2015-05-11 18:50:32 +00:00
Rob Pike
e3a9a08a0b fmt: allow for space and plus flags when computing widths
Fixes #10770.
Fixes #10771.

This time maybe for sure?

Change-Id: I43d6e5fd6846cf58427fec183832d500a932df59
Reviewed-on: https://go-review.googlesource.com/9896
Reviewed-by: Russ Cox <rsc@golang.org>
2015-05-11 18:34:19 +00:00
Josh Bleecher Snyder
e92a7247fa fmt: skip malloc test under race detector
Fixes #10778.

Change-Id: I09aab55dec429ec4a023e5ad591b929563cef0d9
Reviewed-on: https://go-review.googlesource.com/9855
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-05-11 17:45:26 +00:00
Didier Spezia
7c0db1b7e2 cmd/gc: do not display ~b identifiers in error messages
Instead of errors like:

./blank2.go:15: cannot use ~b1 (type []int) as type int in assignment

we now have:

./blank2.go:15: cannot use _ (type []int) as type int in assignment

Less confusing for users.

Fixes #9521

Change-Id: Ieab9859040e8e0df95deeaee7eeb408d3be61c0f
Reviewed-on: https://go-review.googlesource.com/9902
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-05-11 17:44:31 +00:00
Michael Hudson-Doyle
3475ec7f36 cmd/internal/ld: change Cpos to not flush the output buffer
DWARF generation appears to assume Cpos is cheap and this makes linking godoc
about 8% faster and linking the standard library into a single shared library
about 22% faster on my machine.

Updates #10571

Change-Id: I3f81efd0174e356716e7971c4f59810b72378177
Reviewed-on: https://go-review.googlesource.com/9913
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-05-11 17:08:36 +00:00
Daniel Morsing
516f0d1c90 net/http: silence race detector on client header timeout test
When running the client header timeout test, there is a race between
us timing out and waiting on the remaining requests to be serviced. If
the client times out before the server blocks on the channel in the
handler, we will be simultaneously adding to a waitgroup with the
value 0 and waiting on it when we call TestServer.Close().

This is largely a theoretical race. We have to time out before we
enter the handler, and the only reason we would time out is if we're
blocked on the channel. Nevertheless, make the race detector happy
by turning the close into a channel send. This turns the defer call
into a synchronization point and we can be sure that we've entered
the handler before we close the server.

Fixes #10780

Change-Id: Id73b017d1eb7503e446aa51538712ef49f2f5c9e
Reviewed-on: https://go-review.googlesource.com/9905
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-05-11 16:41:03 +00:00
Russ Cox
4212a3c3d9 runtime: use heap bitmap for typedmemmove
The current implementation of typedmemmove walks the ptrmask
in the type to find out where pointers are. This led to turning off
GC programs for the Go 1.5 dev cycle, so that there would always
be a ptrmask. Instead of also interpreting the GC programs,
interpret the heap bitmap, which we know must be available and
up to date. (There is no point to write barriers when writing outside
the heap.)

This CL is only about correctness. The next CL will optimize the code.

Change-Id: Id1305c7c071fd2734ab96634b0e1c745b23fa793
Reviewed-on: https://go-review.googlesource.com/9886
Reviewed-by: Austin Clements <austin@google.com>
2015-05-11 16:38:21 +00:00
Russ Cox
266a842f55 runtime: zero entire bitmap for object, even past dead marker
We want typedmemmove to use the heap bitmap to determine
where pointers are, instead of reinterpreting the type information.
The heap bitmap is simpler to access.

In general, typedmemmove will need to be able to look up the bits
for any word and find valid pointer information, so fill even after the
dead marker. Not filling after the dead marker was an optimization
I introduced only a few days ago, when reintroducing the dead marker
code. At the time I said it probably wouldn't last, and it didn't.

Change-Id: I6ba01bff17ddee1ff429f454abe29867ec60606e
Reviewed-on: https://go-review.googlesource.com/9885
Reviewed-by: Austin Clements <austin@google.com>
2015-05-11 16:37:46 +00:00
Russ Cox
e375ca2a25 runtime: reorder bits in heap bitmap bytes
The runtime deals with 1-bit pointer bitmaps and 2-bit heap bitmaps
that have entries for both pointers and mark bits.

Each byte in a 1-bit pointer bitmap looks like pppppppp (all pointer bits).
Each byte in a 2-bit heap bitmap looks like mpmpmpmp (mark, pointer, ...).
This means that when converting from 1-bit to 2-bit, as we do
during malloc, we have to pick up 4 bits in pppp form and use
shifts to create the mpmpmpmp form.

This CL changes the 2-bit heap bitmap form to mmmmpppp,
so that 4 bits picked up in 1-bit form can be used directly in
the low bits of the heap bitmap byte, without expansion.
This simplifies the code, and it also happens to be faster.
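
The conversion cost can be illustrated with a small sketch. It assumes that in the mpmpmpmp form the pointer bits sit at the even bit positions and that mark bits start out clear. The function names are hypothetical and this is not the runtime's code.

```go
package main

import "fmt"

// spreadOld expands four pointer bits (the low nibble of p) into the old
// mpmpmpmp layout, one shift per bit. Assumption: pointer bits occupy the
// even bit positions; mark bits are left clear.
func spreadOld(p uint8) uint8 {
	p &= 0x0f
	return p&1 | (p&2)<<1 | (p&4)<<2 | (p&8)<<3
}

// packNew produces the new mmmmpppp layout: the four pointer bits are
// used directly as the low nibble, with no expansion.
func packNew(p uint8) uint8 {
	return p & 0x0f
}

// pointerBitOld and pointerBitNew read back the pointer bit for word i (0..3).
func pointerBitOld(b uint8, i uint) uint8 { return b >> (2 * i) & 1 }
func pointerBitNew(b uint8, i uint) uint8 { return b >> i & 1 }

func main() {
	const p = 0b1011
	fmt.Printf("%08b %08b\n", spreadOld(p), packNew(p)) // prints 01000101 00001011
}
```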

name                    old mean              new mean              delta
SetTypePtr              14.0ns × (0.98,1.09)  14.0ns × (0.98,1.08)     ~    (p=0.966)
SetTypePtr8             16.5ns × (0.99,1.05)  15.3ns × (0.96,1.16)   -6.86% (p=0.012)
SetTypePtr16            21.3ns × (0.98,1.05)  18.8ns × (0.94,1.14)  -11.49% (p=0.000)
SetTypePtr32            34.6ns × (0.93,1.22)  27.7ns × (0.91,1.26)  -20.08% (p=0.001)
SetTypePtr64            55.7ns × (0.97,1.11)  41.6ns × (0.98,1.04)  -25.30% (p=0.000)
SetTypePtr126           98.0ns × (1.00,1.00)  67.7ns × (0.99,1.05)  -30.88% (p=0.000)
SetTypePtr128           98.6ns × (1.00,1.01)  68.6ns × (0.99,1.03)  -30.44% (p=0.000)
SetTypePtrSlice          781ns × (0.99,1.01)   571ns × (0.99,1.04)  -26.93% (p=0.000)
SetTypeNode1            13.1ns × (0.99,1.01)  12.1ns × (0.99,1.01)   -7.45% (p=0.000)
SetTypeNode1Slice        113ns × (0.99,1.01)    94ns × (1.00,1.00)  -16.35% (p=0.000)
SetTypeNode8            32.7ns × (1.00,1.00)  29.8ns × (0.99,1.01)   -8.97% (p=0.000)
SetTypeNode8Slice        266ns × (1.00,1.00)   204ns × (1.00,1.00)  -23.40% (p=0.000)
SetTypeNode64           58.0ns × (0.98,1.08)  42.8ns × (1.00,1.01)  -26.24% (p=0.000)
SetTypeNode64Slice      1.55µs × (0.99,1.02)  0.96µs × (1.00,1.00)  -37.84% (p=0.000)
SetTypeNode64Dead       13.1ns × (0.99,1.01)  12.1ns × (1.00,1.00)   -7.33% (p=0.000)
SetTypeNode64DeadSlice  1.52µs × (1.00,1.01)  1.08µs × (1.00,1.01)  -28.95% (p=0.000)
SetTypeNode124          97.9ns × (1.00,1.00)  67.1ns × (1.00,1.01)  -31.49% (p=0.000)
SetTypeNode124Slice     2.87µs × (0.99,1.02)  1.75µs × (1.00,1.01)  -39.15% (p=0.000)
SetTypeNode126          98.4ns × (1.00,1.01)  68.1ns × (1.00,1.01)  -30.79% (p=0.000)
SetTypeNode126Slice     2.91µs × (0.99,1.01)  1.77µs × (0.99,1.01)  -39.09% (p=0.000)
SetTypeNode1024          732ns × (1.00,1.00)   511ns × (0.87,1.42)  -30.14% (p=0.000)
SetTypeNode1024Slice    23.1µs × (1.00,1.00)  13.9µs × (0.99,1.02)  -39.83% (p=0.000)

Change-Id: I12e3b850a4e6fa6c8146b8635ff728f3ef658819
Reviewed-on: https://go-review.googlesource.com/9828
Reviewed-by: Austin Clements <austin@google.com>
2015-05-11 16:37:36 +00:00
Russ Cox
363fd1dd1b runtime: move a few atomic fields up
Moving them up makes them properly aligned on 32-bit systems.
There are some odd fields above them right now
(like fixalloc and mutex maybe).

Change-Id: I57851a5bbb2e7cc339712f004f99bb6c0cce0ca5
Reviewed-on: https://go-review.googlesource.com/9889
Reviewed-by: Austin Clements <austin@google.com>
2015-05-11 16:08:57 +00:00
Russ Cox
fc595b78d2 cmd/internal/gc: mark panicindex calls as not returning
Most of the calls to panicindex are already
marked as not returning, but these two were missed
at some point.

Performance changes below.

name                   old mean              new mean              delta
BinaryTree17            5.70s × (0.98,1.04)   5.68s × (0.97,1.04)    ~    (p=0.681)
Fannkuch11              4.32s × (1.00,1.00)   4.41s × (0.98,1.03)  +1.98% (p=0.018)
FmtFprintfEmpty        92.6ns × (0.91,1.11)  92.7ns × (0.91,1.16)    ~    (p=0.969)
FmtFprintfString        280ns × (0.97,1.05)   281ns × (0.96,1.08)    ~    (p=0.860)
FmtFprintfInt           284ns × (0.99,1.02)   288ns × (0.97,1.06)    ~    (p=0.207)
FmtFprintfIntInt        488ns × (0.98,1.01)   493ns × (0.97,1.04)    ~    (p=0.271)
FmtFprintfPrefixedInt   418ns × (0.98,1.04)   423ns × (0.97,1.04)    ~    (p=0.311)
FmtFprintfFloat         597ns × (1.00,1.00)   598ns × (0.99,1.01)    ~    (p=0.789)
FmtManyArgs            1.87µs × (0.99,1.01)  1.89µs × (0.98,1.05)    ~    (p=0.158)
GobDecode              14.6ms × (0.99,1.01)  14.8ms × (0.98,1.03)  +1.51% (p=0.015)
GobEncode              12.3ms × (0.98,1.03)  12.3ms × (0.98,1.01)    ~    (p=0.474)
Gzip                    647ms × (1.00,1.01)   656ms × (0.99,1.05)    ~    (p=0.104)
Gunzip                  142ms × (1.00,1.00)   142ms × (1.00,1.00)    ~    (p=0.110)
HTTPClientServer       89.6µs × (0.99,1.03)  91.2µs × (0.97,1.04)    ~    (p=0.061)
JSONEncode             31.7ms × (0.99,1.01)  32.6ms × (0.97,1.08)  +2.87% (p=0.038)
JSONDecode              111ms × (1.00,1.01)   114ms × (0.97,1.05)  +2.47% (p=0.040)
Mandelbrot200          6.01ms × (1.00,1.00)  6.11ms × (0.98,1.04)    ~    (p=0.073)
GoParse                6.54ms × (0.99,1.02)  6.66ms × (0.97,1.04)    ~    (p=0.064)
RegexpMatchEasy0_32     159ns × (0.99,1.02)   159ns × (0.99,1.00)    ~    (p=0.693)
RegexpMatchEasy0_1K     540ns × (0.99,1.03)   538ns × (1.00,1.01)    ~    (p=0.360)
RegexpMatchEasy1_32     137ns × (0.99,1.01)   138ns × (1.00,1.00)    ~    (p=0.511)
RegexpMatchEasy1_1K     867ns × (1.00,1.01)   869ns × (0.99,1.01)    ~    (p=0.193)
RegexpMatchMedium_32    252ns × (1.00,1.00)   252ns × (0.99,1.01)    ~    (p=0.076)
RegexpMatchMedium_1K   72.7µs × (1.00,1.00)  72.7µs × (1.00,1.00)    ~    (p=0.963)
RegexpMatchHard_32     3.84µs × (1.00,1.00)  3.85µs × (1.00,1.00)    ~    (p=0.371)
RegexpMatchHard_1K      117µs × (1.00,1.01)   118µs × (1.00,1.00)    ~    (p=0.898)
Revcomp                 909ms × (0.98,1.03)   920ms × (0.97,1.07)    ~    (p=0.368)
Template                128ms × (0.99,1.01)   129ms × (0.98,1.03)  +1.41% (p=0.042)
TimeParse               619ns × (0.98,1.01)   619ns × (0.99,1.01)    ~    (p=0.730)
TimeFormat              651ns × (1.00,1.01)   661ns × (0.98,1.04)    ~    (p=0.097)

Change-Id: I0ec5baff41f5d282307137ce0d927e6301e4fa10
Reviewed-on: https://go-review.googlesource.com/9811
Reviewed-by: David Chase <drchase@google.com>
2015-05-11 15:22:56 +00:00
Russ Cox
dcf6e20606 cmd/internal/gc: drop unused Reslice field from Node
Dead code.

This field is left over from Go 1.4, when we elided the fake write
barrier in this case. Today, it's unused (always false).
The upcoming append/slice changes handle this case again,
but without needing this field.

Change-Id: Ic6f160b64efdc1bbed02097ee03050f8cd0ab1b8
Reviewed-on: https://go-review.googlesource.com/9789
Reviewed-by: David Chase <drchase@google.com>
2015-05-11 15:22:26 +00:00
Russ Cox
c70b4b5f7e cmd/internal/gc: show register dump before crashing on register left allocated
If you are using -h to get a stack trace at the site of the failure,
Yyerror will never return. Dump the register allocation sites
before calling Yyerror.

Change-Id: I51266c03e06cb5084c2eaa89b367b9ed85ba286a
Reviewed-on: https://go-review.googlesource.com/9788
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Dave Cheney <dave@cheney.net>
2015-05-11 15:22:11 +00:00
Russ Cox
8f037fa1ab runtime: fix TestLFStack on 386
The new(uint64) was moving to the stack, which may not be aligned.
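
As a rough illustration of why alignment matters here: on 32-bit platforms such as 386, 64-bit atomic operations need an 8-byte-aligned address. The sketch below is not the test's code. The names `counter` and `aligned8` are hypothetical, and it only shows one way to check the alignment of a uint64 before using it atomically.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// counter is a hypothetical package-level variable. The sync/atomic
// documentation lists the first word of a global variable among the
// locations that can be relied upon to be 64-bit aligned.
var counter uint64

// aligned8 reports whether p points to an 8-byte-aligned address.
func aligned8(p *uint64) bool {
	return uintptr(unsafe.Pointer(p))%8 == 0
}

func main() {
	fmt.Println(aligned8(&counter))
	atomic.AddUint64(&counter, 1)
	fmt.Println(atomic.LoadUint64(&counter))
}
```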

Change-Id: Iad070964202001b52029494d43e299fed980f939
Reviewed-on: https://go-review.googlesource.com/9787
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2015-05-11 15:21:54 +00:00
Russ Cox
351897d9d4 cmd/internal/gc: emit branches in -g mode
The -g mode is a debugging mode that prints instructions
as they are constructed. Gbranch was just missing the print.

Change-Id: I3fb45fd9bd3996ed96df5be903b9fd6bd97148b0
Reviewed-on: https://go-review.googlesource.com/9827
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-05-11 14:55:36 +00:00
Russ Cox
1635ab7dfe runtime: remove wbshadow mode
The write barrier shadow heap was very useful for
developing the write barriers initially, but it's no longer used,
clunky, and dragging the rest of the implementation down.

The gccheckmark mode will find bugs due to missed barriers
when they result in missed marks; wbshadow mode found the
missed barriers more aggressively, but it required an entire
separate copy of the heap. The gccheckmark mode requires
no extra memory, making it more useful in practice.

Compared to previous CL:
name                   old mean              new mean              delta
BinaryTree17            5.91s × (0.96,1.06)   5.72s × (0.97,1.03)  -3.12% (p=0.000)
Fannkuch11              4.32s × (1.00,1.00)   4.36s × (1.00,1.00)  +0.91% (p=0.000)
FmtFprintfEmpty        89.0ns × (0.93,1.10)  86.6ns × (0.96,1.11)    ~    (p=0.077)
FmtFprintfString        298ns × (0.98,1.06)   283ns × (0.99,1.04)  -4.90% (p=0.000)
FmtFprintfInt           286ns × (0.98,1.03)   283ns × (0.98,1.04)  -1.09% (p=0.032)
FmtFprintfIntInt        498ns × (0.97,1.06)   480ns × (0.99,1.02)  -3.65% (p=0.000)
FmtFprintfPrefixedInt   408ns × (0.98,1.02)   396ns × (0.99,1.01)  -3.00% (p=0.000)
FmtFprintfFloat         587ns × (0.98,1.01)   562ns × (0.99,1.01)  -4.34% (p=0.000)
FmtManyArgs            1.94µs × (0.99,1.02)  1.89µs × (0.99,1.01)  -2.85% (p=0.000)
GobDecode              15.8ms × (0.98,1.03)  15.7ms × (0.99,1.02)    ~    (p=0.251)
GobEncode              12.0ms × (0.96,1.09)  11.8ms × (0.98,1.03)  -1.87% (p=0.024)
Gzip                    648ms × (0.99,1.01)   647ms × (0.99,1.01)    ~    (p=0.688)
Gunzip                  143ms × (1.00,1.01)   143ms × (1.00,1.01)    ~    (p=0.203)
HTTPClientServer       90.3µs × (0.98,1.01)  89.1µs × (0.99,1.02)  -1.30% (p=0.000)
JSONEncode             31.6ms × (0.99,1.01)  31.7ms × (0.98,1.02)    ~    (p=0.219)
JSONDecode              107ms × (1.00,1.01)   111ms × (0.99,1.01)  +3.58% (p=0.000)
Mandelbrot200          6.03ms × (1.00,1.01)  6.01ms × (1.00,1.00)    ~    (p=0.077)
GoParse                6.53ms × (0.99,1.03)  6.54ms × (0.99,1.02)    ~    (p=0.585)
RegexpMatchEasy0_32     161ns × (1.00,1.01)   161ns × (0.98,1.05)    ~    (p=0.948)
RegexpMatchEasy0_1K     541ns × (0.99,1.01)   559ns × (0.98,1.01)  +3.32% (p=0.000)
RegexpMatchEasy1_32     138ns × (1.00,1.00)   137ns × (0.99,1.01)  -0.55% (p=0.001)
RegexpMatchEasy1_1K     887ns × (0.99,1.01)   878ns × (0.99,1.01)  -0.98% (p=0.000)
RegexpMatchMedium_32    253ns × (0.99,1.01)   252ns × (0.99,1.01)  -0.39% (p=0.001)
RegexpMatchMedium_1K   72.8µs × (1.00,1.00)  72.7µs × (1.00,1.00)    ~    (p=0.485)
RegexpMatchHard_32     3.85µs × (1.00,1.01)  3.85µs × (1.00,1.01)    ~    (p=0.283)
RegexpMatchHard_1K      117µs × (1.00,1.01)   117µs × (1.00,1.00)    ~    (p=0.175)
Revcomp                 922ms × (0.97,1.08)   903ms × (0.98,1.05)  -2.15% (p=0.021)
Template                126ms × (0.99,1.01)   126ms × (0.99,1.01)    ~    (p=0.943)
TimeParse               628ns × (0.99,1.01)   634ns × (0.99,1.01)  +0.92% (p=0.000)
TimeFormat              668ns × (0.99,1.01)   698ns × (0.98,1.03)  +4.53% (p=0.000)

It's nice that the microbenchmarks are the ones helped the most,
because those were the ones hurt the most by the conversion from
4-bit to 2-bit heap bitmaps. This CL brings the overall effect of that
process to (compared to CL 9706 patch set 1):

name                   old mean              new mean              delta
BinaryTree17            5.87s × (0.94,1.09)   5.72s × (0.97,1.03)  -2.57% (p=0.011)
Fannkuch11              4.32s × (1.00,1.00)   4.36s × (1.00,1.00)  +0.87% (p=0.000)
FmtFprintfEmpty        89.1ns × (0.95,1.16)  86.6ns × (0.96,1.11)    ~    (p=0.090)
FmtFprintfString        283ns × (0.98,1.02)   283ns × (0.99,1.04)    ~    (p=0.681)
FmtFprintfInt           284ns × (0.98,1.04)   283ns × (0.98,1.04)    ~    (p=0.620)
FmtFprintfIntInt        486ns × (0.98,1.03)   480ns × (0.99,1.02)  -1.27% (p=0.002)
FmtFprintfPrefixedInt   400ns × (0.99,1.02)   396ns × (0.99,1.01)  -0.84% (p=0.001)
FmtFprintfFloat         566ns × (0.99,1.01)   562ns × (0.99,1.01)  -0.80% (p=0.000)
FmtManyArgs            1.91µs × (0.99,1.02)  1.89µs × (0.99,1.01)  -1.10% (p=0.000)
GobDecode              15.5ms × (0.98,1.05)  15.7ms × (0.99,1.02)  +1.55% (p=0.005)
GobEncode              11.9ms × (0.97,1.03)  11.8ms × (0.98,1.03)  -0.97% (p=0.048)
Gzip                    648ms × (0.99,1.01)   647ms × (0.99,1.01)    ~    (p=0.627)
Gunzip                  143ms × (1.00,1.00)   143ms × (1.00,1.01)    ~    (p=0.482)
HTTPClientServer       89.2µs × (0.99,1.02)  89.1µs × (0.99,1.02)    ~    (p=0.740)
JSONEncode             32.3ms × (0.97,1.06)  31.7ms × (0.98,1.02)  -1.95% (p=0.002)
JSONDecode              106ms × (0.99,1.01)   111ms × (0.99,1.01)  +4.22% (p=0.000)
Mandelbrot200          6.02ms × (1.00,1.00)  6.01ms × (1.00,1.00)    ~    (p=0.417)
GoParse                6.57ms × (0.97,1.06)  6.54ms × (0.99,1.02)    ~    (p=0.404)
RegexpMatchEasy0_32     162ns × (1.00,1.00)   161ns × (0.98,1.05)    ~    (p=0.088)
RegexpMatchEasy0_1K     561ns × (0.99,1.02)   559ns × (0.98,1.01)  -0.47% (p=0.034)
RegexpMatchEasy1_32     145ns × (0.95,1.04)   137ns × (0.99,1.01)  -5.56% (p=0.000)
RegexpMatchEasy1_1K     864ns × (0.99,1.04)   878ns × (0.99,1.01)  +1.57% (p=0.000)
RegexpMatchMedium_32    255ns × (0.99,1.04)   252ns × (0.99,1.01)  -1.43% (p=0.001)
RegexpMatchMedium_1K   73.9µs × (0.98,1.04)  72.7µs × (1.00,1.00)  -1.55% (p=0.004)
RegexpMatchHard_32     3.92µs × (0.98,1.04)  3.85µs × (1.00,1.01)  -1.80% (p=0.003)
RegexpMatchHard_1K      120µs × (0.98,1.04)   117µs × (1.00,1.00)  -2.13% (p=0.001)
Revcomp                 936ms × (0.95,1.08)   903ms × (0.98,1.05)  -3.58% (p=0.002)
Template                130ms × (0.98,1.04)   126ms × (0.99,1.01)  -2.98% (p=0.000)
TimeParse               638ns × (0.98,1.05)   634ns × (0.99,1.01)    ~    (p=0.198)
TimeFormat              674ns × (0.99,1.01)   698ns × (0.98,1.03)  +3.69% (p=0.000)

Change-Id: Ia0e9b50b1d75a3c0c7556184cd966305574fe07c
Reviewed-on: https://go-review.googlesource.com/9706
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-05-11 14:55:11 +00:00
Russ Cox
54af9a3ba5 runtime: reintroduce ``dead'' space during GC scan
Reintroduce an optimization discarded during the initial conversion
from 4-bit heap bitmaps to 2-bit heap bitmaps: when we reach the
place in the bitmap where there are no more pointers, mark that position
for the GC so that it can avoid scanning past that place.
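
The early stop can be pictured with a simplified scan loop. This is a sketch under stated assumptions, not the runtime's scanner: `scanObject`, `deadIndex`, and the `[]bool` bitmap are hypothetical stand-ins for the real heap bitmap encoding.

```go
package main

import "fmt"

// deadIndex returns the index just past the last pointer word,
// i.e. the position from which no more pointers remain.
func deadIndex(ptrBits []bool) int {
	dead := 0
	for i, p := range ptrBits {
		if p {
			dead = i + 1
		}
	}
	return dead
}

// scanObject visits the words of an object and collects the indexes
// that hold pointers, stopping as soon as it reaches the dead position.
func scanObject(ptrBits []bool, dead int) (found []int, visited int) {
	for i := 0; i < len(ptrBits); i++ {
		if i >= dead {
			break // nothing but non-pointer words from here on
		}
		visited++
		if ptrBits[i] {
			found = append(found, i)
		}
	}
	return found, visited
}

func main() {
	bits := make([]bool, 64) // a 64-word object
	bits[0], bits[2] = true, true
	found, visited := scanObject(bits, deadIndex(bits))
	fmt.Println(found, visited) // [0 2] 3
}
```

In this toy model a 64-word object with pointers only in its first three words is scanned in 3 steps instead of 64.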

During heapBitsSetType we can also avoid initializing heap bitmap
beyond that location, which gives a bit of a win compared to Go 1.4.
This particular optimization (not initializing the heap bitmap) may not last:
we might change typedmemmove to use the heap bitmap, in which
case it would all need to be initialized. The early stop in the GC scan
will stay no matter what.

Compared to Go 1.4 (github.com/rsc/go, branch go14bench):
name                    old mean              new mean              delta
SetTypeNode64           80.7ns × (1.00,1.01)  57.4ns × (1.00,1.01)  -28.83% (p=0.000)
SetTypeNode64Dead       80.5ns × (1.00,1.01)  13.1ns × (0.99,1.02)  -83.77% (p=0.000)
SetTypeNode64Slice      2.16µs × (1.00,1.01)  1.54µs × (1.00,1.01)  -28.75% (p=0.000)
SetTypeNode64DeadSlice  2.16µs × (1.00,1.01)  1.52µs × (1.00,1.00)  -29.74% (p=0.000)

Compared to previous CL:
name                    old mean              new mean              delta
SetTypeNode64           56.7ns × (1.00,1.00)  57.4ns × (1.00,1.01)   +1.19% (p=0.000)
SetTypeNode64Dead       57.2ns × (1.00,1.00)  13.1ns × (0.99,1.02)  -77.15% (p=0.000)
SetTypeNode64Slice      1.56µs × (1.00,1.01)  1.54µs × (1.00,1.01)   -0.89% (p=0.000)
SetTypeNode64DeadSlice  1.55µs × (1.00,1.01)  1.52µs × (1.00,1.00)   -2.23% (p=0.000)

This is the last CL in the sequence converting from the 4-bit heap
to the 2-bit heap, with all the same optimizations reenabled.
Compared to before that process began (compared to CL 9701 patch set 1):

name                    old mean              new mean              delta
BinaryTree17             5.87s × (0.94,1.09)   5.91s × (0.96,1.06)    ~    (p=0.578)
Fannkuch11               4.32s × (1.00,1.00)   4.32s × (1.00,1.00)    ~    (p=0.474)
FmtFprintfEmpty         89.1ns × (0.95,1.16)  89.0ns × (0.93,1.10)    ~    (p=0.942)
FmtFprintfString         283ns × (0.98,1.02)   298ns × (0.98,1.06)  +5.33% (p=0.000)
FmtFprintfInt            284ns × (0.98,1.04)   286ns × (0.98,1.03)    ~    (p=0.208)
FmtFprintfIntInt         486ns × (0.98,1.03)   498ns × (0.97,1.06)  +2.48% (p=0.000)
FmtFprintfPrefixedInt    400ns × (0.99,1.02)   408ns × (0.98,1.02)  +2.23% (p=0.000)
FmtFprintfFloat          566ns × (0.99,1.01)   587ns × (0.98,1.01)  +3.69% (p=0.000)
FmtManyArgs             1.91µs × (0.99,1.02)  1.94µs × (0.99,1.02)  +1.81% (p=0.000)
GobDecode               15.5ms × (0.98,1.05)  15.8ms × (0.98,1.03)  +1.94% (p=0.002)
GobEncode               11.9ms × (0.97,1.03)  12.0ms × (0.96,1.09)    ~    (p=0.263)
Gzip                     648ms × (0.99,1.01)   648ms × (0.99,1.01)    ~    (p=0.992)
Gunzip                   143ms × (1.00,1.00)   143ms × (1.00,1.01)    ~    (p=0.585)
HTTPClientServer        89.2µs × (0.99,1.02)  90.3µs × (0.98,1.01)  +1.24% (p=0.000)
JSONEncode              32.3ms × (0.97,1.06)  31.6ms × (0.99,1.01)  -2.29% (p=0.000)
JSONDecode               106ms × (0.99,1.01)   107ms × (1.00,1.01)  +0.62% (p=0.000)
Mandelbrot200           6.02ms × (1.00,1.00)  6.03ms × (1.00,1.01)    ~    (p=0.250)
GoParse                 6.57ms × (0.97,1.06)  6.53ms × (0.99,1.03)    ~    (p=0.243)
RegexpMatchEasy0_32      162ns × (1.00,1.00)   161ns × (1.00,1.01)  -0.80% (p=0.000)
RegexpMatchEasy0_1K      561ns × (0.99,1.02)   541ns × (0.99,1.01)  -3.67% (p=0.000)
RegexpMatchEasy1_32      145ns × (0.95,1.04)   138ns × (1.00,1.00)  -5.04% (p=0.000)
RegexpMatchEasy1_1K      864ns × (0.99,1.04)   887ns × (0.99,1.01)  +2.57% (p=0.000)
RegexpMatchMedium_32     255ns × (0.99,1.04)   253ns × (0.99,1.01)  -1.05% (p=0.012)
RegexpMatchMedium_1K    73.9µs × (0.98,1.04)  72.8µs × (1.00,1.00)  -1.51% (p=0.005)
RegexpMatchHard_32      3.92µs × (0.98,1.04)  3.85µs × (1.00,1.01)  -1.88% (p=0.002)
RegexpMatchHard_1K       120µs × (0.98,1.04)   117µs × (1.00,1.01)  -2.02% (p=0.001)
Revcomp                  936ms × (0.95,1.08)   922ms × (0.97,1.08)    ~    (p=0.234)
Template                 130ms × (0.98,1.04)   126ms × (0.99,1.01)  -2.99% (p=0.000)
TimeParse                638ns × (0.98,1.05)   628ns × (0.99,1.01)  -1.54% (p=0.004)
TimeFormat               674ns × (0.99,1.01)   668ns × (0.99,1.01)  -0.80% (p=0.001)

The slowdown of the first few benchmarks seems to be due to the new
atomic operations for certain small size allocations. But the larger
benchmarks mostly improve, probably due to the decreased memory
pressure from having half as much heap bitmap.

CL 9706, which removes the (never used anymore) wbshadow mode,
gets back what is lost in the early microbenchmarks.

Change-Id: I37423a209e8ec2a2e92538b45cac5422a6acd32d
Reviewed-on: https://go-review.googlesource.com/9705
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-05-11 14:51:40 +00:00
Russ Cox
feb8a3b616 runtime: optimize heapBitsSetType
For the conversion of the heap bitmap from 4-bit to 2-bit fields,
I replaced heapBitsSetType with the dumbest thing that could possibly work:
two atomic operations (atomicand8+atomicor8) per 2-bit field.
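
The and-then-or shape of that simple approach can be sketched as follows. The helpers `and8`, `or8`, and `setField2` are hypothetical, plain (non-atomic) stand-ins for the runtime's atomicand8 and atomicor8, and the field layout within the byte is an assumption made for illustration only.

```go
package main

import "fmt"

// and8 and or8 are non-atomic stand-ins for atomicand8 and atomicor8.
func and8(p *uint8, v uint8) { *p &= v }
func or8(p *uint8, v uint8)  { *p |= v }

// setField2 stores the 2-bit value val into field idx (0..3) of *b:
// one "and" clears the old field, one "or" writes the new bits.
func setField2(b *uint8, idx uint, val uint8) {
	shift := idx * 2
	and8(b, ^(uint8(3) << shift))
	or8(b, (val&3)<<shift)
}

func main() {
	var b uint8 = 0xFF
	setField2(&b, 1, 0b01)
	fmt.Printf("%08b\n", b) // 11110111
}
```

Two read-modify-write operations per field is easy to get right but costly, which is the overhead the optimized version aims to avoid.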

This CL replaces that code with a proper implementation that
avoids the atomics whenever possible. Benchmarks vs base CL
(before the conversion to 2-bit heap bitmap) and vs Go 1.4 below.

Compared to Go 1.4, SetTypePtr (a 1-pointer allocation)
is 10ns slower because a race against the concurrent GC requires the
use of an atomicor8 that used to be an ordinary write. This slowdown
was present even in the base CL.

Compared to both Go 1.4 and base, SetTypeNode8 (a 10-word allocation)
is 10ns slower because it too needs a new atomic, because with the
denser representation, the byte on the end of the allocation is now shared
with the object next to it; this was not true with the 4-bit representation.

Excluding these two (fundamental) slowdowns due to the use of atomics,
the new code is noticeably faster than both Go 1.4 and the base CL.

The next CL will reintroduce the ``typeDead'' optimization.

Stats are from 5 runs on a MacBookPro10,2 (late 2012 Core i5).

Compared to base CL (** = new atomic)
name                  old mean              new mean              delta
SetTypePtr            14.1ns × (0.99,1.02)  14.7ns × (0.93,1.10)     ~    (p=0.175)
SetTypePtr8           18.4ns × (1.00,1.01)  18.6ns × (0.81,1.21)     ~    (p=0.866)
SetTypePtr16          28.7ns × (1.00,1.00)  22.4ns × (0.90,1.27)  -21.88% (p=0.015)
SetTypePtr32          52.3ns × (1.00,1.00)  33.8ns × (0.93,1.24)  -35.37% (p=0.001)
SetTypePtr64          79.2ns × (1.00,1.00)  55.1ns × (1.00,1.01)  -30.43% (p=0.000)
SetTypePtr126          118ns × (1.00,1.00)   100ns × (1.00,1.00)  -15.97% (p=0.000)
SetTypePtr128          130ns × (0.92,1.19)    98ns × (1.00,1.00)  -24.36% (p=0.008)
SetTypePtrSlice        726ns × (0.96,1.08)   760ns × (1.00,1.00)     ~    (p=0.152)
SetTypeNode1          14.1ns × (0.94,1.15)  12.0ns × (1.00,1.01)  -14.60% (p=0.020)
SetTypeNode1Slice      135ns × (0.96,1.07)    88ns × (1.00,1.00)  -34.53% (p=0.000)
SetTypeNode8          20.9ns × (1.00,1.01)  32.6ns × (1.00,1.00)  +55.37% (p=0.000) **
SetTypeNode8Slice      414ns × (0.99,1.02)   244ns × (1.00,1.00)  -41.09% (p=0.000)
SetTypeNode64         80.0ns × (1.00,1.00)  57.4ns × (1.00,1.00)  -28.23% (p=0.000)
SetTypeNode64Slice    2.15µs × (1.00,1.01)  1.56µs × (1.00,1.00)  -27.43% (p=0.000)
SetTypeNode124         119ns × (0.99,1.00)   100ns × (1.00,1.00)  -16.11% (p=0.000)
SetTypeNode124Slice   3.40µs × (1.00,1.00)  2.93µs × (1.00,1.00)  -13.80% (p=0.000)
SetTypeNode126         120ns × (1.00,1.01)    98ns × (1.00,1.00)  -18.19% (p=0.000)
SetTypeNode126Slice   3.53µs × (0.98,1.08)  3.02µs × (1.00,1.00)  -14.49% (p=0.002)
SetTypeNode1024        726ns × (0.97,1.09)   740ns × (1.00,1.00)     ~    (p=0.451)
SetTypeNode1024Slice  24.9µs × (0.89,1.37)  23.1µs × (1.00,1.00)     ~    (p=0.476)

Compared to Go 1.4 (** = new atomic)
name                  old mean               new mean              delta
SetTypePtr            5.71ns × (0.89,1.19)  14.68ns × (0.93,1.10)  +157.24% (p=0.000) **
SetTypePtr8           19.3ns × (0.96,1.10)   18.6ns × (0.81,1.21)      ~    (p=0.638)
SetTypePtr16          30.7ns × (0.99,1.03)   22.4ns × (0.90,1.27)   -26.88% (p=0.005)
SetTypePtr32          51.5ns × (1.00,1.00)   33.8ns × (0.93,1.24)   -34.40% (p=0.001)
SetTypePtr64          83.6ns × (0.94,1.12)   55.1ns × (1.00,1.01)   -34.12% (p=0.001)
SetTypePtr126          137ns × (0.87,1.26)    100ns × (1.00,1.00)   -27.10% (p=0.028)
SetTypePtrSlice        865ns × (0.80,1.23)    760ns × (1.00,1.00)      ~    (p=0.243)
SetTypeNode1          15.2ns × (0.88,1.12)   12.0ns × (1.00,1.01)   -20.89% (p=0.014)
SetTypeNode1Slice      156ns × (0.93,1.16)     88ns × (1.00,1.00)   -43.57% (p=0.001)
SetTypeNode8          23.8ns × (0.90,1.18)   32.6ns × (1.00,1.00)   +36.76% (p=0.003) **
SetTypeNode8Slice      502ns × (0.92,1.10)    244ns × (1.00,1.00)   -51.46% (p=0.000)
SetTypeNode64         85.6ns × (0.94,1.11)   57.4ns × (1.00,1.00)   -32.89% (p=0.001)
SetTypeNode64Slice    2.36µs × (0.91,1.14)   1.56µs × (1.00,1.00)   -33.96% (p=0.002)
SetTypeNode124         130ns × (0.91,1.12)    100ns × (1.00,1.00)   -23.49% (p=0.004)
SetTypeNode124Slice   3.81µs × (0.90,1.22)   2.93µs × (1.00,1.00)   -23.09% (p=0.025)

There are fewer benchmarks vs Go 1.4 because unrolling directly
into the heap bitmap is not yet implemented, so those would not
be meaningful comparisons.

These benchmarks were not present in Go 1.4 as distributed.
The backport to Go 1.4 is in github.com/rsc/go's go14bench branch,
commit 71d5ee5.

Change-Id: I95ed05a22bf484b0fc9efad549279e766c98d2b6
Reviewed-on: https://go-review.googlesource.com/9704
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-05-11 14:51:20 +00:00
Russ Cox
0234dfd493 runtime: use 2-bit heap bitmap (in place of 4-bit)
Previous CLs changed the representation of the non-heap type bitmaps
to be 1-bit bitmaps (pointer or not). Before this CL, the heap bitmap
stored a 2-bit type for each word and a mark bit and checkmark bit
for the first word of the object. (There used to be additional per-word bits.)

Reduce heap bitmap to 2-bit, with 1 dedicated to pointer or not,
and the other used for mark, checkmark, and "keep scanning forward
to find pointers in this object." See comments for details.

This CL replaces heapBitsSetType with very slow but obviously correct code.
A followup CL will optimize it. (Spoiler: the new code is faster than Go 1.4 was.)

Change-Id: I999577a133f3cfecacebdec9cdc3573c235c7fb9
Reviewed-on: https://go-review.googlesource.com/9703
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2015-05-11 14:43:45 +00:00
Russ Cox
6d8a147bef runtime: use 1-bit pointer bitmaps in type representation
The type information in reflect.Type and the GC programs is now
1 bit per word, down from 2 bits.
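
A 1-bit-per-word pointer bitmap might look like the sketch below. `ptrMask` is a hypothetical helper, and the bit order (word i stored in bit i%8 of byte i/8) is an assumption for illustration, not necessarily the encoding the runtime uses.

```go
package main

import "fmt"

// ptrMask packs one bit per word: bit i is set when word i of the
// type holds a pointer.
func ptrMask(isPtr []bool) []byte {
	mask := make([]byte, (len(isPtr)+7)/8)
	for i, p := range isPtr {
		if p {
			mask[i/8] |= 1 << (uint(i) % 8)
		}
	}
	return mask
}

func main() {
	// Words of a hypothetical struct { p *int; n int; q *int }.
	words := []bool{true, false, true}
	fmt.Printf("%08b\n", ptrMask(words)[0]) // 00000101
}
```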

The in-memory unrolled type bitmap representation is now
1 bit per word, down from 4 bits.

The conversion from the unrolled (now 1-bit) bitmap to the
heap bitmap (still 4-bit) is not optimized. A followup CL will
work on that, after the heap bitmap has been converted to 2-bit.

The typeDead optimization, in which a special value denotes
that there are no more pointers anywhere in the object, is lost
in this CL. A followup CL will bring it back in the final form of
heapBitsSetType.

Change-Id: If61e67950c16a293b0b516a6fd9a1c755b6d5549
Reviewed-on: https://go-review.googlesource.com/9702
Reviewed-by: Austin Clements <austin@google.com>
2015-05-11 14:43:33 +00:00
Russ Cox
7d9e16abc6 runtime: add benchmark of heapBitsSetType
There was an old benchmark that measured this indirectly
via allocation, but I don't understand how to factor out the
allocation cost when interpreting the numbers.

Replace it with a benchmark that calls only heapBitsSetType
and does not allocate. This was not possible when the
benchmark was first written, because heapBitsSetType had
not been factored out of mallocgc.
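
The general shape of such a benchmark, one that times only the function of interest with all allocation done up front, can be sketched as below. `setBits` and `benchSetBits` are hypothetical stand-ins: heapBitsSetType is internal to the runtime and is not called here.

```go
package main

import (
	"fmt"
	"testing"
)

// setBits is a hypothetical stand-in for the function under test.
// It sets the first n bits of buf and does not allocate.
func setBits(buf []byte, n int) {
	for i := 0; i < n; i++ {
		buf[i/8] |= 1 << (uint(i) % 8)
	}
}

// benchSetBits allocates its buffer once, outside the timed loop,
// so the measurement covers only the calls to setBits.
func benchSetBits(b *testing.B) {
	buf := make([]byte, 16)
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		setBits(buf, 128)
	}
}

func main() {
	r := testing.Benchmark(benchSetBits)
	fmt.Println(r.String(), r.AllocsPerOp(), "allocs/op")
}
```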

Change-Id: I30f0f02362efab3465a50769398be859832e6640
Reviewed-on: https://go-review.googlesource.com/9701
Reviewed-by: Austin Clements <austin@google.com>
2015-05-11 14:40:27 +00:00
Daniel Morsing
db6f88a84b runtime: enable profiling on g0
Since we now have stack information for code running on the
systemstack, we can traceback over it. To make cpu profiles useful,
add a case in gentraceback to jump over systemstack switches.

Fixes #10609.

Change-Id: I21f47fcc802c07c5d4a1ada56374314e388a6dc7
Reviewed-on: https://go-review.googlesource.com/9506
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2015-05-11 08:44:30 +00:00
Patrick Mezard
19e81a9b3b internal/syscall/windows/registry: handle invalid integer values
I have around twenty such values on a Windows 7 development machine.
regedit displays (translated): "invalid 32-bits DWORD value".
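
One way to handle such values is to validate the data length before decoding. This is a minimal sketch, not the package's code: `decodeDWORD` and `errBadDWORD` are hypothetical names, and it assumes a DWORD is stored as exactly 4 little-endian bytes.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// errBadDWORD is a hypothetical error for malformed DWORD data.
var errBadDWORD = errors.New("DWORD value is not 4 bytes long")

// decodeDWORD returns the integer stored in data, or an error when the
// data does not have the expected 4-byte length.
func decodeDWORD(data []byte) (uint32, error) {
	if len(data) != 4 {
		return 0, errBadDWORD
	}
	return binary.LittleEndian.Uint32(data), nil
}

func main() {
	fmt.Println(decodeDWORD([]byte{0x2A, 0, 0, 0})) // 42 <nil>
	fmt.Println(decodeDWORD([]byte{0x2A}))          // 0 DWORD value is not 4 bytes long
}
```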

Change-Id: Ib37a414ee4c85e891b0a25fed2ddad9e105f5f4e
Reviewed-on: https://go-review.googlesource.com/9901
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
2015-05-11 06:18:59 +00:00
Shenghou Ma
dce432b388 misc/trace: add license for the trace-viewer
The trace-viewer doesn't use the Go license, so it makes sense
to include the license text in the README.md file.

While we're here, reformat the existing text using real Markdown
syntax.

Change-Id: I13e42d3cc6a0ca7e64e3d46ad460dc0460f7ed09
Reviewed-on: https://go-review.googlesource.com/9882
Reviewed-by: Rob Pike <r@golang.org>
2015-05-11 06:09:10 +00:00