qbit/go - go - Tape:neT

qbit/go

mirror of https://github.com/golang/go synced 2024-11-23 20:20:01 -07:00

Author	SHA1	Message	Date
Austin Clements	bb6320535d	runtime: replace STW for enabling write barriers with ragged barrier Currently, we use a full stop-the-world around enabling write barriers. This is to ensure that all Gs have enabled write barriers before any blackening occurs (either in gcBgMarkWorker() or in gcAssistAlloc()). However, there's no need to bring the whole world to a synchronous stop to ensure this. This change replaces the STW with a ragged barrier that ensures each P has individually observed that write barriers should be enabled before GC performs any blackening. Change-Id: If2f129a6a55bd8bdd4308067af2b739f3fb41955 Reviewed-on: https://go-review.googlesource.com/8207 Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2015-04-27 19:26:37 +00:00
Austin Clements	57afa76471	runtime: add ragged global barrier function This adds forEachP, which performs a general-purpose ragged global barrier. forEachP takes a callback and invokes it for every P at a GC safe point. Ps that are idle or in a syscall are considered to be at a continuous safe point. forEachP ensures that these Ps do not change state by forcing all syscall Ps into idle and holding the sched.lock. To ensure that Ps do not enter syscall or idle without running the safe-point function, this adds checks for a pending callback every place there is currently a gcwaiting check. We'll use forEachP to replace the STW around enabling the write barrier and to replace the current asynchronous per-M wbuf cache with a cooperatively managed per-P gcWork cache. Change-Id: Ie944f8ce1fead7c79bf271d2f42fcd61a41bb3cc Reviewed-on: https://go-review.googlesource.com/8206 Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2015-04-27 19:26:33 +00:00
Josh Bleecher Snyder	81c2233b4a	Revert "cmd/dist: consolidate runtime CPU tests" This reverts commit `a9e50a6b35`. Change-Id: I3c5e459f1030e36bc249910facdae12303a44151 Reviewed-on: https://go-review.googlesource.com/9394 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2015-04-27 17:56:04 +00:00
Josh Bleecher Snyder	a9e50a6b35	cmd/dist: consolidate runtime CPU tests Instead of running: go test -short runtime -cpu=1 go test -short runtime -cpu=2 go test -short runtime -cpu=4 Run just: go test -short runtime -cpu=1,2,4 This is a return to the Go 1.4.2 behavior. We lose incremental display of progress and per-cpu timing information, but we don't have to recompile and relink the runtime test, which is slow. This cuts about 10s off all.bash. Updates #10571. Change-Id: I6e8c7149780d47439f8bcfa888e6efc84290c60a Reviewed-on: https://go-review.googlesource.com/9350 Reviewed-by: Dave Cheney <dave@cheney.net> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-04-27 17:36:18 +00:00
Josh Bleecher Snyder	2692f48330	cmd/internal/ld: remove pointless allocs Reduces allocs linking cmd/go and runtime.test by ~13%. No functional changes. The most easily addressed sources of allocations after this are expandpkg, rdstring, and symbuf string conversion. These can be reduced by interning strings, but that increases the overall memory footprint. Change-Id: Ifedefc9f2a0403bcc75460d6b139e8408374e058 Reviewed-on: https://go-review.googlesource.com/9391 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-04-27 17:08:56 +00:00
Roger Peppe	4a3e000a48	encoding/xml: do not escape newlines There is no need to escape newlines in char data - it makes the XML larger and harder to read. Change-Id: I1c1fcee1bdffc705c7428f89ca90af8085d6fb73 Reviewed-on: https://go-review.googlesource.com/9310 Reviewed-by: Nigel Tao <nigeltao@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2015-04-27 15:38:04 +00:00
Austin Clements	b0b1a66052	runtime: reset spinning in mspinning if work was ready()ed This fixes a bug where the runtime ready()s a goroutine while setting up a new M that's initially marked as spinning, causing the scheduler to later panic when it finds work in the run queue of a P associated with a spinning M. Specifically, the sequence of events that can lead to this is: 1) sysmon calls handoffp to hand off a P stolen from a syscall. 2) handoffp sees no pending work on the P, so it calls startm with spinning set. 3) startm calls newm, which in turn calls allocm to allocate a new M. 4) allocm "borrows" the P we're handing off in order to do allocation and performs this allocation. 5) This allocation may assist the garbage collector, and this assist may detect the end of concurrent mark and ready() the main GC goroutine to signal this. 6) This ready()ing puts the GC goroutine on the run queue of the borrowed P. 7) newm starts the OS thread, which runs mstart and subsequently mstart1, which marks the M spinning because startm was called with spinning set. 8) mstart1 enters the scheduler, which panics because there's work on the run queue, but the M is marked spinning. To fix this, before marking the M spinning in step 7, add a check to see if work was been added to the P's run queue. If this is the case, undo the spinning instead. Fixes #10573. Change-Id: I4670495ae00582144a55ce88c45ae71de597cfa5 Reviewed-on: https://go-review.googlesource.com/9332 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2015-04-27 12:49:54 +00:00
Austin Clements	2a46f55b35	runtime: panic when idling a P with runnable Gs This adds a check that we never put a P on the idle list when it has work on its local run queue. Change-Id: Ifcfab750de60c335148a7f513d4eef17be03b6a7 Reviewed-on: https://go-review.googlesource.com/9324 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-04-27 12:49:49 +00:00
Josh Bleecher Snyder	fd5540e7e5	runtime: tighten select permutation generation This is the optimization made to math/rand in CL 21030043. Change-Id: I231b24fa77cac1fe74ba887db76313b5efaab3e8 Reviewed-on: https://go-review.googlesource.com/9269 Reviewed-by: Minux Ma <minux@golang.org>	2015-04-27 02:36:24 +00:00
John Dethridge	3787950a92	debug/dwarf: update class_string.go to add ClassReferenceSig using stringer. Change-Id: I677a5ee273a4d285a8adff71ffcfeac34afc887f Reviewed-on: https://go-review.googlesource.com/9235 Reviewed-by: Austin Clements <austin@google.com>	2015-04-27 02:05:20 +00:00
Adam Langley	cba882ea9b	crypto/tls: call GetCertificate if Certificates is empty. This change causes the GetCertificate callback to be called if Certificates is empty. Previously this configuration would result in an error. This allows people to have servers that depend entirely on dynamic certificate selection, even when the client doesn't send SNI. Fixes #9208. Change-Id: I2f5a5551215958b88b154c64a114590300dfc461 Reviewed-on: https://go-review.googlesource.com/8792 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Adam Langley <agl@golang.org>	2015-04-26 22:00:35 +00:00
Jonathan Rudenberg	ac2bf8ad06	crypto/tls: add OCSP response to ConnectionState The OCSP response is currently only exposed via a method on Conn, which makes it inaccessible when using wrappers like net/http. The ConnectionState structure is typically available even when using wrappers and contains many of the other handshake details, so this change exposes the stapled OCSP response in that structure. Change-Id: If8dab49292566912c615d816321b4353e711f71f Reviewed-on: https://go-review.googlesource.com/9361 Reviewed-by: Adam Langley <agl@golang.org> Run-TryBot: Adam Langley <agl@golang.org>	2015-04-26 22:00:13 +00:00
David Leon Gil	d86b8d34d0	crypto/elliptic: don't unmarshal points that are off the curve At present, Unmarshal does not check that the point it unmarshals is actually on the curve. (It may be on the curve's twist.) This can, as Daniel Bernstein has pointed out at great length, lead to quite devastating attacks. And 3 out of the 4 curves supported by crypto/elliptic have twists with cofactor != 1; P-224, in particular, has a sufficiently large cofactor that it is likely that conventional dlog attacks might be useful. This closes #2445, filed by Watson Ladd. To explain why this was (partially) rejected before being accepted: In the general case, for curves with cofactor != 1, verifying subgroup membership is required. (This is expensive and hard-to-implement.) But, as recent discussion during the CFRG standardization process has brought out, small-subgroup attacks are much less damaging than a twist attack. Change-Id: I284042eb9954ff9b7cde80b8b693b1d468c7e1e8 Reviewed-on: https://go-review.googlesource.com/2421 Reviewed-by: Adam Langley <agl@golang.org>	2015-04-26 21:11:50 +00:00
Paul van Brouwershaven	54bb4b9fd7	crypto/x509: CertificateRequest signature verification This implements a method for x509.CertificateRequest to prevent certain attacks and to allow a CA/RA to properly check the validity of the binding between an end entity and a key pair, to prove that it has possession of (i.e., is able to use) the private key corresponding to the public key for which a certificate is requested. RFC 2986 section 3 states: "A certification authority fulfills the request by authenticating the requesting entity and verifying the entity's signature, and, if the request is valid, constructing an X.509 certificate from the distinguished name and public key, the issuer name, and the certification authority's choice of serial number, validity period, and signature algorithm." Change-Id: I37795c3b1dfdfdd455d870e499b63885eb9bda4f Reviewed-on: https://go-review.googlesource.com/7371 Reviewed-by: Adam Langley <agl@golang.org>	2015-04-26 21:07:10 +00:00
Jonathan Rudenberg	bff1417543	crypto/tls: add support for session ticket key rotation This change adds a new method to tls.Config, SetSessionTicketKeys, that changes the key used to encrypt session tickets while the server is running. Additional keys may be provided that will be used to maintain continuity while rotating keys. If a ticket encrypted with an old key is provided by the client, the server will resume the session and provide the client with a ticket encrypted using the new key. Fixes #9994 Change-Id: Idbc16b10ff39616109a51ed39a6fa208faad5b4e Reviewed-on: https://go-review.googlesource.com/9072 Reviewed-by: Jonathan Rudenberg <jonathan@titanous.com> Reviewed-by: Adam Langley <agl@golang.org>	2015-04-26 20:57:28 +00:00
Håvard Haugen	14a4649fe2	cmd/pprof: handle empty profile gracefully The command "go tool pprof -top $GOROOT/bin/go /dev/null" now logs that profile is empty instead of panicking. Fixes #9207 Change-Id: I3d55c179277cb19ad52c8f24f1aca85db53ee08d Reviewed-on: https://go-review.googlesource.com/2571 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2015-04-26 20:12:17 +00:00
Jonathan Rudenberg	02e69c4b53	crypto/tls: add support for Certificate Transparency This change adds support for serving and receiving Signed Certificate Timestamps as described in RFC 6962. The server is now capable of serving SCTs listed in the Certificate structure. The client now asks for SCTs and, if any are received, they are exposed in the ConnectionState structure. Fixes #10201 Change-Id: Ib3adae98cb4f173bc85cec04d2bdd3aa0fec70bb Reviewed-on: https://go-review.googlesource.com/8988 Reviewed-by: Adam Langley <agl@golang.org> Run-TryBot: Adam Langley <agl@golang.org> Reviewed-by: Jonathan Rudenberg <jonathan@titanous.com>	2015-04-26 16:53:11 +00:00
Justin Nuß	2db58f8f2d	encoding/csv: Preallocate records slice Currently parseRecord will always start with a nil slice and then resize the slice on append. For input with a fixed number of fields per record we can preallocate the slice to avoid having to resize the slice. This change implements this optimization by using FieldsPerRecord as capacity if it's > 0 and also adds a benchmark to better show the differences. benchmark old ns/op new ns/op delta BenchmarkRead 19741 17909 -9.28% benchmark old allocs new allocs delta BenchmarkRead 59 41 -30.51% benchmark old bytes new bytes delta BenchmarkRead 6276 5844 -6.88% Change-Id: I7c2abc9c80a23571369bcfcc99a8ffc474eae7ab Reviewed-on: https://go-review.googlesource.com/8880 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-04-26 16:28:51 +00:00
David Crawshaw	a5b693b431	runtime: signal forwarding for darwin/amd64 Follows the linux signal forwarding semantics from http://golang.org/cl/8712, sharing the implementation of sigfwdgo. Forwarding for 386, arm, and arm64 will follow. Change-Id: I6bf30d563d19da39b6aec6900c7fe12d82ed4f62 Reviewed-on: https://go-review.googlesource.com/9302 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-04-26 13:46:13 +00:00
Michael Hudson-Doyle	c20ff36fe2	cmd/internal/ld: R_TLS_LE is fine on Darwin too Sorry about this. Fixes #10575 Change-Id: I2de23be68e7d822d182e5a0d6a00c607448d861e Reviewed-on: https://go-review.googlesource.com/9341 Reviewed-by: Minux Ma <minux@golang.org>	2015-04-26 04:53:51 +00:00
Matt T. Proud	b6a0450bec	testing/quick: align tests with reflect.Kind. This commit is largely cosmetic in the sense that it is the remnants of a change proposal I had prepared for testing/quick, until I discovered that `3e9ed27` already implemented the feature I was looking for: quick.Value() for reflect.Kind Array. What you see is a merger and manual cleanup; the cosmetic cleanups are as follows: (1.) Keeping the TestCheckEqual and its associated input functions in the same order as type kinds defined in reflect.Kind. Since `3e9ed27` was committed, the test case began to diverge from the constant's ordering. (2.) The `Intptr` derivatives existed to exercise quick.Value with reflect.Kind's `Ptr` constant. All `Intptr` (unrelated to `uintptr`) in the test have been migrated to ensure the parallelism of the listings and to convey that `Intptr` is not special. (3.) Correct a misspelling (transposition) of "alias", whereby it is named as "Alais". Change-Id: I441450db16b8bb1272c52b0abcda3794dcd0599d Reviewed-on: https://go-review.googlesource.com/8804 Reviewed-by: Russ Cox <rsc@golang.org>	2015-04-26 02:40:40 +00:00
Michael Hudson-Doyle	264858c46e	cmd/8l, cmd/internal/ld, cmd/internal/obj/x86: stop incorrectly using the term "inital exec" The long comment block in obj6.go:progedit talked about the two code sequences for accessing g as "local exec" and "initial exec", but really they are both forms of local exec. This stuff is confusing enough without using the wrong words for things, so rewrite it to talk about 2-instruction and 1-instruction sequences. Unfortunately the confusion has made it into code, with the R_TLS_IE relocation now doing double duty as meaning actual initial exec when externally linking and boring old local exec when linking internally (half of this is my fault). So this stops using R_TLS_IE in the local exec case. There is a chance this might break plan9 or windows, but I don't think so. Next step is working out what the heck is going on on ARM... Change-Id: I09da4388210cf49dbc99fd25f5172bbe517cee57 Reviewed-on: https://go-review.googlesource.com/9273 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>	2015-04-25 18:13:15 +00:00
Rick Hudson	ada8cdb9f6	runtime: Fix bug due to elided return. A previous change to mbitmap.go dropped a return on a path the seems not to be excersized. This was a mistake that this CL fixes. Change-Id: I715ee4ef08f5bf8d9f53cee84e8fb31a237e2d43 Reviewed-on: https://go-review.googlesource.com/9295 Reviewed-by: Austin Clements <austin@google.com>	2015-04-24 21:52:30 +00:00
Michael Hudson-Doyle	ccc76dba60	cmd/internal/ld: fix R_TLS handling now Xsym is not read from object file I think this should fix the arm build. A proper fix involves making the handling of tlsg less fragile, I'll try that tomorrow. Update #10557 Change-Id: I9b1b666737fb40aebb6f284748509afa8483cce5 Reviewed-on: https://go-review.googlesource.com/9272 Reviewed-by: Dave Cheney <dave@cheney.net> Run-TryBot: Dave Cheney <dave@cheney.net>	2015-04-24 20:57:49 +00:00
Austin Clements	1b4025f4bd	runtime: replace per-M workbuf cache with per-P gcWork cache Currently, each M has a cache of the most recently used workbuf. This is used primarily by the write barrier so it doesn't have to access the global workbuf lists on every write barrier. It's also used by stack scanning because it's convenient. This cache is important for write barrier performance, but this particular approach has several downsides. It's faster than no cache, but far from optimal (as the benchmarks below show). It's complex: access to the cache is sprinkled through most of the workbuf list operations and it requires special care to transform into and back out of the gcWork cache that's actually used for scanning and marking. It requires atomic exchanges to take ownership of the cached workbuf and to return it to the M's cache even though it's almost always used by only the current M. Since it's per-M, flushing these caches is O(# of Ms), which may be high. And it has some significant subtleties: for example, in general the cache shouldn't be used after the harvestwbufs() in mark termination because it could hide work from mark termination, but stack scanning can happen after this and will* use the cache (but it turns out this is okay because it will always be followed by a getfull(), which drains the cache). This change replaces this cache with a per-P gcWork object. This gcWork cache can be used directly by scanning and marking (as long as preemption is disabled, which is a general requirement of gcWork). Since it's per-P, it doesn't require synchronization, which simplifies things and means the only atomic operations in the write barrier are occasionally fetching new work buffers and setting a mark bit if the object isn't already marked. This cache can be flushed in O(# of Ps), which is generally small. It follows a simple flushing rule: the cache can be used during any phase, but during mark termination it must be flushed before allowing preemption. This also makes the dispose during mutator assist no longer necessary, which eliminates the vast majority of gcWork dispose calls and reduces contention on the global workbuf lists. And it's a lot faster on some benchmarks: benchmark old ns/op new ns/op delta BenchmarkBinaryTree17 11963668673 11206112763 -6.33% BenchmarkFannkuch11 2643217136 2649182499 +0.23% BenchmarkFmtFprintfEmpty 70.4 70.2 -0.28% BenchmarkFmtFprintfString 364 307 -15.66% BenchmarkFmtFprintfInt 317 282 -11.04% BenchmarkFmtFprintfIntInt 512 483 -5.66% BenchmarkFmtFprintfPrefixedInt 404 380 -5.94% BenchmarkFmtFprintfFloat 521 479 -8.06% BenchmarkFmtManyArgs 2164 1894 -12.48% BenchmarkGobDecode 30366146 22429593 -26.14% BenchmarkGobEncode 29867472 26663152 -10.73% BenchmarkGzip 391236616 396779490 +1.42% BenchmarkGunzip 96639491 96297024 -0.35% BenchmarkHTTPClientServer 100110 70763 -29.31% BenchmarkJSONEncode 51866051 52511382 +1.24% BenchmarkJSONDecode 103813138 86094963 -17.07% BenchmarkMandelbrot200 4121834 4120886 -0.02% BenchmarkGoParse 16472789 5879949 -64.31% BenchmarkRegexpMatchEasy0_32 140 140 +0.00% BenchmarkRegexpMatchEasy0_1K 394 394 +0.00% BenchmarkRegexpMatchEasy1_32 120 120 +0.00% BenchmarkRegexpMatchEasy1_1K 621 614 -1.13% BenchmarkRegexpMatchMedium_32 209 202 -3.35% BenchmarkRegexpMatchMedium_1K 54889 55175 +0.52% BenchmarkRegexpMatchHard_32 2682 2675 -0.26% BenchmarkRegexpMatchHard_1K 79383 79524 +0.18% BenchmarkRevcomp 584116718 584595320 +0.08% BenchmarkTemplate 125400565 109620196 -12.58% BenchmarkTimeParse 386 387 +0.26% BenchmarkTimeFormat 580 447 -22.93% (Best out of 10 runs. The delta of averages is similar.) This also puts us in a good position to flush these caches when nearing the end of concurrent marking, which will let us increase the size of the work buffers while still controlling mark termination pause time. Change-Id: I2dd94c8517a19297a98ec280203cccaa58792522 Reviewed-on: https://go-review.googlesource.com/9178 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2015-04-24 20:10:14 +00:00
Austin Clements	d1cae6358c	runtime: fix check for pending GC work When findRunnable considers running a fractional mark worker, it first checks if there's any work to be done; if there isn't there's no point in running the worker because it will just reschedule immediately. However, currently findRunnable just checks work.full and work.partial, whereas getfull can also draw work from m.currentwbuf. As a result, findRunnable may not start a worker even though there actually is work. This problem manifests itself in occasional failures of the test/init1.go test. This test is unusual because it performs a large amount of allocation without executing any write barriers, which means there's nothing to force the pointers in currentwbuf out to the work.partial/full lists where findRunnable can see them. This change fixes this problem by making findRunnable also check for a currentwbuf. This aligns findRunnable with trygetfull's notion of whether or not there's work. Change-Id: Ic76d22b7b5d040bc4f58a6b5975e9217650e66c4 Reviewed-on: https://go-review.googlesource.com/9299 Reviewed-by: Russ Cox <rsc@golang.org>	2015-04-24 20:10:10 +00:00
Austin Clements	26eac917dc	runtime: start dedicated mark workers even if there's no work Currently, findRunnable only considers running a mark worker if there's work in the work queue. In principle, this can delay the start of the desired number of dedicated mark workers if there's no work pending. This is unlikely to occur in practice, since there should be work queued from the scan phase, but if it were to come up, a CPU hog mutator could slow down or delay garbage collection. This check makes sense for fractional mark workers, since they'll just return to the scheduler immediately if there's no work, but we want the scheduler to start all of the dedicated mark workers promptly, even if there's currently no queued work. Hence, this change moves the pending work check after the check for starting a dedicated worker. Change-Id: I52b851cc9e41f508a0955b3f905ca80f109ea101 Reviewed-on: https://go-review.googlesource.com/9298 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-04-24 20:10:05 +00:00
Austin Clements	711a164267	runtime: fix some out-of-date comments bgMarkCount no longer exists. Change-Id: I3aa406fdccfca659814da311229afbae55af8304 Reviewed-on: https://go-review.googlesource.com/9297 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-04-24 20:10:01 +00:00
Hyang-Ah Hana Kim	e9a89b80b6	misc/cgo/testcshared: make test.bash resilient against noise. Instead of comparing against the entire output that may include verbose warning messages, use the last line of the output and check it includes the expected success message (PASS). Change-Id: Iafd583ee5529a8aef5439b9f1f6ce0185e4b1331 Reviewed-on: https://go-review.googlesource.com/9304 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-04-24 18:32:24 +00:00
Rob Pike	b3000b6f6a	cmd/go: rename doc.go to alldocs.go in preparation for "go doc" Also rename and update mkdoc.sh to mkalldocs.sh Change-Id: Ief3673c22d45624e173fc65ee279cea324da03b5 Reviewed-on: https://go-review.googlesource.com/9226 Reviewed-by: Russ Cox <rsc@golang.org>	2015-04-24 18:29:59 +00:00
Srdjan Petrovic	6ad33be2d9	runtime: implement xadduintptr and update system mstats using it The motivation is that sysAlloc/Free() currently aren't safe to be called without a valid G, because arm's xadd64() uses locks that require a valid G. The solution here was proposed by Dmitry Vyukov: use xadduintptr() instead of xadd64(), until arm can support xadd64 on all of its architectures (not a trivial task for arm). Change-Id: I250252079357ea2e4360e1235958b1c22051498f Reviewed-on: https://go-review.googlesource.com/9002 Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-04-24 16:53:26 +00:00
Hyang-Ah Hana Kim	8566979972	misc/cgo/testcshared: add a c-shared test for android/arm. - main3.c tests main.main is exported when compiled for GOOS=android. - wait longer for main2.c (it's slow on android/arm) - rearranged test.bash Fixes #10070. Change-Id: I6e5a98d1c5fae776afa54ecb5da633b59b269316 Reviewed-on: https://go-review.googlesource.com/9296 Reviewed-by: David Crawshaw <crawshaw@golang.org> Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>	2015-04-24 16:32:31 +00:00
Michael Hudson-Doyle	029c7bbdfe	cmd/internal/gc, cmd/internal/ld, cmd/internal/obj: teach compiler about local symbols This lets us avoid loading string constants via the GOT and (together with http://golang.org/cl/9102) results in the fannkuch benchmark having very similar register usage with -dynlink as without. Change-Id: Ic3892b399074982b76773c3e547cfbba5dabb6f9 Reviewed-on: https://go-review.googlesource.com/9103 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>	2015-04-24 16:19:41 +00:00
Austin Clements	0e6a6c510f	runtime: simplify process for starting GC goroutine Currently, when allocation reaches the GC trigger, the runtime uses readyExecute to start the GC goroutine immediately rather than wait for the scheduler to get around to the GC goroutine while the mutator continues to grow the heap. Now that the scheduler runs the most recently readied goroutine when a goroutine yields its time slice, this rigmarole is no longer necessary. The runtime can simply ready the GC goroutine and yield from the readying goroutine. Change-Id: I3b4ebadd2a72a923b1389f7598f82973dd5c8710 Reviewed-on: https://go-review.googlesource.com/9292 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2015-04-24 15:13:05 +00:00
Austin Clements	ce502b063c	runtime: use park/ready to wake up GC at end of concurrent mark Currently, the main GC goroutine sleeps on a note during concurrent mark and the first background mark worker or assist to finish marking use wakes up that note to let the main goroutine proceed into mark termination. Unfortunately, the latency of this wakeup can be quite high, since the GC goroutine will typically have lost its P while in the futex sleep, meaning it will be placed on the global run queue and will wait there until some P is kind enough to pick it up. This delay gives the mutator more time to allocate and create floating garbage, growing the heap unnecessarily. Worse, it's likely that background marking has stopped at this point (unless GOMAXPROCS>4), so anything that's allocated and published to the heap during this window will have to be scanned during mark termination while the world is stopped. This change replaces the note sleep/wakeup with a gopark/ready scheme. This keeps the wakeup inside the Go scheduler and lets the garbage collector take advantage of the new scheduler semantics that run the ready()d goroutine immediately when the ready()ing goroutine sleeps. For the json benchmark from x/benchmarks with GOMAXPROCS=4, this reduces the delay in waking up the GC goroutine and entering mark termination once concurrent marking is done from ~100ms to typically <100µs. Change-Id: Ib11f8b581b8914f2d68e0094f121e49bac3bb384 Reviewed-on: https://go-review.googlesource.com/9291 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2015-04-24 15:13:01 +00:00
Austin Clements	4e32718d3e	runtime: use timer for GC control revise rather than timeout Currently, we use a note sleep with a timeout in a loop in func gc to periodically revise the GC control variables. Replace this with a fully blocking note sleep and use a periodic timer to trigger the revise instead. This is a step toward replacing the note sleep in func gc. Change-Id: I2d562f6b9b2e5f0c28e9a54227e2c0f8a2603f63 Reviewed-on: https://go-review.googlesource.com/9290 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2015-04-24 15:12:56 +00:00
Austin Clements	e870f06c3f	runtime: yield time slice to most recently readied G Currently, when the runtime ready()s a G, it adds it to the end of the current P's run queue and continues running. If there are many other things in the run queue, this can result in a significant delay before the ready()d G actually runs and can hurt fairness when other Gs in the run queue are CPU hogs. For example, if there are three Gs sharing a P, one of which is a CPU hog that never voluntarily gives up the P and the other two of which are doing small amounts of work and communicating back and forth on an unbuffered channel, the two communicating Gs will get very little CPU time. Change this so that when G1 ready()s G2 and then blocks, the scheduler immediately hands off the remainder of G1's time slice to G2. In the above example, the two communicating Gs will now act as a unit and together get half of the CPU time, while the CPU hog gets the other half of the CPU time. This fixes the problem demonstrated by the ping-pong benchmark added in the previous commit: benchmark old ns/op new ns/op delta BenchmarkPingPongHog 684287 825 -99.88% On the x/benchmarks suite, this change improves the performance of garbage by ~6% (for GOMAXPROCS=1 and 4), and json by 28% and 36% for GOMAXPROCS=1 and 4. It has negligible effect on heap size. This has no effect on the go1 benchmark suite since those benchmarks are mostly single-threaded. Change-Id: I858a08eaa78f702ea98a5fac99d28a4ac91d339f Reviewed-on: https://go-review.googlesource.com/9289 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2015-04-24 15:12:52 +00:00
Austin Clements	da0e37fa8d	runtime: benchmark for ping-pong in the presence of a CPU hog This benchmark demonstrates a current problem with the scheduler where a set of frequently communicating goroutines get very little CPU time in the presence of another goroutine that hogs that CPU, even if one of those communicating goroutines is always runnable. Currently it takes about 0.5 milliseconds to switch between ping-ponging goroutines in the presence of a CPU hog: BenchmarkPingPongHog 2000 684287 ns/op Change-Id: I278848c84f778de32344921ae8a4a8056e4898b0 Reviewed-on: https://go-review.googlesource.com/9288 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2015-04-24 15:12:47 +00:00
Austin Clements	e5e52f4f2c	runtime: factor checking if P run queue is empty There are a variety of places where we check if a P's run queue is empty. This test is about to get slightly more complicated, so factor it out into a new function, runqempty. This function is inlinable, so this has no effect on performance. Change-Id: If4a0b01ffbd004937de90d8d686f6ded4aad2c6b Reviewed-on: https://go-review.googlesource.com/9287 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2015-04-24 15:12:42 +00:00
Russ Cox	9406f68e6a	cmd/internal/gc: add and test write barrier debug output We can expand the test cases as we discover problems. This is some basic tests plus all the things I got wrong in some recent work. Change-Id: Id875fcfaf74eb087ae42b441fe47a34c5b8ccb39 Reviewed-on: https://go-review.googlesource.com/9158 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2015-04-24 14:39:49 +00:00
Aamir Khan	80f575b78f	hash/crc32: clarify documentation Explicitly specify that we represent polynomial in reversed notation Fixes #8229 Change-Id: Idf094c01fd82f133cd0c1b50fa967d12c577bdb5 Reviewed-on: https://go-review.googlesource.com/9237 Reviewed-by: David Chase <drchase@google.com>	2015-04-24 13:44:25 +00:00
Shenghou Ma	7579867fec	cmd/dist: allow $GO_TEST_TIMEOUT_SCALE to override timeoutScale Some machines are so slow that even with the default timeoutScale, they still timeout some tests. For example, currently some linux/arm builders and the openbsd/arm builder are timing out the runtime test and CL 8397 was proposed to skip some tests on openbsd/arm to fix the build. Instead of increasing timeoutScale or skipping tests, this CL introduces an environment variable $GO_TEST_TIMEOUT_SCALE that could be set to manually set a larger timeoutScale for those machines/builders. Fixes #10314. Change-Id: I16c9a9eb980d6a63309e4cacd79eee2fe05769ee Reviewed-on: https://go-review.googlesource.com/9223 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2015-04-24 05:45:36 +00:00
Srdjan Petrovic	5c8fbc6f1e	runtime: signal forwarding Forward signals to signal handlers installed before Go installs its own, under certain circumstances. In particular, as iant@ suggests, signals are forwarded iff: (1) a non-SIG_DFL signal handler existed before Go, and (2) signal is synchronous (i.e., one of SIGSEGV, SIGBUS, SIGFPE), and (3a) signal occured on a non-Go thread, or (3b) signal occurred on a Go thread but in CGo code. Supported only on Linux, for now. Change-Id: I403219ee47b26cf65da819fb86cf1ec04d3e25f5 Reviewed-on: https://go-review.googlesource.com/8712 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-04-24 05:19:39 +00:00
Egon Elbre	b075d1fc2e	encoding/base64: Optimize EncodeToString and DecodeString. benchmark old ns/op new ns/op delta BenchmarkEncodeToString 31281 23821 -23.85% BenchmarkDecodeString 156508 82254 -47.44% benchmark old MB/s new MB/s speedup BenchmarkEncodeToString 261.88 343.89 1.31x BenchmarkDecodeString 69.80 132.81 1.90x Change-Id: I115e0b18c3a6d5ef6bfdcb3f637644f02f290907 Reviewed-on: https://go-review.googlesource.com/8808 Reviewed-by: Nigel Tao <nigeltao@golang.org>	2015-04-24 01:45:43 +00:00
Josh Bleecher Snyder	04829a4138	cmd/9g, etc: remove // fallthrough comments They are vestiges of the c2go transition. Change-Id: I22672e40373ef77d7a0bf69cfff8017e46353055 Reviewed-on: https://go-review.googlesource.com/9265 Reviewed-by: Minux Ma <minux@golang.org>	2015-04-23 23:46:19 +00:00
Josh Bleecher Snyder	56a7c5b95c	math/big: add partial arm64 assembly support benchmark old ns/op new ns/op delta BenchmarkAddVV_1 18.7 14.8 -20.86% BenchmarkAddVV_2 21.8 16.6 -23.85% BenchmarkAddVV_3 26.1 17.1 -34.48% BenchmarkAddVV_4 30.4 21.9 -27.96% BenchmarkAddVV_5 35.5 19.8 -44.23% BenchmarkAddVV_1e1 63.0 28.3 -55.08% BenchmarkAddVV_1e2 593 178 -69.98% BenchmarkAddVV_1e3 5691 1490 -73.82% BenchmarkAddVV_1e4 56868 20761 -63.49% BenchmarkAddVV_1e5 569062 207679 -63.51% BenchmarkAddVW_1 15.8 12.6 -20.25% BenchmarkAddVW_2 17.8 13.1 -26.40% BenchmarkAddVW_3 21.2 13.9 -34.43% BenchmarkAddVW_4 23.6 14.7 -37.71% BenchmarkAddVW_5 26.0 15.8 -39.23% BenchmarkAddVW_1e1 41.3 21.6 -47.70% BenchmarkAddVW_1e2 383 145 -62.14% BenchmarkAddVW_1e3 3703 1264 -65.87% BenchmarkAddVW_1e4 36920 14359 -61.11% BenchmarkAddVW_1e5 370345 143046 -61.37% BenchmarkAddMulVVW_1 33.2 32.5 -2.11% BenchmarkAddMulVVW_2 58.0 57.2 -1.38% BenchmarkAddMulVVW_3 95.2 93.9 -1.37% BenchmarkAddMulVVW_4 108 106 -1.85% BenchmarkAddMulVVW_5 159 156 -1.89% BenchmarkAddMulVVW_1e1 344 340 -1.16% BenchmarkAddMulVVW_1e2 3644 3624 -0.55% BenchmarkAddMulVVW_1e3 37344 37208 -0.36% BenchmarkAddMulVVW_1e4 373295 372170 -0.30% BenchmarkAddMulVVW_1e5 3438116 3425606 -0.36% BenchmarkBitLen0 7.21 4.32 -40.08% BenchmarkBitLen1 6.49 4.32 -33.44% BenchmarkBitLen2 7.23 4.32 -40.25% BenchmarkBitLen3 6.49 4.32 -33.44% BenchmarkBitLen4 7.22 4.32 -40.17% BenchmarkBitLen5 6.52 4.33 -33.59% BenchmarkBitLen8 7.22 4.32 -40.17% BenchmarkBitLen9 6.49 4.32 -33.44% BenchmarkBitLen16 8.66 4.32 -50.12% BenchmarkBitLen17 7.95 4.32 -45.66% BenchmarkBitLen31 8.69 4.32 -50.29% BenchmarkGCD10x10 5021 5033 +0.24% BenchmarkGCD10x100 5571 5572 +0.02% BenchmarkGCD10x1000 6707 6729 +0.33% BenchmarkGCD10x10000 13526 13419 -0.79% BenchmarkGCD10x100000 85668 83242 -2.83% BenchmarkGCD100x100 24196 23936 -1.07% BenchmarkGCD100x1000 28802 27309 -5.18% BenchmarkGCD100x10000 64111 51704 -19.35% BenchmarkGCD100x100000 385840 274385 -28.89% BenchmarkGCD1000x1000 262892 236269 -10.13% BenchmarkGCD1000x10000 371393 277883 -25.18% BenchmarkGCD1000x100000 1311795 589055 -55.10% BenchmarkGCD10000x10000 9596740 6123930 -36.19% BenchmarkGCD10000x100000 16404000 7269610 -55.68% BenchmarkGCD100000x100000 776660000 419270000 -46.02% BenchmarkHilbert 13478980 13402270 -0.57% BenchmarkBinomial 9802 9440 -3.69% BenchmarkBitset 142 142 +0.00% BenchmarkBitsetNeg 328 279 -14.94% BenchmarkBitsetOrig 853 861 +0.94% BenchmarkBitsetNegOrig 1489 1444 -3.02% BenchmarkMul 420949000 410481000 -2.49% BenchmarkExp3Power0x10 1148 1229 +7.06% BenchmarkExp3Power0x40 1322 1376 +4.08% BenchmarkExp3Power0x100 2437 2486 +2.01% BenchmarkExp3Power0x400 9456 9346 -1.16% BenchmarkExp3Power0x1000 113623 108701 -4.33% BenchmarkExp3Power0x4000 1134933 1101481 -2.95% BenchmarkExp3Power0x10000 10773570 10396160 -3.50% BenchmarkExp3Power0x40000 101362100 97788300 -3.53% BenchmarkExp3Power0x100000 921114000 885249000 -3.89% BenchmarkExp3Power0x400000 8323094000 7969020000 -4.25% BenchmarkFibo 322021600 92554450 -71.26% BenchmarkScanPi 1264583 321065 -74.61% BenchmarkStringPiParallel 1644661 554216 -66.30% BenchmarkScan10Base2 1111 1080 -2.79% BenchmarkScan100Base2 6645 6345 -4.51% BenchmarkScan1000Base2 84084 62405 -25.78% BenchmarkScan10000Base2 3105998 932551 -69.98% BenchmarkScan100000Base2 257234800 40113333 -84.41% BenchmarkScan10Base8 571 573 +0.35% BenchmarkScan100Base8 2810 2543 -9.50% BenchmarkScan1000Base8 47383 25834 -45.48% BenchmarkScan10000Base8 2739518 567203 -79.30% BenchmarkScan100000Base8 253952400 36495680 -85.63% BenchmarkScan10Base10 553 556 +0.54% BenchmarkScan100Base10 2640 2385 -9.66% BenchmarkScan1000Base10 50865 24049 -52.72% BenchmarkScan10000Base10 3279916 549313 -83.25% BenchmarkScan100000Base10 309121000 36213140 -88.29% BenchmarkScan10Base16 478 483 +1.05% BenchmarkScan100Base16 2353 2144 -8.88% BenchmarkScan1000Base16 48091 24246 -49.58% BenchmarkScan10000Base16 2858886 586475 -79.49% BenchmarkScan100000Base16 266320000 38190500 -85.66% BenchmarkString10Base2 736 730 -0.82% BenchmarkString100Base2 2695 2707 +0.45% BenchmarkString1000Base2 20549 20388 -0.78% BenchmarkString10000Base2 212638 210782 -0.87% BenchmarkString100000Base2 1944963 1938033 -0.36% BenchmarkString10Base8 524 517 -1.34% BenchmarkString100Base8 1326 1320 -0.45% BenchmarkString1000Base8 8213 8249 +0.44% BenchmarkString10000Base8 72204 72092 -0.16% BenchmarkString100000Base8 769068 765993 -0.40% BenchmarkString10Base10 1018 982 -3.54% BenchmarkString100Base10 3485 3206 -8.01% BenchmarkString1000Base10 37102 18935 -48.97% BenchmarkString10000Base10 188633 88637 -53.01% BenchmarkString100000Base10 124490300 19700940 -84.17% BenchmarkString10Base16 509 502 -1.38% BenchmarkString100Base16 1084 1098 +1.29% BenchmarkString1000Base16 5641 5650 +0.16% BenchmarkString10000Base16 46900 46745 -0.33% BenchmarkString100000Base16 508957 505840 -0.61% BenchmarkLeafSize0 8934320 8149465 -8.78% BenchmarkLeafSize1 237666 118381 -50.19% BenchmarkLeafSize2 237807 117854 -50.44% BenchmarkLeafSize3 1688640 353494 -79.07% BenchmarkLeafSize4 235676 116196 -50.70% BenchmarkLeafSize5 2121896 430325 -79.72% BenchmarkLeafSize6 1682306 351775 -79.09% BenchmarkLeafSize7 1051847 251436 -76.10% BenchmarkLeafSize8 232697 115674 -50.29% BenchmarkLeafSize9 2403616 488443 -79.68% BenchmarkLeafSize10 2120975 429545 -79.75% BenchmarkLeafSize11 2023789 426525 -78.92% BenchmarkLeafSize12 1684830 351985 -79.11% BenchmarkLeafSize13 1465529 337906 -76.94% BenchmarkLeafSize14 1050498 253872 -75.83% BenchmarkLeafSize15 683228 197384 -71.11% BenchmarkLeafSize16 232496 116026 -50.10% BenchmarkLeafSize32 245841 126671 -48.47% BenchmarkLeafSize64 301728 190285 -36.93% Change-Id: I63e63297896d96b89c9a275b893c2b405a7e105d Reviewed-on: https://go-review.googlesource.com/9260 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-04-23 23:29:15 +00:00
Srdjan Petrovic	1f65c9c141	runtime: deflake TestNewOSProc0, fix _rt0_amd64_linux_lib stack alignment This addresses iant's comments from CL 9164. Change-Id: I7b5b282f61b11aab587402c2d302697e76666376 Reviewed-on: https://go-review.googlesource.com/9222 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-04-23 23:09:03 +00:00
Austin Clements	ed09e0e2bf	runtime: fix underflow in next_gc calculation Currently, it's possible for the next_gc calculation to underflow. Since next_gc is unsigned, this wraps around and effectively disables GC for the rest of the program's execution. Besides being obviously wrong, this is causing test failures on 32-bit because some tests are running out of heap. This underflow happens for two reasons, both having to do with how we estimate the reachable heap size at the end of the GC cycle. One reason is that this calculation depends on the value of heap_live at the beginning of the GC cycle, but we currently only record that value during a concurrent GC and not during a forced STW GC. Fix this by moving the recorded value from gcController to work and recording it on a common code path. The other reason is that we use the amount of allocation during the GC cycle as an approximation of the amount of floating garbage and subtract it from the marked heap to estimate the reachable heap. However, since this is only an approximation, it's possible for the amount of allocation during the cycle to be larger than the marked heap size (since the runtime allocates white and it's possible for these allocations to never be made reachable from the heap). Currently this causes wrap-around in our estimate of the reachable heap size, which in turn causes wrap-around in next_gc. Fix this by bottoming out the reachable heap estimate at 0, in which case we just fall back to triggering GC at heapminimum (which is okay since this only happens on small heaps). Fixes #10555, fixes #10556, and fixes #10559. Change-Id: Iad07b529c03772356fede2ae557732f13ebfdb63 Reviewed-on: https://go-review.googlesource.com/9286 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2015-04-23 20:52:54 +00:00
Rick Hudson	77f56af0bc	runtime: Improve scanning performance To achieve a 2% improvement in the garbage benchmark this CL removes an unneeded assert and avoids one hbits.next() call per object being scanned. Change-Id: Ibd542d01e9c23eace42228886f9edc488354df0d Reviewed-on: https://go-review.googlesource.com/9244 Reviewed-by: Austin Clements <austin@google.com>	2015-04-23 20:27:46 +00:00
Hyang-Ah Hana Kim	aef54d40ac	runtime: disable TestNewOSProc0 on android/arm. newosproc0 does not work on android/arm. See issue #10548. Change-Id: Ieaf6f5d0b77cddf5bf0b6c89fd12b1c1b8723f9b Reviewed-on: https://go-review.googlesource.com/9293 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-04-23 19:08:33 +00:00

... 6 7 8 9 10 ...

23472 Commits