
runtime: avoid race on allp in findrunnable

findrunnable loops over allp to check run queues *after* it has
dropped its own P. This is unsafe because allp can change when nothing
is blocking safe-points. Hence, procresize could change allp
concurrently with findrunnable's loop. Beyond generally violating Go's
memory model, in the best case this could cause findrunnable to observe a
nil P pointer if allp has been grown but the new slots not yet
initialized. In the worst case, the reads of allp could tear, causing
findrunnable to read a word that isn't even a valid *P pointer.
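
To illustrate the hazard, here is a simplified, hypothetical analogue of the race outside the runtime (none of the names below come from proc.go): one goroutine grows a shared slice of *p the way procresize grows allp, while another ranges over it with no synchronization.

// Hypothetical analogue of the race, not the runtime sources. The
// writer grows a shared slice under a lock; the reader iterates it
// without the lock, so it can observe nil entries in freshly grown
// slots, and its load of the slice header races with the writer's
// store.
package main

import "sync"

type p struct{ id int }

var (
	mu   sync.Mutex
	allp []*p // grown by grow(), read without the lock by scan()
)

// grow plays the role of procresize: it extends allp under the lock,
// leaving the new slots nil until they are filled in.
func grow(n int) {
	mu.Lock()
	for len(allp) < n {
		allp = append(allp, nil)
	}
	for i := range allp {
		if allp[i] == nil {
			allp[i] = &p{id: i}
		}
	}
	mu.Unlock()
}

// scan plays the role of findrunnable after it has dropped its P: it
// iterates allp without holding the lock, an access pattern the race
// detector (go run -race) can report as a data race.
func scan() (busy int) {
	for _, pp := range allp {
		if pp != nil {
			busy++
		}
	}
	return busy
}

func main() {
	allp = []*p{{id: 0}}
	go grow(1024)
	_ = scan()
}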

Fix this by taking a snapshot of the allp slice header (but not the
backing store) before findrunnable drops its P and iterating over this
snapshot. The actual contents of allp are immutable up to len(allp),
so this fixes the race.
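
The pattern the fix relies on can be sketched outside the runtime as follows (hypothetical names, not the actual proc.go code): copy the slice header while reading the global is still safe, then iterate only the private copy, whose elements up to its length are never overwritten afterwards.

// Minimal sketch of the snapshot pattern, assuming the global slice is
// only ever appended to and its existing entries are never mutated.
package main

import "fmt"

type p struct{ id int }

var allp []*p // grown elsewhere; existing entries are never overwritten

func findWork() {
	// Take the snapshot *before* giving up whatever makes reading the
	// global safe (in the runtime: before dropping the P).
	allpSnapshot := allp

	// ... from here on, allp itself may change underfoot ...

	for _, pp := range allpSnapshot {
		fmt.Println("checking P", pp.id)
	}
}

func main() {
	allp = []*p{{id: 0}, {id: 1}}
	findWork()
}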

Updates #23098 (may fix).

Change-Id: I556ae2dbfffe9fe4a1bf43126e930b9e5c240ea8
Reviewed-on: https://go-review.googlesource.com/86215
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Austin Clements 2018-01-04 10:43:29 -05:00
parent 1f84cd9771
commit 7c2cf4e779


@@ -2299,6 +2299,12 @@ stop:
 		return gp, false
 	}
 
+	// Before we drop our P, make a snapshot of the allp slice,
+	// which can change underfoot once we no longer block
+	// safe-points. We don't need to snapshot the contents because
+	// everything up to cap(allp) is immutable.
+	allpSnapshot := allp
+
 	// return P and block
 	lock(&sched.lock)
 	if sched.gcwaiting != 0 || _p_.runSafePointFn != 0 {
@@ -2338,7 +2344,7 @@ stop:
 	}
 
 	// check all runqueues once again
-	for _, _p_ := range allp {
+	for _, _p_ := range allpSnapshot {
 		if !runqempty(_p_) {
 			lock(&sched.lock)
 			_p_ = pidleget()