It's not a group: must handle the inside as a sequence of literal chars,
not a single literal string.
That is, \Qab\E+ is the same as ab+, not (ab)+.
Fixes#11187.
Change-Id: I5406d05ccf7efff3a7f15395bdb0cfb2bd23a8ed
Reviewed-on: https://go-review.googlesource.com/17233
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The one-pass transformation is structured as a search over the input
machine for conditions that violate the one-pass requisites. At each
iteration, we should fully explore all non-input paths that proceed from
the current instruction; this is implemented via recursive check calls.
But when we reach instructions that demand input (InstRune*), these
should be put onto the search queue.
Instead of searching this way, the routine previously (effectively)
proceeded through the machine one instruction at a time until finding an
Inst{Match,Fail,Rune*}, calling check on each instruction. This caused
bug #11905, where the transformation stopped before rewriting all
InstAlts as InstAltMatches.
Further, the check function unnecessarily recurred on InstRune*
instructions. (I believe this helps to mask the above bug.)
This change also deletes some unused functions and duplicate test cases.
Fixes#11905.
Change-Id: I5b0b26efea3d3bd01c7479a518b5ed1b886701cd
Reviewed-on: https://go-review.googlesource.com/17195
Reviewed-by: Russ Cox <rsc@golang.org>
The prefix computation for onepass was incorrectly returning
complete=true when it encountered a beginning-of-text empty width match
(^) in the middle of an expression.
Fix by returning complete only when the prefix is followed by $ and then
an accepting state.
Fixes#11175.
Change-Id: Ie9c4cf5f76c1d2c904a6fb2f016cedb265c19fde
Reviewed-on: https://go-review.googlesource.com/16200
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
This helps users who wish to use separate Regexps in each goroutine to
avoid lock contention. Previously they had to parse the expression
multiple times to achieve this.
I used variants of the included benchmark to evaluate this change. I
used the arguments -benchtime 20s -cpu 1,2,4,8,16 on a machine with 16
hardware cores.
Comparing a single shared Regexp vs. copied Regexps, we can see that
lock contention causes huge slowdowns at higher levels of parallelism.
The copied version shows the expected linear speedup.
name old time/op new time/op delta
MatchParallel 366ns ± 0% 370ns ± 0% +1.09% (p=0.000 n=10+8)
MatchParallel-2 324ns ±28% 184ns ± 1% -43.37% (p=0.000 n=10+10)
MatchParallel-4 352ns ± 5% 93ns ± 1% -73.70% (p=0.000 n=9+10)
MatchParallel-8 480ns ± 3% 46ns ± 0% -90.33% (p=0.000 n=9+8)
MatchParallel-16 510ns ± 8% 24ns ± 6% -95.36% (p=0.000 n=10+8)
I also compared a modified version of Regexp that has no mutex and a
single machine (the "RegexpForSingleGoroutine" rsc mentioned in
https://github.com/golang/go/issues/8232#issuecomment-66096128).
In this next test, I compared using N copied Regexps vs. N separate
RegexpForSingleGoroutines. This shows that, even for this relatively
simple regex, avoiding the lock entirely would only buy about 10-12%
further improvement.
name old time/op new time/op delta
MatchParallel 370ns ± 0% 322ns ± 0% -12.97% (p=0.000 n=8+8)
MatchParallel-2 184ns ± 1% 162ns ± 1% -11.60% (p=0.000 n=10+10)
MatchParallel-4 92.7ns ± 1% 81.1ns ± 2% -12.43% (p=0.000 n=10+10)
MatchParallel-8 46.4ns ± 0% 41.8ns ±10% -9.78% (p=0.000 n=8+10)
MatchParallel-16 23.7ns ± 6% 20.6ns ± 1% -13.14% (p=0.000 n=8+8)
Updates #8232.
Change-Id: I15201a080c363d1b44104eafed46d8df5e311902
Reviewed-on: https://go-review.googlesource.com/16110
Reviewed-by: Russ Cox <rsc@golang.org>
Regular expressions involving a (x){0} term are
simplified by removing this term from the
expression, just before the expression is compiled.
The number of subexpressions is evaluated before
the simplification. The number of capture instructions
in the compiled expressions is not necessarily in line
with the number of subexpressions.
When the ReplaceAll(String) methods are used, a number
of capture slots (nmatch) is evaluated as 2*(s+1)
(s being the number of subexpressions).
In some case, it can be higher than the number of capture
instructions evaluated at compile time, resulting in a
panic when the internal slices of regexp.machine
are resized to this value.
Fixed by capping the number of capture slots to the number
of capture instructions.
I must say I do not really see the benefits of setting
nmatch lower than re.prog.NumCap using this 2*(s+1) formula,
so perhaps this can be further simplified.
Fixes#11178Fixes#11176
Change-Id: I21415e8ef2dd5f2721218e9a679f7f6bfb76ae9b
Reviewed-on: https://go-review.googlesource.com/14013
Reviewed-by: Russ Cox <rsc@golang.org>
The existing comment for regex.Split contains a plain text example,
while many of the other regex functions have runnable examples. This
change provides a runnable example for Split.
Change-Id: I5373f57f532fe843d7d0adcf4b513061ec797047
Reviewed-on: https://go-review.googlesource.com/14737
Reviewed-by: Andrew Gerrand <adg@golang.org>
Run-TryBot: Andrew Gerrand <adg@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In 1.6, go doc is more likely to be available.
Change-Id: I970ad1d3317b35273f5c8d830f75713d3570c473
Reviewed-on: https://go-review.googlesource.com/10518
Reviewed-by: Andrew Gerrand <adg@golang.org>
Replaced code.google.com/p/re2/ with github.com/google/re2/ and
updated the file names (re2-exhaustive.txt.bz2 not re2.txt.gz)
as well as the re2 make command (make log).
Change-Id: I15937b0b8a898d78d45366857ed86421c8d69960
Reviewed-on: https://go-review.googlesource.com/9372
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This update makes maxBacktrackLen return 0 if
len(prog.Inst) > maxBacktrackProg. This prevents an attempt to
backtrack against a nil bitstate.
Fixes#10319
Change-Id: Icdbeb2392782ccf66f9d0a70ea57af22fb93f01b
Reviewed-on: https://go-review.googlesource.com/8473
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The stack blowout can no longer happen,
but we can still test that too-complex regexps
are rejected.
Replacement for CL 162770043.
LGTM=iant, r
R=r, iant
CC=bradfitz, golang-codereviews
https://golang.org/cl/162860043
This is already tested by TestRE2Exhaustive, but the build has
not broken because that test is not run when using -test.short.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/155580043