mirror of https://github.com/golang/go synced 2024-10-05 04:21:22 -06:00

84 Commits

Author SHA1 Message Date
Rob Pike
1b3c969ac3 regexp: identify that submatch is also known as capturing group
Mention the syntax is defined by the regexp/syntax package.
Fixes #3953.

R=golang-dev, dsymonds
CC=golang-dev
https://golang.org/cl/7702044
2013-03-11 16:23:06 -07:00
Andrew Gerrand
5fad786452 regexp: update comment on (*Regexp).Longest
Missed this review comment.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/7229084
2013-02-04 15:57:32 +11:00
Andrew Gerrand
f41ffc2bf4 regexp: add (*Regexp).Longest
Fixes #3696.

R=rsc
CC=golang-dev
https://golang.org/cl/7133051
2013-02-04 15:28:55 +11:00
Erik St. Martin
54b7ccd514 regexp: fix index panic in Replace
When a replacement references a subexpression ($1) that either doesn't exist or didn't participate in the match, Replace panics.
This patch checks that the match location isn't -1, to prevent out-of-bounds errors.
Fixes #3816.

R=franciscossouza, rsc
CC=golang-dev
https://golang.org/cl/6931049
2012-12-22 11:14:56 -05:00
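
The behaviour described in the commit above can be sketched against today's regexp package. This is an illustration, not the patch itself: the helper names expandMissing and expandOutOfRange are hypothetical, and the example assumes the currently documented rule that a reference to an unmatched or out-of-range group expands to the empty string.

```go
package main

import (
	"fmt"
	"regexp"
)

// expandMissing replaces each match of `a(b)?` in s with "[$1]".
// When group 1 does not participate in a match, $1 expands to "".
func expandMissing(s string) string {
	re := regexp.MustCompile(`a(b)?`)
	return re.ReplaceAllString(s, "[$1]")
}

// expandOutOfRange references $2, a group the pattern does not define.
// The reference expands to "" instead of panicking.
func expandOutOfRange(s string) string {
	re := regexp.MustCompile(`a(b)?`)
	return re.ReplaceAllString(s, "[$2]")
}

func main() {
	fmt.Println(expandMissing("ab a"))  // prints "[b] []"
	fmt.Println(expandOutOfRange("ab")) // prints "[]"
}
```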
Rick Arnold
94b3f6d728 regexp: add Split
As discussed in issue 2672 and on golang-nuts, this CL adds a Split() method
to regexp. It is based on returning the "opposite" of FindAllString() so that
the returned substrings are everything not matched by the expression.

See: https://groups.google.com/forum/?fromgroups=#!topic/golang-nuts/xodBZh9Lh2E

Fixes #2762.

R=remyoudompheng, r, rsc
CC=golang-dev
https://golang.org/cl/6846048
2012-11-27 12:58:27 -05:00
Robert Griesemer
465b9c35e5 gofmt: apply gofmt -w src misc
Remove trailing whitespace in comments.
No other changes.

R=r
CC=golang-dev
https://golang.org/cl/6815053
2012-10-30 13:38:01 -07:00
Rob Pike
4783ad82da regexp: fix glitch in doc for FindReaderIndex
Fixes #3878.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6457054
2012-07-30 12:46:50 -07:00
David G. Andersen
e66d29cdcf pkg: Removing duplicated words ("of of", etc.), mostly from comments.
Ran 'double.pl' on the pkg tree to identify doubled words.
One change to an error string return in x509;  the rest are in comments.
Thanks to Matt Jibson for the idea.

R=golang-dev, bsiegert
CC=golang-dev
https://golang.org/cl/6344089
2012-07-09 09:16:10 +10:00
Rob Pike
43cf5505fc regexp: fix a couple of bugs in the documentation
Byte slices are not strings.

Fixes #3687.

R=golang-dev, dsymonds
CC=golang-dev
https://golang.org/cl/6257074
2012-05-30 21:57:50 -07:00
Brad Fitzpatrick
9cd4a0467a regexp: name result parameters referenced from docs
Fixes #2953

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5653051
2012-02-10 10:22:01 +11:00
Russ Cox
5957f914e2 regexp: fix typo
Fixes #2918.

TBR=golang-dev
CC=golang-dev
https://golang.org/cl/5639062
2012-02-08 08:59:59 -05:00
Russ Cox
7201ba2171 regexp: allow substitutions in Replace, ReplaceString
Add Expand, ExpandString for access to the substitution functionality.

Fixes #2736.

R=r, bradfitz, r, rogpeppe, n13m3y3r
CC=golang-dev
https://golang.org/cl/5638046
2012-02-07 23:46:47 -05:00
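
As a rough illustration of the substitution support added in the commit above, the sketch below uses ReplaceAllString and ExpandString as documented in the current regexp package. The pattern, the template, and the helper names swap and expandFirst are assumptions made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

var addr = regexp.MustCompile(`(\w+)@(\w+)\.com`)

// swap rewrites every match using a template; $1 and $2 refer to the
// first and second capturing groups.
func swap(s string) string {
	return addr.ReplaceAllString(s, "$2:$1")
}

// expandFirst applies the same template to the first match only, using
// ExpandString directly. It returns "" when there is no match.
func expandFirst(s string) string {
	m := addr.FindStringSubmatchIndex(s)
	if m == nil {
		return ""
	}
	return string(addr.ExpandString(nil, "$2:$1", s, m))
}

func main() {
	fmt.Println(swap("bob@example.com"))               // prints "example:bob"
	fmt.Println(expandFirst("mail bob@example.com now")) // prints "example:bob"
}
```

ReplaceAllString keeps the unmatched text around each match, while ExpandString returns only the expanded template, which is why the second call drops "mail" and "now".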
Brad Fitzpatrick
73ce14d0aa regexp: remove vestigial Error type
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5573069
2012-01-25 14:50:37 -08:00
Olivier Duperray
e5c1f3870b pkg: Add & fix Copyright of "hand generated" files
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/5554064
2012-01-19 10:14:56 -08:00
Russ Cox
21d3721eb8 regexp: add SubexpNames
Fixes #2440.

R=r, dsymonds
CC=golang-dev
https://golang.org/cl/5559043
2012-01-19 01:24:01 -05:00
Russ Cox
2f2cc24cd8 regexp: avoid allocation of input interface
Matters most for small inputs, because there is no real work
to amortize the allocation effort against.

benchmark                                old ns/op    new ns/op    delta
BenchmarkLiteral                               613          473  -22.84%
BenchmarkNotLiteral                           4981         4931   -1.00%
BenchmarkMatchClass                           7289         7122   -2.29%
BenchmarkMatchClass_InRange                   6618         6663   +0.68%
BenchmarkReplaceAll                           7843         7233   -7.78%
BenchmarkAnchoredLiteralShortNonMatch          329          228  -30.70%
BenchmarkAnchoredLiteralLongNonMatch           322          228  -29.19%
BenchmarkAnchoredShortMatch                    838          715  -14.68%
BenchmarkAnchoredLongMatch                     824          715  -13.23%

benchmark                                 old MB/s     new MB/s  speedup
BenchmarkMatchEasy0_32                      119.73       196.61    1.64x
BenchmarkMatchEasy0_1K                      540.58       538.33    1.00x
BenchmarkMatchEasy0_32K                     732.57       714.00    0.97x
BenchmarkMatchEasy0_1M                      726.44       708.36    0.98x
BenchmarkMatchEasy0_32M                     707.77       691.45    0.98x
BenchmarkMatchEasy1_32                      102.12       136.11    1.33x
BenchmarkMatchEasy1_1K                      298.31       307.04    1.03x
BenchmarkMatchEasy1_32K                     273.56       274.43    1.00x
BenchmarkMatchEasy1_1M                      268.42       269.23    1.00x
BenchmarkMatchEasy1_32M                     266.15       267.34    1.00x
BenchmarkMatchMedium_32                       2.53         3.38    1.34x
BenchmarkMatchMedium_1K                       9.37         9.57    1.02x
BenchmarkMatchMedium_32K                      9.29         9.67    1.04x
BenchmarkMatchMedium_1M                       9.42         9.66    1.03x
BenchmarkMatchMedium_32M                      9.41         9.62    1.02x
BenchmarkMatchHard_32                         6.66         6.75    1.01x
BenchmarkMatchHard_1K                         6.81         6.85    1.01x
BenchmarkMatchHard_32K                        6.79         6.85    1.01x
BenchmarkMatchHard_1M                         6.82         6.83    1.00x
BenchmarkMatchHard_32M                        6.80         6.80    1.00x

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/5453076
2011-12-07 15:03:05 -05:00
Russ Cox
7a6a9755a9 regexp: fix doc comment
Fixes #2432.

R=r, r
CC=golang-dev
https://golang.org/cl/5376041
2011-11-09 13:46:54 -05:00
Rob Pike
45e3bcb343 renaming_3: gofix -r go1pkgrename src/pkg/[m-z]*
R=rsc
CC=golang-dev
https://golang.org/cl/5345045
2011-11-08 15:41:54 -08:00
Russ Cox
eb6929299b src/pkg/[n-z]*: gofix -r error -force=error
R=golang-dev, bsiegert, iant
CC=golang-dev
https://golang.org/cl/5294074
2011-11-01 22:05:34 -04:00
Russ Cox
3e52dadfd7 regexp: use rune
Public API of syntax tree changes.

R=golang-dev, r, gri
CC=golang-dev
https://golang.org/cl/5302046
2011-10-25 22:20:57 -07:00
Russ Cox
8f699a3fb9 regexp: speedups
MatchEasy0_1K        500000        4207 ns/op   243.35 MB/s
MatchEasy0_1K_Old    500000        4625 ns/op   221.40 MB/s
MatchEasy0_1M           500     3948932 ns/op   265.53 MB/s
MatchEasy0_1M_Old       500     3943926 ns/op   265.87 MB/s
MatchEasy0_32K        10000      122974 ns/op   266.46 MB/s
MatchEasy0_32K_Old    10000      123270 ns/op   265.82 MB/s
MatchEasy0_32M           10   127265400 ns/op   263.66 MB/s
MatchEasy0_32M_Old       10   127123500 ns/op   263.95 MB/s
MatchEasy1_1K        500000        5637 ns/op   181.63 MB/s
MatchEasy1_1K_Old     10000      100690 ns/op    10.17 MB/s
MatchEasy1_1M           200     7683150 ns/op   136.48 MB/s
MatchEasy1_1M_Old        10   145774000 ns/op     7.19 MB/s
MatchEasy1_32K        10000      239887 ns/op   136.60 MB/s
MatchEasy1_32K_Old      500     4508182 ns/op     7.27 MB/s
MatchEasy1_32M           10   247103500 ns/op   135.79 MB/s
MatchEasy1_32M_Old        1  4660191000 ns/op     7.20 MB/s
MatchMedium_1K        10000      160567 ns/op     6.38 MB/s
MatchMedium_1K_Old    10000      158367 ns/op     6.47 MB/s
MatchMedium_1M           10   162928000 ns/op     6.44 MB/s
MatchMedium_1M_Old       10   159699200 ns/op     6.57 MB/s
MatchMedium_32K         500     5090758 ns/op     6.44 MB/s
MatchMedium_32K_Old     500     5005800 ns/op     6.55 MB/s
MatchMedium_32M           1  5233973000 ns/op     6.41 MB/s
MatchMedium_32M_Old       1  5109676000 ns/op     6.57 MB/s
MatchHard_1K          10000      249087 ns/op     4.11 MB/s
MatchHard_1K_Old       5000      364569 ns/op     2.81 MB/s
MatchHard_1M              5   256050000 ns/op     4.10 MB/s
MatchHard_1M_Old          5   372446400 ns/op     2.82 MB/s
MatchHard_32K           200     7944525 ns/op     4.12 MB/s
MatchHard_32K_Old       100    11609380 ns/op     2.82 MB/s
MatchHard_32M             1  8144503000 ns/op     4.12 MB/s
MatchHard_32M_Old         1 11885434000 ns/op     2.82 MB/s

R=r, bradfitz
CC=golang-dev
https://golang.org/cl/5134049
2011-09-28 12:00:31 -04:00
Russ Cox
6c230fbc67 regexp: move to old/regexp, replace with exp/regexp
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/5127042
2011-09-26 18:33:13 -04:00
Rob Pike
d6f80e1a4c regexp: document that Regexp is thread-safe.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4667047
2011-06-29 15:41:09 +10:00
Nigel Tao
278952c393 regexp: add a package prefix to error strings.
R=r, r
CC=golang-dev
https://golang.org/cl/4630041
2011-06-17 10:50:38 +10:00
Rob Pike
7db904c1f6 regexp: add support for matching text read from things that implement
ReadRune.  (If you have a Reader but not a RuneReader, use bufio.)

The matching code is a few percent slower but significantly cleaner.

R=rsc
CC=golang-dev
https://golang.org/cl/4125046
2011-02-03 13:58:40 -08:00
Ben Lynn
eb56a79e99 regexp: reject bare ?
Minor cleanup:
  - removed a duplicate test case
  - added a function to remove repeated code
  - for consistency, replaced "return nil" with a panic at an
    unreachable point

Fixes #1428.

R=golang-dev, r, rsc
CC=golang-dev
https://golang.org/cl/4057042
2011-01-19 13:47:04 -05:00
Rob Pike
6a5a527173 regexp: implement early out for failed anchored search.
R=rsc
CC=golang-dev
https://golang.org/cl/3813045
2011-01-04 12:43:52 -08:00
Rob Pike
15cb7ed34f regexp: fix prefix bug.
After a prefix match, the old code advanced the length of the
prefix.  This is incorrect since the full match might begin
in the middle of the prefix. (Consider "aaaab+" matching
"aaaaaab").

Fixes #1373

R=rsc
CC=golang-dev
https://golang.org/cl/3795044
2011-01-03 11:35:34 -08:00
Rob Pike
c0d0d4ef05 regexp: fix performance bug, make anchored searches fail fast.
The bug was that for an anchored pattern such as ^x, the prefix
scan ignored the anchor, and could scan the whole file if there was
no x present.  The fix is to do prefix matching after the anchor;
the cost is minuscule; the speedups huge.

R=rsc, gri
CC=golang-dev
https://golang.org/cl/3837042
2011-01-03 11:31:51 -08:00
Rob Pike
a9e7c9381e regexp: change Expr() to String(); add HasOperator method to Regexp.
It reports whether a regular expression has operators
as opposed to matching literal text.

R=rsc, gri
CC=golang-dev
https://golang.org/cl/3731041
2010-12-17 10:23:46 -08:00
Rob Pike
5bd4094d2e regexp: add HasMeta and regexp.Expr().
The former is a boolean function to test whether a string
contains a regular expression metacharacter; the second
returns the string used to compile the regexp.

R=gri, rsc
CC=golang-dev
https://golang.org/cl/3728041
2010-12-16 16:55:26 -08:00
Rob Pike
da1cbe5d11 regexp: simplify code for brackets, per rsc suggestion
R=rsc
CC=golang-dev
https://golang.org/cl/3545044
2010-12-14 12:01:35 -08:00
Rob Pike
8bb9e616ed regexp: speed up by about 30%.
The code used interfaces in a pretty, pedagogical way but not efficiently.
Remove unnecessary interface code for significant speedups.
Before:

	regexp.BenchmarkLiteral	 1000000	      2629 ns/op
	regexp.BenchmarkNotLiteral	  100000	     18131 ns/op
	regexp.BenchmarkMatchClass	  100000	     26647 ns/op
	regexp.BenchmarkMatchClass_InRange	  100000	     27092 ns/op
	regexp.BenchmarkReplaceAll	  100000	     27014 ns/op

After:

	regexp.BenchmarkLiteral	 1000000	      2077 ns/op
	regexp.BenchmarkNotLiteral	  100000	     13738 ns/op
	regexp.BenchmarkMatchClass	  100000	     20418 ns/op
	regexp.BenchmarkMatchClass_InRange	  100000	     20999 ns/op
	regexp.BenchmarkReplaceAll	  100000	     21825 ns/op

There's likely more to do without major surgery, but this is a simple, significant step.

R=rsc
CC=golang-dev
https://golang.org/cl/3572042
2010-12-14 11:15:32 -08:00
Kyle Consalus
009aebdba8 Removed bytes.Add and bytes.AddByte; we now have 'append'.
Changed all uses of bytes.Add (aside from those testing bytes.Add) to append(a, b...).
Also ran "gofmt -s" and made use of copy([]byte, string) in the fasta benchmark.

R=golang-dev, r, r2
CC=golang-dev
https://golang.org/cl/3302042
2010-12-01 11:59:13 -08:00
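
A small sketch of the two built-in idioms the commit above switches to: append(a, b...) in place of bytes.Add, and copy from a string into a byte slice. The helper names concat and fill are hypothetical.

```go
package main

import "fmt"

// concat appends the bytes of b to a, as append(a, b...) does.
// The result may share a's backing array when a has spare capacity.
func concat(a, b []byte) []byte {
	return append(a, b...)
}

// fill copies as much of s as fits into a new buffer of n bytes and
// returns the bytes actually copied. copy accepts a string source
// when the destination is a []byte.
func fill(n int, s string) []byte {
	buf := make([]byte, n)
	copied := copy(buf, s)
	return buf[:copied]
}

func main() {
	fmt.Println(string(concat([]byte("foo"), []byte("bar")))) // prints "foobar"
	fmt.Println(string(fill(3, "hello")))                     // prints "hel"
}
```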
Adam Langley
3cb4bdb9ce utf8: make EncodeRune's destination the first argument.
R=r
CC=golang-dev
https://golang.org/cl/3364041
2010-11-30 16:59:43 -05:00
Rob Pike
1f4d54ea01 regexp: eliminate vector in favor of append.
R=rsc
CC=golang-dev
https://golang.org/cl/2795041
2010-10-28 15:54:01 -07:00
Russ Cox
69c4e9380b use append
R=gri, r, r2
CC=golang-dev
https://golang.org/cl/2743042
2010-10-27 19:47:23 -07:00
Rob Pike
4659f6de38 regexp: delete Iter methods
They are unused and not that useful anyway.

R=rsc
CC=golang-dev
https://golang.org/cl/2225045
2010-09-21 21:21:44 +10:00
Rob Pike
ca3b5222eb regexp: interpret all Go character escapes \a \b \f \n \r \t \v
R=rsc
CC=golang-dev
https://golang.org/cl/2042044
2010-08-30 14:06:59 +10:00
Rob Pike
69fe3dd754 regexp: grow slices dynamically in the 'All' routines.
R=rsc
CC=golang-dev
https://golang.org/cl/1953044
2010-08-16 15:17:34 -07:00
Rob Pike
079a117469 regexp: delete the deprecated methods and tests.
R=golang-dev
CC=golang-dev
https://golang.org/cl/1956044
2010-08-12 17:16:37 +10:00
Rob Pike
6610d79eda regexp: new regularized methods for matching.
The previous set was spotty, incomplete, and confusing.
This CL proposes a regular, clean set with clearer names.
It's also complete.  Many existing methods will be deprecated,
but not in this CL.  Ditto for the tests.

R=rsc, gri
CC=golang-dev, rog
https://golang.org/cl/1946041
2010-08-12 14:41:52 +10:00
Rob Pike
46db2e3c25 regexp: document that backslashes are the escape character.
Fixes #1013.

R=rsc, gri
CC=golang-dev
https://golang.org/cl/1938041
2010-08-09 15:11:02 -07:00
Rob Pike
a8cd6c2012 regexp: bug fix: need to track whether match begins with fixed prefix.
Fixes #872.

R=rsc
CC=golang-dev
https://golang.org/cl/1731043
2010-06-22 16:02:14 -07:00
Kyle Consalus
aae02a1855 Optimization to regexp _CharClass: keep track of overall range of
charclass to avoid unnecessarily iterating over ranges.
Also, use the fact that IntVector is an []int to avoid method calls.
On my machine, this brings us from ~27500 ns/op to ~17500 ns/op in the
benchmark I've added. (It is also faster in the case where a range check
doesn't help; I added a benchmark for this too.)

I'd also like to propose that "[]" and "[^]" be disallowed. They aren't
useful as far as I can tell, they aren't widely supported, and they make
reasoning about character classes a bit more complicated.

R=r
CC=golang-dev
https://golang.org/cl/1495041
2010-06-02 23:04:44 -07:00
Russ Cox
6f33f34bbc regexp: allow escaping of any punctuation
More in line with other regexp packages
and egrep; accommodates overzealous escapers.

R=r
CC=golang-dev
https://golang.org/cl/1008041
2010-04-26 10:00:18 -07:00
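
To illustrate the commit above, the sketch below escapes punctuation that has no special meaning and checks that it still matches literally. It runs against the current regexp package, whose parser postdates this commit, so treat it as an illustration of the behaviour rather than of the original code. The helper names matches and compiles are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches s, or false if the pattern
// does not compile.
func matches(pattern, s string) bool {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false
	}
	return re.MatchString(s)
}

// compiles reports whether pattern is accepted by the parser.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// '-' and '!' need no escaping here, but escaping them is accepted.
	fmt.Println(matches(`^a\-b\!$`, "a-b!")) // prints "true"
	// Escaping an ordinary letter is still an error.
	fmt.Println(compiles(`\q`)) // prints "false"
}
```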
Rob Pike
b12007c4ed testing/regexp: use recover.
R=rsc
CC=golang-dev
https://golang.org/cl/816042
2010-03-31 17:57:50 -07:00
Rob Pike
7de610cc61 regexp: use panic/recover to handle errors
R=rsc, gri
CC=golang-dev
https://golang.org/cl/821046
2010-03-31 15:58:21 -07:00
Rob Pike
7ffe938f08 regexp: don't return non-nil *Regexp if there is an error.
R=gri
CC=golang-dev
https://golang.org/cl/787041
2010-03-26 16:18:20 -07:00
Rob Pike
325cf8ef21 delete all uses of panicln by rewriting them using panic or,
in the tests, println+panic.
gofmt some tests too.

R=rsc
CC=golang-dev
https://golang.org/cl/741041
2010-03-24 16:46:53 -07:00