mirror of https://github.com/golang/go synced 2024-10-05 15:51:22 -06:00

8 Commits

Author SHA1 Message Date
Marcel van Lohuizen
d673c95d6c exp/norm: Added some benchmarks for form-specific performance measurements.
R=r
CC=golang-dev
https://golang.org/cl/5605051
2012-02-02 13:19:12 +01:00
Marcel van Lohuizen
cadbd3ea49 exp/norm: fixed two unrelated bugs in normalization library.
1) incorrect length given for out buffer in String.
2) patchTail bug that could cause characters to be lost
   when crossing into the out-buffer boundary.

Added tests to expose these bugs.  Also slightly improved
performance of Bytes() and String() by sharing the reorderBuffer
across operations.

Fixes #2567.

R=r
CC=golang-dev
https://golang.org/cl/5502069
2011-12-23 18:21:26 +01:00
Russ Cox
c945f77f41 exp/norm: use rune
Nothing terribly interesting here. (!)

Since the public APIs are all in terms of UTF-8,
the changes are all internal only.

R=mpvl, gri, r
CC=golang-dev
https://golang.org/cl/5309042
2011-10-25 22:26:12 -07:00
Marcel van Lohuizen
5844fc1b21 exp/norm: introduced input interface to implement string versions
of methods.

R=r, mpvl
CC=golang-dev
https://golang.org/cl/5166045
2011-10-05 10:44:11 -07:00
Robert Griesemer
9c643bb3fa exp/norm: fix benchmark bug
- don't use range over string to copy string bytes
- some code simplification

R=mpvl
CC=golang-dev
https://golang.org/cl/5144044
2011-09-26 18:23:21 -07:00
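The "range over string" pitfall named in 9c643bb3fa can be illustrated with a small sketch. The commit message does not show the original benchmark code, so the two helpers below (copyByRange and copyBytes) are hypothetical names used only to contrast the patterns: ranging over a Go string yields decoded runes, not bytes, so it is not a faithful way to copy the underlying UTF-8 bytes.

```go
package main

import "fmt"

// copyByRange is a hypothetical example of the buggy pattern: ranging over
// a string yields one rune per code point, so converting each rune to a
// byte drops or truncates the bytes of any non-ASCII character.
func copyByRange(s string) []byte {
	b := make([]byte, 0, len(s))
	for _, r := range s {
		b = append(b, byte(r))
	}
	return b
}

// copyBytes copies the raw UTF-8 bytes of s, which is what a benchmark
// operating on []byte input would need.
func copyBytes(s string) []byte {
	b := make([]byte, len(s))
	copy(b, s)
	return b
}

func main() {
	s := "\u00e9" // U+00E9, encoded as the two bytes 0xC3 0xA9
	fmt.Println(len(copyByRange(s)), len(copyBytes(s))) // prints: 1 2
}
```

For ASCII-only input the two helpers agree, which is one way such a bug could go unnoticed until the benchmark data contains multi-byte characters.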
Marcel van Lohuizen
d5e24b6975 exp/norm: performance improvements of quickSpan
- fixed performance bug that could lead to O(n^2) behavior
- performance improvement for ASCII case

R=r, r
CC=golang-dev
https://golang.org/cl/4956060
2011-09-05 19:09:20 +02:00
Marcel van Lohuizen
2517143957 exp/norm: added Reader and Writer and bug fixes to support these.
Needed to ensure that finding the last boundary does not result in O(n^2)-like behavior.
Now prevents lookbacks beyond 31 characters across the board (starter + 30 non-starters).
composition.go:
- maxCombiningCharacters now means exactly that.
- Bug fix.
- Small performance improvement; made code consistent with other code.
forminfo.go:
- Bug fix: ccc needs to be 0 for inert runes.
normalize.go:
- A few bug fixes.
- Limit the amount of combining characters considered in FirstBoundary.
- Ditto for LastBoundary.
- Changed semantics of LastBoundary to not consider trailing illegal runes a boundary
  as long as adding bytes might still make them legal.
trie.go:
- As utf8.UTFMax is 4, we should treat UTF-8 encodings of size 5 or greater as illegal.
  This has no impact on the normalization process, but it prevents buffer overflows
  where we expect at most UTFMax bytes.

R=r
CC=golang-dev
https://golang.org/cl/4963041
2011-09-02 12:39:35 +02:00
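The trie.go note in 2517143957 (treating UTF-8 encodings of size 5 or greater as illegal because utf8.UTFMax is 4) can be sketched with the standard library. This is not the exp/norm trie code; isOversizedLead is a hypothetical helper, and the byte ranges are those of the obsolete 5- and 6-byte lead forms.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isOversizedLead reports whether b is a lead byte of the obsolete 5-byte
// (0xF8..0xFB) or 6-byte (0xFC..0xFD) UTF-8 forms. Hypothetical helper,
// not taken from exp/norm.
func isOversizedLead(b byte) bool {
	return b >= 0xF8 && b <= 0xFD
}

func main() {
	// A 5-byte sequence in the obsolete form: lead byte plus four
	// continuation bytes.
	in := []byte{0xF8, 0x88, 0x80, 0x80, 0x80}

	// The standard decoder rejects the lead byte and consumes only one
	// byte, so callers never need a buffer larger than utf8.UTFMax.
	r, size := utf8.DecodeRune(in)
	fmt.Println(utf8.UTFMax, r == utf8.RuneError, size) // prints: 4 true 1
	fmt.Println(isOversizedLead(in[0]))                 // prints: true
}
```

Rejecting such lead bytes up front means any code that sizes a scratch buffer to utf8.UTFMax bytes cannot be made to read or write past it by malformed input.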
Marcel van Lohuizen
d9c9c48797 exp/norm: added implementation for []byte versions of methods.
R=r
CC=golang-dev
https://golang.org/cl/4925041
2011-08-22 12:52:04 +02:00