1
0
mirror of https://github.com/golang/go synced 2024-10-05 15:51:22 -06:00
Commit Graph

9 Commits

Author SHA1 Message Date
Marcel van Lohuizen
a52fb458df exp/norm: merged charinfo and decomposition tables. As a result only
one trie lookup per rune is needed. See forminfo.go for a description
of the new format.  Also included leading and trailing canonical
combining class in decomposition information.  This will often avoid
additional trie lookups.

R=r, r
CC=golang-dev
https://golang.org/cl/5616071
2012-02-13 14:54:46 +01:00
Marcel van Lohuizen
8ba20dbdb5 exp/norm: a few minor changes in prepration for a table format change:
- Unified bounary conditions for NFC and NFD and removed some indirections.
   This enforces boundaries at the character level, which is typically what
   the user expects. (NFD allows a boundary between 'a' and '`', for example,
   which may give unexpected results for collation.  The current implementation
   is already stricter than the standard, so nothing much changes.  This change
   just formalizes it.
 - Moved methods of qcflags to runeInfo.
 - Swapped YesC and YesMaybe bits in qcFlags. This is to aid future changes.
 - runeInfo return values use named fields in preperation for struct change.
 - Replaced some left-over uint32s with rune.

R=r
CC=golang-dev
https://golang.org/cl/5607050
2012-02-02 13:55:53 +01:00
Rob Pike
30aa701fec renaming_2: gofix -r go1pkgrename src/pkg/[a-l]*
R=rsc
CC=golang-dev
https://golang.org/cl/5358041
2011-11-08 15:40:58 -08:00
Russ Cox
c945f77f41 exp/norm: use rune
Nothing terribly interesting here. (!)

Since the public APIs are all in terms of UTF-8,
the changes are all internal only.

R=mpvl, gri, r
CC=golang-dev
https://golang.org/cl/5309042
2011-10-25 22:26:12 -07:00
Marcel van Lohuizen
5844fc1b21 exp/norm: introduced input interface to implement string versions
of methods.

R=r, mpvl
CC=golang-dev
https://golang.org/cl/5166045
2011-10-05 10:44:11 -07:00
Marcel van Lohuizen
2517143957 exp/norm: added Reader and Writer and bug fixes to support these.
Needed to ensure that finding the last boundary does not result in O(n^2)-like behavior.
Now prevents lookbacks beyond 31 characters across the board (starter + 30 non-starters).
composition.go:
- maxCombiningCharacters now means exactly that.
- Bug fix.
- Small performance improvement/ made code consistent with other code.
forminfo.go:
- Bug fix: ccc needs to be 0 for inert runes.
normalize.go:
- A few bug fixes.
- Limit the amount of combining characters considered in FirstBoundary.
- Ditto for LastBoundary.
- Changed semantics of LastBoundary to not consider trailing illegal runes a boundary
  as long as adding bytes might still make them legal.
trie.go:
- As utf8.UTFMax is 4, we should treat UTF-8 encodings of size 5 or greater as illegal.
  This has no impact on the normalization process, but it prevents buffer overflows
  where we expect at most UTFMax bytes.

R=r
CC=golang-dev
https://golang.org/cl/4963041
2011-09-02 12:39:35 +02:00
Marcel van Lohuizen
4a4fa38d0e exp/norm: Reduced the size of the byte buffer used by reorderBuffer by half by reusing space when combining.
R=r
CC=golang-dev
https://golang.org/cl/4939042
2011-08-24 11:05:45 +02:00
Marcel van Lohuizen
45b7084b92 exp/norm: a few minor fixes to support the implementation of norm.
maketables.go/tables.go
- Properly set combinesForward flag for JamoL and JamoV.
- Fixed Printf bug.
composition.go
- Make insertString use the same control flow as insert.
- Better Hangul and non-Hangul mixing.
forminfo.go
- Fixed bug in compBoundaryBefore that affected a few esoteric cases.
- Buffer overflow now tested in normalize_test.go (other CL).

R=r
CC=golang-dev
https://golang.org/cl/4924041
2011-08-22 12:11:29 +02:00
Marcel van Lohuizen
b40bd5efb7 exp/norm: implementation of decomposition and composing functionality.
forminfo.go:
- Wrappers for table data.
- Per Form dispatch table.
composition.go:
- reorderBuffer type.  Implements decomposition, reordering, and composition.
- Note: decompose and decomposeString fields in formInfo could be replaced by
  a pointer to the trie for the respective form.  The proposed design makes
  testing easier, though.
normalization.go:
- Temporarily added panic("not implemented") methods to make the tests run.
  These will be removed again with the next CL, which will introduce the
  implementation.

R=r, rogpeppe, mpvl, rsc
CC=golang-dev
https://golang.org/cl/4875043
2011-08-17 18:12:39 +10:00