mirror of https://github.com/golang/go synced 2024-10-04 08:31:22 -06:00

625 Commits

Author SHA1 Message Date
Andrew Balholm
74db9d298b exp/html: don't treat SVG <title> like HTML <title>
The content of an HTML <title> element is RCDATA, but the content of an SVG
<title> element is parsed as tags. Now the parser doesn't go into RCDATA
mode in foreign content.

Pass 4 additional tests.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6448111
2012-08-05 22:32:35 +10:00
Marcel van Lohuizen
89d40b911c exp/locale/collate: changed API of Builder to be more convenient
for dealing with CLDR files:
- Add now takes a list of indexes of colelems that are variables. Checking and
  handling is now done by the Builder.  VariableTop is now also properly generated
  using the Build method.
- Introduced separate Builder, called Tailoring, for creating tailorings of root
  table.  This clearly separates the functionality for building a table based on
  weights (the allkeys* files) versus tables based on LDML XML files.
- Tailorings are now added by two calls instead of one: SetAnchor and Insert.
  This more closely reflects the structure of LDML side and simplifies the
  implementation of both the client and library side.  It also preserves
  some information that is otherwise hard to recover for the Builder.
- Allow the LDML XML element extend to be passed to Insert.  This simplifies
  both client and library implementation.

R=r
CC=golang-dev
https://golang.org/cl/6454061
2012-08-03 09:01:21 +02:00
Andrew Balholm
2f39a33b6a exp/html: in parse tests, discard only one trailing newline
Pass 2 additional tests.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6454090
2012-08-03 09:31:45 +10:00
Nigel Tao
1916db786f html: make the low-level tokenizer also skip end-tag attributes.
R=andybalholm
CC=golang-dev
https://golang.org/cl/6453071
2012-08-03 09:29:16 +10:00
Rémy Oudompheng
37d7500f8d exp/types: set non-embedded method type during GcImport.
R=golang-dev, gri
CC=golang-dev
https://golang.org/cl/6445068
2012-08-02 16:24:09 -07:00
Robert Griesemer
a4ac339f43 exp/types: enable cycle checks again
Process a package's object in a reproducible
order (rather than in map order) so that we
get error messages in reproducible order.

R=r
CC=golang-dev
https://golang.org/cl/6449076
2012-08-01 16:37:06 -07:00
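As an illustration of the a4ac339f43 change above: Go map iteration order is not stable, so a common way to get reproducible output is to collect the keys and sort them first. The following is a minimal sketch of that idea only; `sortedNames` and the `map[string]int` stand-in for a package's objects are hypothetical and are not taken from exp/types.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedNames returns the keys of objects in sorted order so that callers
// can visit the entries in the same order on every run.
// Hypothetical helper; the real exp/types code may differ.
func sortedNames(objects map[string]int) []string {
	names := make([]string, 0, len(objects))
	for name := range objects {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	objects := map[string]int{"b": 1, "a": 2, "c": 3}
	for _, name := range sortedNames(objects) {
		fmt.Println(name, objects[name])
	}
}
```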
Andrew Balholm
dbbfbcc4a1 exp/html: implement escaping and double-escaping in scripts
The text inside <script> tags is not ordinary raw text; there are all sorts
of other complications. This CL implements those complications.

Pass 76 additional tests.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6443070
2012-08-01 14:45:35 +10:00
Robert Griesemer
152279f203 exp/types: Replace String method with TypeString function
This is more in sync with the rest of the package;
for instance, we have functions (not methods) to
deref or find the underlying type of a Type.

In the process use a single bytes.Buffer to create
the string representation for a type rather than
the (occasional) string concatenation.

R=r
CC=golang-dev
https://golang.org/cl/6458057
2012-07-31 19:30:18 -07:00
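The approach described in 152279f203 above (a function rather than a method, writing into a single bytes.Buffer) might look roughly like the sketch below. The `Type`, `Basic`, `Pointer`, and `Slice` definitions here are toy stand-ins invented for illustration, not the exp/types declarations.

```go
package main

import (
	"bytes"
	"fmt"
)

// Toy type representations, assumed for this sketch only.
type Type interface{}
type Basic struct{ Name string }
type Pointer struct{ Elem Type }
type Slice struct{ Elem Type }

// TypeString returns a string representation of t. All recursive calls
// write into one shared buffer instead of concatenating strings.
func TypeString(t Type) string {
	var buf bytes.Buffer
	writeType(&buf, t)
	return buf.String()
}

// writeType appends the representation of t to buf.
func writeType(buf *bytes.Buffer, t Type) {
	switch t := t.(type) {
	case *Basic:
		buf.WriteString(t.Name)
	case *Pointer:
		buf.WriteByte('*')
		writeType(buf, t.Elem)
	case *Slice:
		buf.WriteString("[]")
		writeType(buf, t.Elem)
	default:
		fmt.Fprintf(buf, "<%T>", t)
	}
}

func main() {
	t := &Slice{Elem: &Pointer{Elem: &Basic{Name: "int"}}}
	fmt.Println(TypeString(t))
}
```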
Robert Griesemer
dcb6f59811 exp/types: implement Type.String methods for testing/debugging
Also:
- replaced existing test with a more comprehensive test
- fixed bug in map type creation

R=r
CC=golang-dev
https://golang.org/cl/6450072
2012-07-31 17:09:12 -07:00
Andrew Balholm
9f3b00579e exp/html: tokenize attributes of end tags
If an end tag has an attribute that is a quoted string containing '>',
the tokenizer would end the tag prematurely. Now it reads the attributes
on end tags just as it does on start tags, but the high-level interface
still doesn't return them, because their presence is a parse error.

Pass 1 additional test.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6457060
2012-08-01 09:35:02 +10:00
Andrew Balholm
eff32f573b exp/html: replace NUL with U+FFFD in text in foreign content
Pass 5 additional tests.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6452055
2012-07-29 16:29:49 +10:00
Marcel van Lohuizen
601045e87a exp/locale/collate: changed trie in first step towards support for multiple locales.
- Allow handles into the trie for different locales.  Multiple tables share the same
  trie to allow for reuse of blocks.
- Significantly improved memory footprint and reduced allocations of trieNodes.
  This speeds up generation by about 30% and allows keeping trieNodes around
  for multiple locales during generation.
- Renamed print method to fprint.

R=r
CC=golang-dev
https://golang.org/cl/6408052
2012-07-28 18:44:14 +02:00
Andrew Balholm
a1f340fa1a exp/html: parse CDATA sections in foreign content
Also convert NUL to U+FFFD in comments.

Pass 23 additional tests.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6446055
2012-07-27 16:05:25 +10:00
Andrew Balholm
55f0c8b2cd exp/html: replace NUL bytes in plaintext, raw text, and RCDATA
If NUL bytes occur inside certain elements, convert them to U+FFFD
replacement character.

Pass 1 additional test.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6452047
2012-07-27 09:27:10 +10:00
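The NUL handling described in 55f0c8b2cd above can be pictured with a small sketch. This is not the tokenizer code from the CL; `replaceNUL` is a hypothetical helper that applies the same substitution to a whole string.

```go
package main

import (
	"fmt"
	"strings"
)

// replaceNUL replaces every NUL byte in s with U+FFFD, the Unicode
// replacement character. Hypothetical helper for illustration only.
func replaceNUL(s string) string {
	return strings.ReplaceAll(s, "\x00", "\uFFFD")
}

func main() {
	// %+q escapes non-ASCII runes so the replacement is visible.
	fmt.Printf("%+q\n", replaceNUL("a\x00b"))
}
```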
Andrew Wilkins
d399b681a4 exp/types: process ast.Fun in checkObj; fix variadic function building
Fixed creation of Func's, taking IsVariadic from parameter list rather
than results.

Updated checkObj to process ast.Fun objects.

R=gri
CC=golang-dev
https://golang.org/cl/6402046
2012-07-26 11:47:46 -07:00
Andrew Balholm
899be50991 exp/html: don't insert empty text nodes
Pass 1 additional test.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6443048
2012-07-26 10:32:24 +10:00
Andrew Balholm
4d22519678 exp/html: allow frameset if body contains whitespace
If the body of an HTML document contains text, the <frameset> tag is
ignored. But not if the text is only whitespace.

Pass 4 additional tests.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6442043
2012-07-25 12:09:58 +10:00
Andrew Balholm
f979528ce6 exp/html: special handling for entities in attributes
Don't unescape entities in attributes when they don't end with
a semicolon and they are followed by '=', a letter, or a digit.

Pass 6 more tests from the WebKit test suite, plus one that was
commented out in token_test.go.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6405073
2012-07-23 12:39:58 +10:00
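The rule stated in f979528ce6 above can be written as a small predicate. The sketch below is an assumption about how such a check could be structured, not the code from the CL; `shouldUnescapeInAttr` and its parameters are hypothetical.

```go
package main

import "fmt"

// shouldUnescapeInAttr reports whether an entity reference inside an
// attribute value should be decoded. s is the attribute value, end is the
// index just past the entity name, and hasSemicolon says whether the
// entity was terminated by ';'. Hypothetical helper for illustration.
func shouldUnescapeInAttr(s string, end int, hasSemicolon bool) bool {
	if hasSemicolon || end >= len(s) {
		return true
	}
	c := s[end]
	switch {
	case c == '=',
		'a' <= c && c <= 'z',
		'A' <= c && c <= 'Z',
		'0' <= c && c <= '9':
		// Followed by '=', a letter, or a digit: leave the text as-is.
		return false
	}
	return true
}

func main() {
	// "&copy" ends at index 9 and is followed by '=', so it stays escaped.
	fmt.Println(shouldUnescapeInAttr("?a=1&copy=2", 9, false))
	// "&copy" ends at index 5 and is followed by a space, so it is decoded.
	fmt.Println(shouldUnescapeInAttr("&copy 2012", 5, false))
}
```

This matters mostly for URLs in attributes such as href, where a query parameter like `&copy=2` would otherwise be corrupted.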
Marcel van Lohuizen
882b6ef454 exp/locale/collate: This CL includes the following changes:
- Changed the representation of colElem to support a few cases
  for some languages not supported by the current format.
- Changed offsets for implicit primary values. This makes the
  values both easier to read and debug (last 4 nibbles are identical to
  implicit primary value) and also results in better packing.
- Fixed bug in weight conversion code that did not pop up yet by
  sheer luck.
Note that tables.go also includes changes to the contraction trie
from CL 6346092.

R=r, mpvl
CC=golang-dev
https://golang.org/cl/6392060
2012-07-13 11:38:22 +02:00
Marcel van Lohuizen
adc19ac5e3 exp/locale/collate: adjusted contraction trie to support Myanmar (Burmese),
which has a rather large contraction table. The value of the next state
offset now starts after the current block, instead of before.  This is
slightly less efficient (one extra addition per state change), but gives
some extra range for the offsets.
Also introduced constants for final (0) and noIndex (0xFF).
tables.go is updated in a separate CL.

R=r
CC=golang-dev
https://golang.org/cl/6346092
2012-07-13 11:38:00 +02:00
Jan Ziak
b3382ec9e9 exp/inotify: prevent data race
Fixes #3713.

R=bradfitz, rsc
CC=golang-dev
https://golang.org/cl/6331055
2012-06-25 14:08:09 -04:00
Jan Ziak
f5f3c3fe09 exp/inotify: prevent data race during testing
Fixes #3714.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6341047
2012-06-24 19:22:48 -04:00
Marcel van Lohuizen
77b1378c3e exp/locale/collate: added regression test for collate package
based on UCA test files.

R=r
CC=golang-dev
https://golang.org/cl/6216056
2012-06-19 11:34:56 -07:00
Robert Griesemer
ca2ae27dd0 go/ast: multiple "blank" imports are permitted
R=rsc, dsymonds
CC=golang-dev
https://golang.org/cl/6303099
2012-06-18 21:56:41 -07:00
Nigel Tao
834edc4257 exp/html/atom: add some more atoms.
R=r, dsymonds
CC=golang-dev
https://golang.org/cl/6298085
2012-06-15 15:39:25 +10:00
Shenghou Ma
9852614291 exp/types: clean up objects after test
Fixes #3739.

R=bradfitz, rsc
CC=golang-dev
https://golang.org/cl/6295083
2012-06-15 02:52:18 +08:00
Nigel Tao
66429dcf75 exp/html: simplify some of the parser's internal methods.
benchmark          old ns/op    new ns/op    delta
BenchmarkParser      4006888      3950604   -1.40%

R=r, andybalholm
CC=golang-dev
https://golang.org/cl/6301070
2012-06-13 10:13:05 +10:00
Robert Griesemer
49d6e49087 exp/types: testing resolution of qualified identifiers
Also: fix a bug with exp/types/GcImport.

R=rsc, r
CC=golang-dev
https://golang.org/cl/6302060
2012-06-11 11:06:27 -07:00
Nigel Tao
6c204982e0 exp/html: check the context node for consistency when parsing fragments.
R=rsc
CC=golang-dev
https://golang.org/cl/6303053
2012-06-08 13:55:15 +10:00
Nigel Tao
c8fac7b967 exp/html: when parsing, compare atoms (ints) instead of strings.
This is the mechanical part of the 2-part change that started with
https://golang.org/cl/6305053/

R=rsc
CC=andybalholm, golang-dev, r
https://golang.org/cl/6295055
2012-06-07 13:46:57 +10:00
Nigel Tao
cd21eff705 exp/html: make the tokenizer return atoms for tag tokens.
This is part 1 of a 2 part changelist. Part 2 contains the mechanical
change to parse.go to compare atoms (ints) instead of strings.

The overall effect of the two changes is:
benchmark                      old ns/op    new ns/op    delta
BenchmarkParser                  4462274      4058254   -9.05%
BenchmarkRawLevelTokenizer        913202       912917   -0.03%
BenchmarkLowLevelTokenizer       1268626      1267836   -0.06%
BenchmarkHighLevelTokenizer      1947305      1968944   +1.11%

R=rsc
CC=andybalholm, golang-dev, r
https://golang.org/cl/6305053
2012-06-07 13:05:35 +10:00
Nigel Tao
90fa13d2b7 exp/html/atom: add more atoms.
This completely covers the tags used by exp/html's parser.

Before:
295 atoms; 1406 string bytes + 2048 tables = 3454 total data
BenchmarkLookup    50000         59841 ns/op

After:
322 atoms; 1508 string bytes + 2048 tables = 3556 total data
BenchmarkLookup    50000         60159 ns/op

R=r
CC=golang-dev
https://golang.org/cl/6296045
2012-06-07 09:35:35 +10:00
Marcel van Lohuizen
de0c1c9cf5 exp/locale/collate: somehow an incorrect version of tables was checked in earlier.
Regenerated tables using maketables.

R=r, rsc
CC=golang-dev
https://golang.org/cl/6248067
2012-06-04 18:35:26 +02:00
Russ Cox
192550592a exp/html/atom: faster Lookup with smaller tables
Use perfect cuckoo hash, to avoid binary search.
Define Atom bits as offset+len in long string instead
of enumeration, to avoid string headers.

Before: 1909 string bytes + 6060 tables = 7969 total data
After: 1406 string bytes + 2048 tables = 3454 total data

benchmark          old ns/op    new ns/op    delta
BenchmarkLookup        83878        64681  -22.89%

R=nigeltao, r
CC=golang-dev
https://golang.org/cl/6262051
2012-06-02 22:43:11 -04:00
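The "offset+len in long string" idea from 192550592a above can be sketched as follows. The bit layout (offset in the upper bits, length in the low 8 bits), the `atomText` contents, and `makeAtom` are assumptions chosen for illustration; the CL's generated tables are not reproduced here.

```go
package main

import "fmt"

// atomText is a toy stand-in for the single long string that holds
// every atom's text back to back.
const atomText = "abodydivspan"

// Atom packs an offset into atomText and a length into one integer,
// so no per-atom string header is needed. Assumed layout: off<<8 | len.
type Atom uint32

// makeAtom builds an Atom for atomText[off : off+n]. Hypothetical helper.
func makeAtom(off, n int) Atom {
	return Atom(off<<8 | n)
}

// String recovers the atom's text by slicing the shared string.
func (a Atom) String() string {
	off := int(a >> 8)
	n := int(a & 0xff)
	return atomText[off : off+n]
}

func main() {
	body := makeAtom(1, 4)
	div := makeAtom(5, 3)
	fmt.Println(body, div)
	// Atoms are plain integers, so comparison is an integer comparison.
	fmt.Println(body == makeAtom(1, 4))
}
```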
Nigel Tao
d2a6098e9c exp/html/atom: faster, hash-based lookup.
exp/html/atom benchmark:
benchmark          old ns/op    new ns/op    delta
BenchmarkLookup       199226        80770  -59.46%

exp/html benchmark:
benchmark                      old ns/op    new ns/op    delta
BenchmarkParser                  4864890      4510834   -7.28%
BenchmarkHighLevelTokenizer      2209192      1969684  -10.84%
benchmark                       old MB/s     new MB/s  speedup
BenchmarkParser                    16.07        17.33    1.08x
BenchmarkHighLevelTokenizer        35.38        39.68    1.12x

R=r
CC=golang-dev
https://golang.org/cl/6261054
2012-06-01 09:36:05 +10:00
Nigel Tao
bb4a817a92 exp/html/atom: new package.
50% fewer mallocs in HTML tokenization, resulting in 25% fewer mallocs
in parsing go1.html.

Making the parser use integer comparisons instead of string comparisons
will be a follow-up CL, to be co-ordinated with Andy Balholm's work.

exp/html benchmarks before/after:

BenchmarkParser	     500	   4754294 ns/op	  16.44 MB/s
        parse_test.go:409: 500 iterations, 14651 mallocs per iteration
BenchmarkRawLevelTokenizer	    2000	    903481 ns/op	  86.51 MB/s
        token_test.go:678: 2000 iterations, 28 mallocs per iteration
BenchmarkLowLevelTokenizer	    2000	   1260485 ns/op	  62.01 MB/s
        token_test.go:678: 2000 iterations, 41 mallocs per iteration
BenchmarkHighLevelTokenizer	    1000	   2165964 ns/op	  36.09 MB/s
        token_test.go:678: 1000 iterations, 6616 mallocs per iteration

BenchmarkParser	     500	   4664912 ns/op	  16.76 MB/s
        parse_test.go:409: 500 iterations, 11266 mallocs per iteration
BenchmarkRawLevelTokenizer	    2000	    903065 ns/op	  86.55 MB/s
        token_test.go:678: 2000 iterations, 28 mallocs per iteration
BenchmarkLowLevelTokenizer	    2000	   1260032 ns/op	  62.03 MB/s
        token_test.go:678: 2000 iterations, 41 mallocs per iteration
BenchmarkHighLevelTokenizer	    1000	   2143356 ns/op	  36.47 MB/s
        token_test.go:678: 1000 iterations, 3231 mallocs per iteration

R=r, rsc, rogpeppe
CC=andybalholm, golang-dev
https://golang.org/cl/6255062
2012-05-31 15:37:18 +10:00
Marcel van Lohuizen
c633f85f65 exp/locale/collate: avoid double building in maketables.go. Also added check.
R=r
CC=golang-dev
https://golang.org/cl/6202063
2012-05-30 17:47:56 +02:00
Andrew Balholm
4e0749a478 exp/html: Convert \r and \r\n to \n when tokenizing
Also escape "\r" as "&#13;" when rendering HTML.

Pass 2 additional tests.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6260046
2012-05-30 15:50:12 +10:00
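The newline handling described in 4e0749a478 above can be sketched on whole strings. The real change works inside the tokenizer; `normalizeNewlines` and `escapeCR` below are hypothetical helpers that only show the intended mapping.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeNewlines converts "\r\n" and lone "\r" to "\n".
// CRLF pairs are handled first so they yield one newline, not two.
func normalizeNewlines(s string) string {
	s = strings.ReplaceAll(s, "\r\n", "\n")
	return strings.ReplaceAll(s, "\r", "\n")
}

// escapeCR writes a carriage return as the character reference "&#13;",
// so that it survives re-tokenizing of the rendered output.
func escapeCR(s string) string {
	return strings.ReplaceAll(s, "\r", "&#13;")
}

func main() {
	fmt.Printf("%q\n", normalizeNewlines("a\r\nb\rc\n"))
	fmt.Printf("%q\n", escapeCR("a\rb"))
}
```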
Nigel Tao
034fa90dc1 exp/html: add some tokenizer and parser benchmarks.
$GOROOT/src/pkg/exp/html/testdata/go1.html is an execution of the
$GOROOT/doc/go1.html template by godoc.

Sample numbers on my linux,amd64 desktop:
BenchmarkParser	     500	   4699198 ns/op	  16.63 MB/s
--- BENCH: BenchmarkParser
        parse_test.go:409: 1 iterations, 14653 mallocs per iteration
        parse_test.go:409: 100 iterations, 14651 mallocs per iteration
        parse_test.go:409: 500 iterations, 14651 mallocs per iteration
BenchmarkRawLevelTokenizer	    2000	    904957 ns/op	  86.37 MB/s
--- BENCH: BenchmarkRawLevelTokenizer
        token_test.go:657: 1 iterations, 28 mallocs per iteration
        token_test.go:657: 100 iterations, 28 mallocs per iteration
        token_test.go:657: 2000 iterations, 28 mallocs per iteration
BenchmarkLowLevelTokenizer	    2000	   1134300 ns/op	  68.91 MB/s
--- BENCH: BenchmarkLowLevelTokenizer
        token_test.go:657: 1 iterations, 41 mallocs per iteration
        token_test.go:657: 100 iterations, 41 mallocs per iteration
        token_test.go:657: 2000 iterations, 41 mallocs per iteration
BenchmarkHighLevelTokenizer	    1000	   2096179 ns/op	  37.29 MB/s
--- BENCH: BenchmarkHighLevelTokenizer
        token_test.go:657: 1 iterations, 6616 mallocs per iteration
        token_test.go:657: 100 iterations, 6616 mallocs per iteration
        token_test.go:657: 1000 iterations, 6616 mallocs per iteration

R=rsc
CC=andybalholm, golang-dev, r
https://golang.org/cl/6257067
2012-05-30 13:00:32 +10:00
Robert Griesemer
bd7c626348 exp/types: properly read dotted identifiers
Fixes #3682.

R=rsc
CC=golang-dev
https://golang.org/cl/6256067
2012-05-29 13:15:13 -07:00
Russ Cox
95ae5c180e exp/types: disable test
It's broken and seems to be exp/types's fault.

Update #3682.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6243068
2012-05-29 13:33:37 -04:00
Andrew Balholm
9c14184e25 exp/html: implement Noah's Ark clause
Implement the (3-per-family) Noah's Ark clause (i.e. don't put
more than three identical elements on the list of active formatting
elements).

Also, when running tests, sort attributes by name before dumping
them.

Pass 4 additional tests with Noah's Ark clause (including one
that needs attributes to be sorted).

Pass 5 additional, unrelated tests because of sorting attributes.

R=nigeltao, rsc
CC=golang-dev
https://golang.org/cl/6247056
2012-05-29 13:39:54 +10:00
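The Noah's Ark clause from 9c14184e25 above can be sketched with a simplified list. This is an illustration under stated assumptions, not the exp/html code: `element` and `pushFormatting` are hypothetical, attributes are assumed to be pre-sorted and serialized into one string, and the markers that bound the search in the HTML5 spec are left out.

```go
package main

import "fmt"

// element is a simplified formatting element. Two elements count as
// identical when both the tag and the serialized attributes match.
type element struct {
	tag   string
	attrs string
}

// pushFormatting appends e to the list of active formatting elements.
// If three identical entries are already present, the earliest one is
// removed first, so at most three identical entries remain.
func pushFormatting(list []element, e element) []element {
	count, earliest := 0, -1
	for i := len(list) - 1; i >= 0; i-- {
		if list[i] == e {
			count++
			earliest = i
		}
	}
	if count >= 3 {
		list = append(list[:earliest], list[earliest+1:]...)
	}
	return append(list, e)
}

func main() {
	var list []element
	b := element{tag: "b"}
	for i := 0; i < 5; i++ {
		list = pushFormatting(list, b)
	}
	fmt.Println(len(list)) // never more than three identical entries
}
```

Comparing serialized attributes only works if they are in a canonical order, which is the same reason the commit sorts attributes before dumping them in tests.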
Andrew Balholm
c23041efd9 exp/html: adjust parseForeignContent to match spec
Remove redundant checks for integration points.

Ignore null bytes in text.

Don't break out of foreign content for a <font> tag unless it
has a color, face, or size attribute.

Check for MathML text integration points when breaking out of
foreign content.

Pass two new tests.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6256045
2012-05-25 10:03:59 +10:00
Russ Cox
ce69666273 exp/locale/collate: avoid 16-bit math
There's no need for the 16-bit arithmetic here,
and it tickles a long-standing compiler bug.
Fix the exp code not to use 16-bit math and
create an explicit test for the compiler bug.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6256048
2012-05-24 14:50:36 -04:00
Andrew Balholm
82e2272566 exp/html: detect "integration points" in SVG and MathML content
Detect HTML integration points and MathML text integration points.
At these points, process tokens as HTML, not as foreign content.

Pass 33 more tests.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6249044
2012-05-24 13:46:41 +10:00
Andrew Balholm
e947eba291 exp/html: update test data
Import updated test data from the WebKit Subversion repository (SVN revision 118111).

Some of the old tests were failing because we were HTML5 compliant, but the tests weren't.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6228049
2012-05-24 10:35:31 +10:00
Andrew Balholm
33a89b5fda exp/html: adjust the last few insertion modes to match the spec
Handle text, comment, and doctype tokens in afterBodyIM, afterAfterBodyIM,
and afterAfterFramesetIM.

Pass three more tests.

R=nigeltao
CC=golang-dev
https://golang.org/cl/6231043
2012-05-23 11:11:34 +10:00
Jan Ziak
fbaf59bf1e cmd/gc: export constants in hexadecimal
R=golang-dev, r, rsc, iant, remyoudompheng, dave
CC=golang-dev
https://golang.org/cl/6206077
2012-05-22 13:53:38 -04:00
Andrew Balholm
8f66d7dc32 exp/html: adjust inSelectIM to match spec
Simplify the flow of control.

Handle EOF, null bytes, <html>, <input>, <keygen>, <textarea>, <script>.

Pass 5 more tests.

R=golang-dev, rsc, nigeltao
CC=golang-dev
https://golang.org/cl/6220062
2012-05-22 15:30:13 +10:00
Andrew Balholm
7648f61c7d exp/html: adjust inCellIM to match spec
Clean up flow of control.

Ignore </table>, </tbody>, </tfoot>, </thead>, </tr> if there is not
an appropriate element in table scope.

Pass 3 more tests.

R=golang-dev, nigeltao
CC=golang-dev
https://golang.org/cl/6206093
2012-05-22 10:31:08 +10:00