mirror of https://github.com/golang/go synced 2024-10-04 00:21:20 -06:00
Commit Graph

172 Commits

Author SHA1 Message Date
Andrew Balholm
e25a83d03e html: close <button> element before opening a new one
Pass tests6.dat, test 13:
<button><button>

| <html>
|   <head>
|   <body>
|     <button>
|     <button>

Also pass tests through test 25:
<table><colgroup>foo

R=nigeltao
CC=golang-dev
https://golang.org/cl/5487072
2011-12-14 21:40:31 +11:00
Nigel Tao
66113ac818 html: update comments to match latest spec.
R=dsymonds
CC=golang-dev
https://golang.org/cl/5482054
2011-12-13 14:20:26 +11:00
Nigel Tao
b9064fb132 html: a first step at parsing foreign content (MathML, SVG).
Nodes now have a Namespace field.

Pass adoption01.dat, test 12:
<a><svg><tr><input></a>

| <html>
|   <head>
|   <body>
|     <a>
|       <svg svg>
|         <svg tr>
|           <svg input>

The other adoption01.dat tests already passed.

R=andybalholm
CC=golang-dev
https://golang.org/cl/5467075
2011-12-13 13:52:47 +11:00
Andrew Balholm
0c5443a0a6 html: don't ignore whitespace in or after framesets
Pass tests6.dat, test 7:
<frameset></frameset>
foo

| <html>
|   <head>
|   <frameset>
|   "
"

Also pass tests through test 12:
<form><form>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5480061
2011-12-12 13:18:01 +11:00
Rob Pike
5912869d61 html/template: make Must work
Fixes #2545.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5475054
2011-12-09 10:47:36 -08:00
Russ Cox
a250f37cbc update tree for new default type rule
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/5448091
2011-12-08 22:08:03 -05:00
Rob Pike
0397b28a90 html/template: clean up locking for ExecuteTemplate
R=mikesamuel, rogpeppe
CC=golang-dev
https://golang.org/cl/5448137
2011-12-08 10:15:53 -08:00
Rob Pike
ee8b597b1f html/template: simplify ExecuteTemplate a little
Allow the text template to handle the error case of no template
with the given name.
Simplification suggested by Mike Samuel.

R=mikesamuel
CC=golang-dev
https://golang.org/cl/5437147
2011-12-06 12:47:12 -08:00
Russ Cox
2666b815a3 use new strconv API
All but 3 cases (in gcimporter.go and hixie.go)
are automatic conversions using gofix.

No attempt is made to use the new Append functions
even though there are definitely opportunities.

R=golang-dev, gri
CC=golang-dev
https://golang.org/cl/5447069
2011-12-05 15:48:46 -05:00
Russ Cox
dcf1d7bc0e gofmt -s misc src
R=golang-dev, bradfitz, gri
CC=golang-dev
https://golang.org/cl/5451079
2011-12-02 14:14:25 -05:00
Andrew Balholm
a5d300862b html: allow whitespace between head and body
Also ignore <head> tag after </head>.

Pass tests6.dat, test 0:
<!doctype html></head> <head>

| <!DOCTYPE html>
| <html>
|   <head>
|   " "
|   <body>

Also pass tests through test 6:
<body>
<div>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5447064
2011-12-02 11:46:24 +11:00
Robert Griesemer
15a3a5cf6c gofmt: applied gofmt -w -s src misc
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5451070
2011-12-01 14:33:24 -08:00
Rob Pike
d38cc47c0c text/template: replace Add with AddParseTree
Makes it clear we're adding exactly one tree and creating a
new template for it.

R=rsc
CC=golang-dev
https://golang.org/cl/5448077
2011-12-01 09:19:53 -08:00
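A minimal sketch of what d38cc47c0c describes: adding exactly one parse tree to an existing template under a new name. It uses AddParseTree as it exists in today's text/template, which may differ from the 2011 signature; the helper name greet and the template names are made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// greet is a hypothetical helper: it parses two templates separately,
// then attaches the second one's parse tree to the first under the name "greet".
func greet(name string) (string, error) {
	base, err := template.New("base").Parse(`{{template "greet" .}}!`)
	if err != nil {
		return "", err
	}
	body, err := template.New("body").Parse("hello, {{.}}")
	if err != nil {
		return "", err
	}
	// AddParseTree adds one tree and creates a new associated template for it.
	if _, err := base.AddParseTree("greet", body.Tree); err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := base.Execute(&buf, name); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := greet("world")
	fmt.Println(out, err)
}
```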
Rob Pike
9a86e244bf html/template: make execution thread-safe
The problem is that execution can modify the template, so it needs
interlocking to have the same thread-safe guarantee as text/template.
Fixes #2439.

R=golang-dev, adg
CC=golang-dev
https://golang.org/cl/5450056
2011-11-30 20:11:57 -08:00
Andrew Balholm
ce27b00f48 html: implement fragment parsing algorithm
Pass the tests in tests4.dat.

R=nigeltao
CC=golang-dev
https://golang.org/cl/5447055
2011-12-01 12:47:57 +11:00
Rob Pike
07ee3cc741 html/template: update to new template API
Not quite done yet but enough is here to review.

Embedding is eliminated so clients can't accidentally reach
methods of text/template.Template that would break the
invariants.

TODO later: Add and Clone are unimplemented.
TODO later: address issue 2349

R=golang-dev, r, rsc
CC=golang-dev
https://golang.org/cl/5434077
2011-11-30 17:42:18 -05:00
Nigel Tao
849fc19cab html: clean up the z.rawTag calculation in the tokenizer.
R=andybalholm
CC=golang-dev
https://golang.org/cl/5440064
2011-11-30 17:00:37 +11:00
Andrew Balholm
3b3922771a html: parse <xmp> tags
Pass tests5.dat, test 10:
<p><xmp></xmp>

| <html>
|   <head>
|   <body>
|     <p>
|     <xmp>

Also pass the remaining tests in tests5.dat.

R=nigeltao
CC=golang-dev
https://golang.org/cl/5440062
2011-11-30 15:37:41 +11:00
Andrew Balholm
e32f4ba77d html: parse the contents of <iframe> elements as raw text
Pass tests5.dat, test 4:
<iframe> <!---> </iframe>x

| <html>
|   <head>
|   <body>
|     <iframe>
|       " <!---> "
|     "x"

Also pass tests through test 9:
<style> <!</-- </style>x

R=nigeltao
CC=golang-dev
https://golang.org/cl/5450044
2011-11-30 11:44:54 +11:00
Nigel Tao
929290d5a0 html: spin doctype.go out of parse.go.
R=andybalholm
CC=golang-dev
https://golang.org/cl/5445049
2011-11-29 18:20:59 +11:00
Andrew Balholm
c32b607687 html: detect quirks mode
Pass tests3.dat, test 23:
<p><table></table>

| <html>
|   <head>
|   <body>
|     <p>
|       <table>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5446043
2011-11-29 11:18:49 +11:00
Andrew Balholm
68e7363b56 html: parse <nobr> elements
Pass tests3.dat, test 20:
<!doctype html><nobr><nobr><nobr>

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <nobr>
|     <nobr>
|     <nobr>

Also pass tests through test 22:
<!doctype html><html><body><p><table></table></body></html>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5438056
2011-11-28 10:55:31 +11:00
Andrew Balholm
557ba72e69 html: ignore <head> tags in <head> element
Pass tests3.dat, test 12:
<!DOCTYPE html><HTML><META><HEAD></HEAD></HTML>

| <!DOCTYPE html>
| <html>
|   <head>
|     <meta>
|   <body>

Also pass tests through test 19:
<!DOCTYPE html><html><head></head><body><ul><li><div><p><li></ul></body></html>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5436069
2011-11-27 14:41:08 +11:00
Andrew Gerrand
38c082f69e html/template: fix documentation indent
R=nigeltao
CC=golang-dev
https://golang.org/cl/5437061
2011-11-25 13:32:44 +11:00
Andrew Balholm
af081cd43e html: ignore newline at the start of a <pre> block
Pass tests3.dat, test 4:
<!DOCTYPE html><html><head></head><body><pre>\n</pre></body></html>

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <pre>

Also pass tests through test 11:
<!DOCTYPE html><pre>&#x0a;&#x0a;A</pre>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5437051
2011-11-24 13:15:09 +11:00
Andrew Balholm
77b0ad1e80 html: parse DOCTYPE into name and public and system identifiers
Pass tests2.dat, test 59:
<!DOCTYPE <!DOCTYPE HTML>><!--<!--x-->-->

| <!DOCTYPE <!doctype>
| <html>
|   <head>
|   <body>
|     ">"
|     <!-- <!--x -->
|     "-->"

Pass all the tests in doctype01.dat.

Also pass tests2.dat, test 60:
<!doctype html><div><form></form><div></div></div>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5437045
2011-11-24 09:28:58 +11:00
Andrew Balholm
57ed39fd3b html: on EOF in a comment, ignore final dashes (up to 2)
Pass tests2.dat, test 57:
<!DOCTYPE html><!--x--

| <!DOCTYPE html>
| <!-- x -->
| <html>
|   <head>
|   <body>

Also pass test 58:
<!DOCTYPE html><table><tr><td></p></table>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5436048
2011-11-23 09:26:37 +11:00
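The rule in 57ed39fd3b (at EOF inside a comment, drop at most two trailing dashes) can be sketched as a standalone function. trimCommentTail is a hypothetical name for illustration and is not taken from the tokenizer source.

```go
package main

import (
	"fmt"
	"strings"
)

// trimCommentTail removes up to two trailing dashes from comment text
// that was cut off by EOF. Any further dashes are kept as comment data.
func trimCommentTail(s string) string {
	for i := 0; i < 2 && strings.HasSuffix(s, "-"); i++ {
		s = s[:len(s)-1]
	}
	return s
}

func main() {
	// "<!--x--" followed by EOF should yield the comment text "x".
	fmt.Printf("%q\n", trimCommentTail("x--"))
	fmt.Printf("%q\n", trimCommentTail("x---"))
}
```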
Andrew Balholm
95e60acb97 html: copy attributes from extra <html> tags to root element
Pass tests2.dat, test 50:
<!DOCTYPE html><html><body><html id=x>

| <!DOCTYPE html>
| <html>
|   id="x"
|   <head>
|   <body>

Also pass tests through test 56:
<!DOCTYPE html>X<p/x/y/z>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5432045
2011-11-22 12:08:22 +11:00
Andrew Balholm
750de28d6c html: ignore whitespace before <head> element
Pass tests2.dat, test 47:
" \n "
(That is, two spaces separated by a newline)

| <html>
|   <head>
|   <body>

Also pass tests through test 49:
<!DOCTYPE html><script>
</script>  <title>x</title>  </head>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5422043
2011-11-22 09:27:27 +11:00
Andrew Balholm
05d8d112fe html: refactor parse test infrastructure
My excuse for doing this is that test cases with newlines in them didn't
work. But instead of just fixing that, I rearranged everything in
parse_test.go to use fewer channels and pipes, and just call a
straightforward function to read test cases from a file.

R=nigeltao
CC=golang-dev
https://golang.org/cl/5410049
2011-11-20 22:42:28 +11:00
Andrew Gerrand
6c864210fc html/template: fix documentation formatting
See http://weekly.golang.org/pkg/html/template/

R=golang-dev, r, rsc
CC=golang-dev
https://golang.org/cl/5413055
2011-11-19 10:54:44 +11:00
Lucio De Re
5b9d7825ed html/template, net/http, websocket: fix import paths in comments
R=golang-dev
CC=golang-dev, rsc
https://golang.org/cl/5411048
2011-11-18 18:33:44 -05:00
Gustavo Niemeyer
f6279b46f8 html: fix doc after Err method name change
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5411045
2011-11-18 01:06:59 -02:00
Andrew Balholm
a1dbfa6f09 html: parse <isindex>
Pass tests2.dat, test 42:
<isindex test=x name=x>

| <html>
|   <head>
|   <body>
|     <form>
|       <hr>
|       <label>
|         "This is a searchable index. Enter search keywords: "
|         <input>
|           name="isindex"
|           test="x"
|       <hr>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5399049
2011-11-17 13:12:13 +11:00
Andrew Balholm
3276afd4d4 html: parse </optgroup> and </option>
Pass tests2.dat, test 35:
<!DOCTYPE html><select><optgroup><option></optgroup><option><select><option>

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <select>
|       <optgroup>
|         <option>
|       <option>
|     <option>

Also pass tests through test 41:
<!DOCTYPE html><!-- XXX - XXX - XXX -->

R=nigeltao, rsc
CC=golang-dev
https://golang.org/cl/5395045
2011-11-17 10:25:33 +11:00
Rob Pike
f5db4d05f2 html/template: indirect top-level values before printing
text/template does this (in an entirely different way), so
make html/template do the same. Before this fix, the template
{{.}} given a pointer to a string prints its address instead of its
value.

R=mikesamuel, r
CC=golang-dev
https://golang.org/cl/5370098
2011-11-16 09:32:52 -08:00
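A small sketch of the behavior f5db4d05f2 fixes, written against the current html/template API: the template {{.}} given a pointer to a string should print the string, not its address. The helper name render is an assumption for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// render executes the template "{{.}}" with data as the top-level value.
func render(data interface{}) (string, error) {
	tmpl, err := template.New("dot").Parse("{{.}}")
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	s := "hello"
	out, err := render(&s) // pass a *string, not a string
	fmt.Println(out, err)
}
```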
Andrew Balholm
3307597069 html: parse <optgroup> tags
Pass tests2.dat, test 34:
<!DOCTYPE html><select><option><optgroup>

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <select>
|       <option>
|       <optgroup>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5393045
2011-11-16 19:25:55 +11:00
Andrew Balholm
28546ed56a html: parse <caption> elements
Pass tests2.dat, test 33:
<!DOCTYPE html><table><caption>test TEST</caption><td>test

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <table>
|       <caption>
|         "test TEST"
|       <tbody>
|         <tr>
|           <td>
|             "test"

R=nigeltao
CC=golang-dev
https://golang.org/cl/5371099
2011-11-16 12:18:11 +11:00
Andrew Balholm
b91d82258f html: auto-close <p> elements when starting <form> element.
Pass tests2.dat, test 26:
<!doctypehtml><p><form>

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <p>
|     <form>

Also pass tests through test 32:
<!DOCTYPE html><!-- X

R=nigeltao
CC=golang-dev
https://golang.org/cl/5369114
2011-11-15 15:31:22 +11:00
Andrew Balholm
3bd5082f57 html: parse and render <plaintext> elements
Pass tests2.dat, test 10:
<table><plaintext><td>

| <html>
|   <head>
|   <body>
|     <plaintext>
|       "<td>"
|     <table>

Also pass tests through test 25:
<!doctypehtml><p><dd>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5369109
2011-11-15 11:39:18 +11:00
Andrew Balholm
06ef97e15d html: auto-close <dd> and <dt> elements
Pass tests2.dat, test 8:
<!DOCTYPE html><dt><div><dd>

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <dt>
|       <div>
|     <dd>

Also pass tests through test 9:
<script></x

R=nigeltao
CC=golang-dev
https://golang.org/cl/5373083
2011-11-13 23:27:20 +11:00
Andrew Balholm
631a575fd9 html: store the current insertion mode in the parser
Currently, the state transition functions in the HTML parser
return the next insertion mode and whether the token is consumed.
This works well except for when one insertion mode needs to use
the rules for another insertion mode. Then the useTheRulesFor
function needs to patch things up. This requires comparing functions
for equality, which is going to stop working.

Adding a field to the parser structure to store the current
insertion mode eliminates the need for useTheRulesFor;
one insertion mode function can now just call the other
directly. The insertion mode will be changed only if it needs to be.

This CL is an alternative to CL 5372078.

R=nigeltao, rsc
CC=golang-dev
https://golang.org/cl/5372079
2011-11-13 12:39:41 +11:00
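The design in 631a575fd9 can be illustrated with a toy state machine. This is a sketch under simplifying assumptions, not the parser's code: tokens are plain strings, and the names parser, im, inHeadIM, afterHeadIM, inBodyIM, and run are chosen for illustration. The current mode lives in a field, so one mode function can call another directly and the mode changes only when a function assigns to the field.

```go
package main

import "fmt"

// insertionMode handles one token and reports whether it consumed it.
type insertionMode func(p *parser, tok string) bool

type parser struct {
	im    insertionMode // current insertion mode, stored in the parser
	trace []string      // which mode's rules handled which token
}

func (p *parser) note(mode, tok string) { p.trace = append(p.trace, mode+":"+tok) }

func inHeadIM(p *parser, tok string) bool {
	switch tok {
	case "<meta>", "<link>":
		p.note("inHead", tok) // handled; the mode is left unchanged
		return true
	case "</head>":
		p.note("inHead", tok)
		p.im = afterHeadIM
		return true
	}
	p.im = afterHeadIM // anything else: switch mode and reprocess the token
	return false
}

func afterHeadIM(p *parser, tok string) bool {
	switch tok {
	case "<meta>", "<link>":
		// Use the rules for "in head" by calling that function directly.
		// No function comparison or patch-up step is needed afterwards.
		return inHeadIM(p, tok)
	case "<body>":
		p.note("afterHead", tok)
		p.im = inBodyIM
		return true
	}
	p.im = inBodyIM
	return false
}

func inBodyIM(p *parser, tok string) bool {
	p.note("inBody", tok)
	return true
}

// run feeds each token to the current mode until some mode consumes it.
func run(tokens []string) []string {
	p := &parser{im: inHeadIM}
	for _, tok := range tokens {
		for !p.im(p, tok) {
		}
	}
	return p.trace
}

func main() {
	fmt.Println(run([]string{"<meta>", "</head>", "<meta>", "<body>", "<p>"}))
}
```

In the trace, the second <meta> is handled by the "in head" rules while the parser stays in the "after head" mode, which is why the following <body> is still handled by afterHeadIM.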
Andrew Balholm
3df0512469 html: handle end tags in strange places
Pass tests1.dat, test 111:
</strong></b></em></i></u></strike></s></blink></tt></pre></big></small></font></select></h1></h2></h3></h4></h5></h6></body></br></a></img></title></span></style></script></table></th></td></tr></frame></area></link></param></hr></input></col></base></meta></basefont></bgsound></embed></spacer></p></dd></dt></caption></colgroup></tbody></tfoot></thead></address></blockquote></center></dir></div></dl></fieldset></listing></menu></ol></ul></li></nobr></wbr></form></button></marquee></object></html></frameset></head></iframe></image></isindex></noembed></noframes></noscript></optgroup></option></plaintext></textarea>

| <html>
|   <head>
|   <body>
|     <br>
|     <p>

Also pass all the remaining tests in tests1.dat.

R=nigeltao
CC=golang-dev
https://golang.org/cl/5372066
2011-11-12 12:23:30 +11:00
Andrew Balholm
0a61c846ef html: ignore <col> tag outside tables
Pass tests1.dat, test 109:
<table><col><tbody><col><tr><col><td><col></table><col>

| <html>
|   <head>
|   <body>
|     <table>
|       <colgroup>
|         <col>
|       <tbody>
|       <colgroup>
|         <col>
|       <tbody>
|         <tr>
|       <colgroup>
|         <col>
|       <tbody>
|         <tr>
|           <td>
|       <colgroup>
|         <col>

Also pass test 110:
<table><colgroup><tbody><colgroup><tr><colgroup><td><colgroup></table><colgroup>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5369069
2011-11-11 21:44:01 +11:00
Andrew Balholm
83f61a27d6 html: parse column groups
Pass tests1.dat, test 108:
<table><colgroup><col><colgroup><col><col><col><colgroup><col><col><thead><tr><td></table>

| <html>
|   <head>
|   <body>
|     <table>
|       <colgroup>
|         <col>
|       <colgroup>
|         <col>
|         <col>
|         <col>
|       <colgroup>
|         <col>
|         <col>
|       <thead>
|         <tr>
|           <td>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5369061
2011-11-11 11:41:46 +11:00
Andrew Balholm
e9e874b7fc html: parse framesets
Pass tests1.dat, test 106:
<frameset><frame><frameset><frame></frameset><noframes></noframes></frameset>

| <html>
|   <head>
|   <frameset>
|     <frame>
|     <frameset>
|       <frame>
|     <noframes>

Also pass test 107:
<h1><table><td><h3></table><h3></h1>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5373050
2011-11-10 23:56:13 +11:00
Andrew Balholm
ddc5ec642d html: don't emit text token for empty raw text elements.
Pass tests1.dat, test 99:
<script></script></div><title></title><p><p>

| <html>
|   <head>
|     <script>
|     <title>
|   <body>
|     <p>
|     <p>

Also pass tests through test 105:
<ul><li><ul></li><li>a</li></ul></li></ul>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5373043
2011-11-10 08:09:54 +11:00
Andrew Balholm
820523d091 html: correctly parse </html> in <head> element.
Pass tests1.dat, test 92:
<head></html><meta><p>

| <html>
|   <head>
|   <body>
|     <meta>
|     <p>

Also pass tests through test 98:
<p><b><div><marquee></p></b></div>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5359054
2011-11-09 19:18:26 +11:00
Rob Pike
30aa701fec renaming_2: gofix -r go1pkgrename src/pkg/[a-l]*
R=rsc
CC=golang-dev
https://golang.org/cl/5358041
2011-11-08 15:40:58 -08:00
Rob Pike
6ab6c49fce renaming_1: hand-edited files for go 1 renaming
This contains the files that required handiwork, mostly
Makefiles with updated TARGs, plus the two packages
with modified package names.
html/template/doc.go needs a separate edit pass.
test/fixedbugs/bug358.go is not legal go so gofix fails on it.

R=rsc
CC=golang-dev
https://golang.org/cl/5340050
2011-11-08 15:38:47 -08:00
Andrew Balholm
ce4eec2e0a html: treat <image> as <img>
Pass tests1.dat, test 90:
<p><image></p>

| <html>
|   <head>
|   <body>
|     <p>
|       <img>

Also pass test 91:
<a><table><a></table><p><a><div><a>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5339052
2011-11-09 09:43:55 +11:00
Andrew Balholm
f2b602ed42 html: parse <body>, <base>, <link>, <meta>, and <title> tags inside page body
Pass tests1.dat, test 87:
<body><body><base><link><meta><title><p></title><body><p></body>

| <html>
|   <head>
|   <body>
|     <base>
|     <link>
|     <meta>
|     <title>
|       "<p>"
|     <p>

Handling the last <body> tag requires correcting the original insertion mode in useTheRulesFor.

Also pass test 88:
<textarea><p></textarea>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5364047
2011-11-08 17:55:17 +11:00
Nigel Tao
46ee09eff1 html: fix typo in package docs.
Fixes #2419.

R=dsymonds, rsc
CC=golang-dev
https://golang.org/cl/5352046
2011-11-08 10:09:17 +11:00
Nigel Tao
bbd173fc3d html: be able to test more than one testdata file.
R=andybalholm
CC=golang-dev
https://golang.org/cl/5351041
2011-11-07 09:38:40 +11:00
Mike Samuel
a5291099d2 html/template: wraps package template instead of exposing func Escape
This does escaping on first execution.

template.go defines the same interface elements as package template.
It requires rather more duplication of code than I'd like, but I'm
not clear how to avoid that.

Maybe instead of

    mySet.ParseGlob(...)
    template.ParseSetGlob(...)
    mySet.ParseFiles(...)
    mySet.ParseTemplateFiles(...)
    template.ParseTemplateFiles(...)

we combine these into a fileset abstraction that can be wrapped

    var fileset template.FileSet
    fileset.Glob(...)  // Load a few files by glob
    fileset.Files(...)  // Load a few {{define}}d files
    fileset.TemplateFiles(...)  // Load a few files as template bodies
    fileset.Funcs(...)  // Make the given funcs available to templates
    // Do the parsing.
    set, err := fileset.ParseSet()
    // or set, err := fileset.ParseInto(set)

or provide an interface that can receive filenames and functions and
parse messages:

    type Bundle interface {
      TemplateFile(string)
      File(string)
      Funcs(FuncMap)
    }

and define template.Parse* to handle the file-system stuff and send
messages to a bundle:

    func ParseFiles(b Bundle, filenames ...string)

R=r, r
CC=golang-dev
https://golang.org/cl/5270042
2011-11-04 13:09:21 -04:00
Gustavo Niemeyer
f2dc50b48d html,bzip2,sql: rename Error methods that return error to Err
There are three classes of methods/functions called Error:

a) The Error method in the just introduced error interface
b) Error methods that create or report errors (http.Error, etc)
c) Error methods that return errors previously associated with
   the receiver (Tokenizer.Error, rows.Error, etc).

This CL introduces the convention that methods in case (c)
should be named Err.

The reasoning for the change is:

- The change differentiates the two kinds of APIs based on
  names rather than just on signature, unloading Error a bit
- Err is closer to the err variable name that is so commonly
  used with the intent of verifying an error
- Err is shorter and thus more convenient to be used often
  on error verifications, such as in iterators following the
  convention of the sql package.

R=bradfitz, rsc
CC=golang-dev
https://golang.org/cl/5327064
2011-11-04 09:50:20 -04:00
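The Err convention from f2dc50b48d (case c: a method returning an error previously associated with the receiver) is visible in iterator-style APIs. The sketch below uses bufio.Scanner, which was added to the standard library after this commit and is used here only as an illustration of the same pattern; countLines is a hypothetical helper.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countLines iterates with Scan and checks Err once, after the loop.
func countLines(input string) (int, error) {
	sc := bufio.NewScanner(strings.NewReader(input))
	n := 0
	for sc.Scan() {
		n++
	}
	// Err returns the first non-EOF error hit during scanning, if any.
	if err := sc.Err(); err != nil {
		return n, err
	}
	return n, nil
}

func main() {
	n, err := countLines("a\nb\nc\n")
	fmt.Println(n, err)
}
```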
Andrew Balholm
632a2c59b1 html: properly close <tr> element when a new <tr> starts.
Pass tests1.dat, test 87:
<table><tr><tr><td><td><span><th><span>X</table>

| <html>
|   <head>
|   <body>
|     <table>
|       <tbody>
|         <tr>
|         <tr>
|           <td>
|           <td>
|             <span>
|           <th>
|             <span>
|               "X"

R=nigeltao
CC=golang-dev
https://golang.org/cl/5343041
2011-11-04 15:48:11 +11:00
Andrew Balholm
46308d7d11 html: move <link> element from after <head> into <head>
Pass tests1.dat, test 85:
<head><meta></head><link>

| <html>
|   <head>
|     <meta>
|     <link>
|   <body>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5297079
2011-11-04 09:29:06 +11:00
Vincent Vanackere
eb1717e035 all: rename os.EOF to io.EOF in various non-code contexts
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/5334050
2011-11-03 14:01:30 -07:00
Rob Pike
5cb4a15320 html,log,math: renamings
This is Go 1 package renaming CL #2.
This one merely moves the source; the import strings will be
changed after the next weekly release.

exp/template/html -> html/template
big -> math/big
cmath -> math/cmplx
rand -> math/rand
syslog -> log/syslog

The only edits are in Makefiles and deps.bash.

Note that this CL moves exp/template/html out of exp. I decided
to do that so all the renamings can be done together, even though
the API (and that of template, for that matter) is still fluid.

R=r, rsc
CC=golang-dev
https://golang.org/cl/5332053
2011-11-03 12:42:57 -07:00
Andrew Balholm
77aabbf217 html: parse <link> elements in <head>
Pass tests1.dat, test 83:
<title><meta></title><link><title><meta></title>

| <html>
|   <head>
|     <title>
|       "<meta>"
|     <link>
|     <title>
|       "<meta>"
|   <body>

Also pass test 84:
<style><!--</style><meta><script>--><link></script>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5331061
2011-11-03 17:12:13 +11:00
Andrew Balholm
cf6a712162 html: properly close <marquee> elements.
Pass tests1.dat, test 80:
<a href=a>aa<marquee>aa<a href=b>bb</marquee>aa

| <html>
|   <head>
|   <body>
|     <a>
|       href="a"
|       "aa"
|       <marquee>
|         "aa"
|         <a>
|           href="b"
|           "bb"
|       "aa"

Also pass tests through test 82:
<!DOCTYPE html><spacer>foo

R=nigeltao
CC=golang-dev
https://golang.org/cl/5319071
2011-11-03 10:11:06 +11:00
Russ Cox
c2049d2dfe src/pkg/[a-m]*: gofix -r error -force=error
R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/5322051
2011-11-01 22:04:37 -04:00
Andrew Balholm
22ee5ae25a html: stop at scope marker node when generating implied </a> tags
An <a> tag generates implied end tags for any open <a> elements.
But it shouldn't do that when it is inside a table cell and the open <a>
is outside the table.
So stop the search for an open <a> when we reach a scope marker node.

Pass tests1.dat, test 78:
<a href="blah">aba<table><tr><td><a href="foo">br</td></tr>x</table>aoe

| <html>
|   <head>
|   <body>
|     <a>
|       href="blah"
|       "abax"
|       <table>
|         <tbody>
|           <tr>
|             <td>
|               <a>
|                 href="foo"
|                 "br"
|       "aoe"

Also pass test 79:
<table><a href="blah">aba<tr><td><a href="foo">br</td></tr>x</table>aoe

R=nigeltao
CC=golang-dev
https://golang.org/cl/5320063
2011-11-02 11:47:05 +11:00
Nigel Tao
90b76c0f3e html: refactor the blacklist for the "render and re-parse" test.
R=andybalholm
CC=golang-dev, mikesamuel
https://golang.org/cl/5331056
2011-11-02 09:42:25 +11:00
Andrew Balholm
9db3f78c39 html: process </td> tags; foster parent at most one node per token
Correctly close table cell when </td> is read.

Because of reconstructing the active formatting elements, more than one
node may be created when reading a single token.
If both nodes are foster parented, they will be siblings, but the first
node should be the parent of the second.

Pass tests1.dat, test 77:
<a href="blah">aba<table><a href="foo">br<tr><td></td></tr>x</table>aoe

| <html>
|   <head>
|   <body>
|     <a>
|       href="blah"
|       "aba"
|       <a>
|         href="foo"
|         "br"
|       <a>
|         href="foo"
|         "x"
|       <table>
|         <tbody>
|           <tr>
|             <td>
|     <a>
|       href="foo"
|       "aoe"

R=nigeltao
CC=golang-dev
https://golang.org/cl/5305074
2011-11-01 11:42:54 +11:00
Andrew Balholm
604e10c34d html: adjust bookmark in "adoption agency" algorithm
In the adoption agency algorithm, the formatting element is sometimes
removed from the list of active formatting elements and reinserted at a later index.
In that case, the bookmark showing where it is to be reinserted needs to be moved,
so that its position relative to its neighbors remains the same
(and also so that it doesn't become out of bounds).

Pass tests1.dat, test 70:
<DIV> abc <B> def <I> ghi <P> jkl </B>

| <html>
|   <head>
|   <body>
|     <div>
|       " abc "
|       <b>
|         " def "
|         <i>
|           " ghi "
|       <i>
|         <p>
|           <b>
|             " jkl "

Also pass tests through test 76:
<test attribute---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------->

R=nigeltao
CC=golang-dev
https://golang.org/cl/5322052
2011-10-29 10:51:59 +11:00
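The bookmark adjustment in 604e10c34d can be sketched on a plain slice of strings. This is an illustration of the index arithmetic only, not the parser's node stack; reinsert is a hypothetical name. When the element is removed from a position before the bookmark, every later index shifts down by one, so the bookmark must shift too.

```go
package main

import "fmt"

// reinsert removes list[from] and inserts it at the position named by
// bookmark, where bookmark is an index into the list before the removal.
func reinsert(list []string, from, bookmark int) []string {
	elem := list[from]
	out := make([]string, 0, len(list))
	out = append(out, list[:from]...)
	out = append(out, list[from+1:]...)
	if from < bookmark {
		bookmark-- // keep the bookmark next to the same neighbors, and in bounds
	}
	out = append(out, "")
	copy(out[bookmark+1:], out[bookmark:])
	out[bookmark] = elem
	return out
}

func main() {
	// Move "a" so that it sits just before "d".
	fmt.Println(reinsert([]string{"a", "b", "c", "d"}, 0, 3))
	// A bookmark at the end of the list would be out of bounds without the adjustment.
	fmt.Println(reinsert([]string{"a", "b", "c", "d"}, 0, 4))
}
```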
Andrew Balholm
03f163c7f2 html: don't run "adoption agency" on elements that aren't in scope.
Pass tests1.dat, test 55:
<!DOCTYPE html><font><table></font></table></font>

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <font>
|       <table>

Also pass tests through test 69:
<DIV> abc <B> def <I> ghi <P> jkl

R=nigeltao
CC=golang-dev
https://golang.org/cl/5309074
2011-10-28 16:04:58 +11:00
Russ Cox
785baa86f1 html: fix print argument in test
R=nigeltao
CC=golang-dev
https://golang.org/cl/5302069
2011-10-27 18:04:29 -07:00
Andrew Balholm
053549ca1b html: allow whitespace text nodes in <head>
Pass tests1.dat, test 50:
<!DOCTYPE html><script> <!-- </script> --> </script> EOF

| <!DOCTYPE html>
| <html>
|   <head>
|     <script>
|       " <!-- "
|     " "
|   <body>
|     "-->  EOF"

Also pass tests through test 54:
<!DOCTYPE html><title>U-test</title><body><div><p>Test<u></p></div></body>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5311066
2011-10-28 09:06:30 +11:00
Andrew Balholm
833fb4198d html: parse <style> elements inside <head> element.
Also correctly handle EOF inside a <style> element.

Pass tests1.dat, test 49:
<!DOCTYPE html><style> EOF

| <!DOCTYPE html>
| <html>
|   <head>
|     <style>
|       " EOF"
|   <body>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5321057
2011-10-27 10:26:11 +11:00
Andrew Balholm
bd07e4f259 html: close <option> element when opening <optgroup>
Pass tests1.dat, test 34:
<!DOCTYPE html>A<option>B<optgroup>C<select>D</option>E

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     "A"
|     <option>
|       "B"
|     <optgroup>
|       "C"
|       <select>
|         "DE"

Also passes tests 35-48. Test 48 is:
</ COM--MENT >

R=nigeltao
CC=golang-dev
https://golang.org/cl/5311063
2011-10-27 09:45:53 +11:00
Russ Cox
db33959797 cgo, goyacc, go/build, html, http, path, path/filepath, testing/quick, test: use rune
Nothing terribly interesting here.

R=golang-dev, bradfitz, gri, r
CC=golang-dev
https://golang.org/cl/5300043
2011-10-25 22:20:02 -07:00
Andrew Balholm
05ed18f4f6 html: improve parsing of lists
Make a <li> tag close the previous <li> element.
Make a </ul> tag close <li> elements.

Pass tests1.dat, test 33:
<!DOCTYPE html><li>hello<li>world<ul>how<li>do</ul>you</body><!--do-->

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <li>
|       "hello"
|     <li>
|       "world"
|       <ul>
|         "how"
|         <li>
|           "do"
|       "you"
|   <!-- do -->

R=nigeltao
CC=golang-dev
https://golang.org/cl/5321051
2011-10-26 14:02:30 +11:00
Andrew Balholm
6e318bda6c html: improve parsing of tables
When foster parenting, merge adjacent text nodes.
Properly close table row at </tr> tag.

Pass tests1.dat, test 32:
<!-----><font><div>hello<table>excite!<b>me!<th><i>please!</tr><!--X-->

| <!-- - -->
| <html>
|   <head>
|   <body>
|     <font>
|       <div>
|         "helloexcite!"
|         <b>
|           "me!"
|         <table>
|           <tbody>
|             <tr>
|               <th>
|                 <i>
|                   "please!"
|             <!-- X -->

R=nigeltao
CC=golang-dev
https://golang.org/cl/5323048
2011-10-26 11:36:46 +11:00
Nigel Tao
18b025d530 html: remove the Tokenizer.ReturnComments option.
The original intention was to simplify the parser, in making it skip
all comment tokens. However, checking that the Go html package is
100% compatible with the WebKit HTML test suite requires parsing the
comments. There is no longer any real benefit for the option.

R=gri, andybalholm
CC=golang-dev
https://golang.org/cl/5321043
2011-10-25 11:28:07 +11:00
Andrew Balholm
2f3f3aa2ed html: dump attributes when running parser tests.
The WebKit test data shows attributes as though they were child nodes:

<a X>0<b>1<a Y>2
dumps as:
| <html>
|   <head>
|   <body>
|     <a>
|       x=""
|       "0"
|       <b>
|         "1"
|     <b>
|       <a>
|         y=""
|         "2"

So we need to do the same when dumping a tree to compare with it.

R=nigeltao
CC=golang-dev
https://golang.org/cl/5322044
2011-10-25 09:33:15 +11:00
Andrew Balholm
2aa589c843 html: implement foster parenting
Implement the foster-parenting algorithm for content that is inside a table
but not in a cell.

Also fix a bug in reconstructing the active formatting elements.

Pass test 30 in tests1.dat:
<a><table><td><a><table></table><a></tr><a></table><b>X</b>C<a>Y

R=nigeltao
CC=golang-dev
https://golang.org/cl/5309052
2011-10-23 18:36:01 +11:00
Nigel Tao
2f352ae48a html: parse <select> tags.
The additional test case in parse_test.go is:
<select><b><option><select><option></b></select>X

R=andybalholm
CC=golang-dev
https://golang.org/cl/5293051
2011-10-22 20:18:12 +11:00
Nigel Tao
64306c9fd0 html: parse and render comment nodes.
The first additional test case in parse_test.go is:
<!--><div>--<!-->

The second one is unrelated to the comment change, but also passes:
<p><hr></p>

R=andybalholm
CC=golang-dev
https://golang.org/cl/5299047
2011-10-20 11:45:30 +11:00
Nigel Tao
b1fd528db5 html: parse raw text and RCDATA elements, such as <script> and <title>.
Pass tests1.dat, test 26:
#data
<script><div></script></div><title><p></title><p><p>
#document
| <html>
|   <head>
|     <script>
|       "<div>"
|     <title>
|       "<p>"
|   <body>
|     <p>
|     <p>

Thanks to Andy Balholm for driving this change.

R=andybalholm
CC=golang-dev
https://golang.org/cl/5301042
2011-10-19 08:03:30 +11:00
Nigel Tao
e5f3dc8bc5 html: refactor the tokenizer; parse "</>" correctly.
Previously, Next would call either nextText or nextTag, but nextTag
could also call nextText. Both nextText and nextTag were responsible
for detecting "</a" end tags and "<!" comments. This change simplifies
the call chain and puts that responsibility in a single place.

R=andybalholm
CC=golang-dev
https://golang.org/cl/5263050
2011-10-18 09:42:16 +11:00
Nigel Tao
1887907fee html: tokenize "a < b" as one whole text token.
R=andybalholm
CC=golang-dev
https://golang.org/cl/5284042
2011-10-16 20:50:11 +11:00
Andrew Balholm
b770c9e9a2 html: improve parsing of comments and "bogus comments"
R=nigeltao
CC=golang-dev
https://golang.org/cl/5279044
2011-10-15 12:22:08 +11:00
Nigel Tao
b82a8e7c22 html: fix some tokenizer bugs with attribute key/values.
The relevant spec sections are 13.2.4.38-13.2.4.40.
http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#attribute-value-(double-quoted)-state

R=andybalholm
CC=golang-dev
https://golang.org/cl/5262044
2011-10-14 15:22:02 +11:00
Nigel Tao
a49b8b9875 html: rewrite the tokenizer to be more consistent.
Previously, the tokenizer made two passes per token. The first pass
established the token boundary. The second pass picked out the tag name
and attributes inside that boundary. This was problematic when the two
passes disagreed. For example, "<p id=can't><p id=won't>" caused an
infinite loop because the first pass skipped everything inside the
single quotes, and recognized only one token, but the second pass never
got past the first '>'.

This change rewrites the tokenizer to use one pass, accumulating the
boundary points of token text, tag names, attribute keys and attribute
values as it looks for the token endpoint.

It should still be reasonably efficient: text, names, keys and values
are not lower-cased or unescaped (and converted from []byte to string)
until asked for.

One of the token_test test cases was fixed to be consistent with
html5lib. Three more test cases were temporarily disabled, and will be
re-enabled in a follow-up CL. All the parse_test test cases pass.

R=andybalholm, gri
CC=golang-dev
https://golang.org/cl/5244061
2011-10-14 09:58:39 +11:00
Andrew Balholm
c64e8e327e html: insert implied <p> and </p> tags
(test # 25 in tests1.dat)
#data
<p><b><div><marquee></p></b></div>X
#document
| <html>
|   <head>
|   <body>
|     <p>
|       <b>
|     <div>
|       <b>
|         <marquee>
|           <p>
|           "X"

R=nigeltao
CC=golang-dev
https://golang.org/cl/5254060
2011-10-13 12:40:48 +11:00
Nigel Tao
85368292a3 html: when a parse test fails, don't bother testing rendering.
R=andybalholm
CC=golang-dev
https://golang.org/cl/5248061
2011-10-13 11:53:15 +11:00
Nigel Tao
be8b4d943f html: add a Render function.
R=mikesamuel, andybalholm
CC=golang-dev
https://golang.org/cl/5218041
2011-10-10 14:44:37 +11:00
Nigel Tao
bca65e395e html: parse more malformed tags.
This continues the work in revision 914a659b44ff, now passing more test
cases. As before, the new tokenization tests match html5lib's behavior.

Fixes #2124.

R=dsymonds, r
CC=golang-dev
https://golang.org/cl/4867042
2011-08-11 18:49:09 +10:00
Nigel Tao
37afff2978 html: parse malformed tags missing a '>', such as <p id=0</p>.
The additional token_test.go cases match html5lib behavior.

Fixes #2124.

R=gri
CC=golang-dev
https://golang.org/cl/4844055
2011-08-10 13:39:07 +10:00
Nigel Tao
1d0c141d7d html: parse doctype tokens; merge adjacent text nodes.
The test case input is "<!DOCTYPE html><span><button>foo</span>bar".
The correct parse is:
| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <span>
|       <button>
|         "foobar"
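
A minimal sketch of merging adjacent text nodes, assuming a hypothetical node type (not the package's Node): when new text arrives and the last child is already a text node, the text is appended to it rather than stored as a second node. In the test case above, "foo" and "bar" both land inside <button>, so they end up as one "foobar" node.

```go
package main

import "fmt"

// node is a hypothetical, simplified tree node. A text node has an empty
// tag and carries its content in text.
type node struct {
	tag      string
	text     string
	children []*node
}

// addText appends s to n, merging it into the last child when that child
// is already a text node.
func (n *node) addText(s string) {
	if k := len(n.children); k > 0 && n.children[k-1].tag == "" {
		n.children[k-1].text += s
		return
	}
	n.children = append(n.children, &node{text: s})
}

func main() {
	button := &node{tag: "button"}
	button.addText("foo")
	button.addText("bar")
	fmt.Printf("%d child: %q\n", len(button.children), button.children[0].text)
}
```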

R=gri
CC=golang-dev
https://golang.org/cl/4794063
2011-08-01 10:26:46 +10:00
Nigel Tao
5f134f9b5b html: sync html/testdata/webkit with upstream WebKit.
As $GOROOT/src/pkg/html/testdata/webkit/README says, we're pulling from
$WEBKITROOT/LayoutTests/html5lib/resources.

R=r
CC=golang-dev
https://golang.org/cl/4810043
2011-07-21 12:50:45 +10:00
Nigel Tao
5a141064ed html: parse misnested formatting tags according to the HTML5 spec.
This is the "adoption agency" algorithm.

The test case input is "<a><p>X<a>Y</a>Z</p></a>". The correct parse is:
| <html>
|   <head>
|   <body>
|     <a>
|     <p>
|       <a>
|         "X"
|       <a>
|         "Y"
|       "Z"

R=gri
CC=golang-dev
https://golang.org/cl/4771042
2011-07-21 11:20:54 +10:00
Andrew Balholm
816c972ff0 html: handle character entities without semicolons
Fix the TODO: unescape("&notit;") should be "¬it;"

Also accept digits in entity names.

R=nigeltao
CC=golang-dev, rsc
https://golang.org/cl/4781042
2011-07-21 09:10:49 +10:00
Nigel Tao
d360e0213d html: update section references in comments to the latest HTML5 spec.
R=r
CC=golang-dev
https://golang.org/cl/4699048
2011-07-13 16:53:02 +10:00
Yasuhiro Matsumoto
1e6d946594 html: parse start tags that aren't explicitly otherwise dealt with.
R=golang-dev, nigeltao
CC=golang-dev
https://golang.org/cl/4626080
2011-07-06 13:08:52 +10:00
Yasuhiro Matsumoto
054cf72b56 html: fix nesting when parsing a close tag.
R=nigeltao
CC=golang-dev
https://golang.org/cl/4636067
2011-06-30 23:16:33 +10:00
Rob Pike
ebb1566a46 strings.Split: make the default to split all.
Change the signature of Split to have no count,
assuming a full split, and rename the existing
Split with a count to SplitN.
Do the same to package bytes.
Add a gofix module.

R=adg, dsymonds, alex.brainman, rsc
CC=golang-dev
https://golang.org/cl/4661051
2011-06-28 09:43:14 +10:00
Brad Fitzpatrick
5e03143c1a html: improve attribute parsing, note package status
Fixes #1890

R=nigeltao
CC=golang-dev
https://golang.org/cl/4528102
2011-06-06 15:56:15 -07:00