1
0
mirror of https://github.com/golang/go synced 2024-10-05 12:21:22 -06:00
Commit Graph

22 Commits

Author SHA1 Message Date
Andrew Balholm
604e10c34d html: adjust bookmark in "adoption agency" algorithm
In the adoption agency algorithm, the formatting element is sometimes
removed from the list of active formatting elements and reinserted at a later index.
In that case, the bookmark showing where it is to be reinserted needs to be moved,
so that its position relative to its neighbors remains the same
(and also so that it doesn't become out of bounds).

Pass tests1.dat, test 70:
<DIV> abc <B> def <I> ghi <P> jkl </B>

| <html>
|   <head>
|   <body>
|     <div>
|       " abc "
|       <b>
|         " def "
|         <i>
|           " ghi "
|       <i>
|         <p>
|           <b>
|             " jkl "

Also pass tests through test 76:
<test attribute---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------->

R=nigeltao
CC=golang-dev
https://golang.org/cl/5322052
2011-10-29 10:51:59 +11:00
Andrew Balholm
03f163c7f2 html: don't run "adoption agency" on elements that aren't in scope.
Pass tests1.dat, test 55:
<!DOCTYPE html><font><table></font></table></font>

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <font>
|       <table>

Also pass tests through test 69:
<DIV> abc <B> def <I> ghi <P> jkl

R=nigeltao
CC=golang-dev
https://golang.org/cl/5309074
2011-10-28 16:04:58 +11:00
Andrew Balholm
053549ca1b html: allow whitespace text nodes in <head>
Pass tests1.dat, test 50:
<!DOCTYPE html><script> <!-- </script> --> </script> EOF

| <!DOCTYPE html>
| <html>
|   <head>
|     <script>
|       " <!-- "
|     " "
|   <body>
|     "-->  EOF"

Also pass tests through test 54:
<!DOCTYPE html><title>U-test</title><body><div><p>Test<u></p></div></body>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5311066
2011-10-28 09:06:30 +11:00
Andrew Balholm
833fb4198d html: parse <style> elements inside <head> element.
Also correctly handle EOF inside a <style> element.

Pass tests1.dat, test 49:
<!DOCTYPE html><style> EOF

| <!DOCTYPE html>
| <html>
|   <head>
|     <style>
|       " EOF"
|   <body>

R=nigeltao
CC=golang-dev
https://golang.org/cl/5321057
2011-10-27 10:26:11 +11:00
Andrew Balholm
bd07e4f259 html: close <option> element when opening <optgroup>
Pass tests1.dat, test 34:
<!DOCTYPE html>A<option>B<optgroup>C<select>D</option>E

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     "A"
|     <option>
|       "B"
|     <optgroup>
|       "C"
|       <select>
|         "DE"

Also passes tests 35-48. Test 48 is:
</ COM--MENT >

R=nigeltao
CC=golang-dev
https://golang.org/cl/5311063
2011-10-27 09:45:53 +11:00
Andrew Balholm
05ed18f4f6 html: improve parsing of lists
Make a <li> tag close the previous <li> element.
Make a </ul> tag close <li> elements.

Pass tests1.dat, test 33:
<!DOCTYPE html><li>hello<li>world<ul>how<li>do</ul>you</body><!--do-->

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <li>
|       "hello"
|     <li>
|       "world"
|       <ul>
|         "how"
|         <li>
|           "do"
|       "you"
|   <!-- do -->

R=nigeltao
CC=golang-dev
https://golang.org/cl/5321051
2011-10-26 14:02:30 +11:00
Andrew Balholm
6e318bda6c html: improve parsing of tables
When foster parenting, merge adjacent text nodes.
Properly close table row at </tr> tag.

Pass tests1.dat, test 32:
<!-----><font><div>hello<table>excite!<b>me!<th><i>please!</tr><!--X-->

| <!-- - -->
| <html>
|   <head>
|   <body>
|     <font>
|       <div>
|         "helloexcite!"
|         <b>
|           "me!"
|         <table>
|           <tbody>
|             <tr>
|               <th>
|                 <i>
|                   "please!"
|             <!-- X -->

R=nigeltao
CC=golang-dev
https://golang.org/cl/5323048
2011-10-26 11:36:46 +11:00
Nigel Tao
18b025d530 html: remove the Tokenizer.ReturnComments option.
The original intention was to simplify the parser, in making it skip
all comment tokens. However, checking that the Go html package is
100% compatible with the WebKit HTML test suite requires parsing the
comments. There is no longer any real benefit for the option.

R=gri, andybalholm
CC=golang-dev
https://golang.org/cl/5321043
2011-10-25 11:28:07 +11:00
Andrew Balholm
2aa589c843 html: implement foster parenting
Implement the foster-parenting algorithm for content that is inside a table
but not in a cell.

Also fix a bug in reconstructing the active formatting elements.

Pass test 30 in tests1.dat:
<a><table><td><a><table></table><a></tr><a></table><b>X</b>C<a>Y

R=nigeltao
CC=golang-dev
https://golang.org/cl/5309052
2011-10-23 18:36:01 +11:00
Nigel Tao
2f352ae48a html: parse <select> tags.
The additional test case in parse_test.go is:
<select><b><option><select><option></b></select>X

R=andybalholm
CC=golang-dev
https://golang.org/cl/5293051
2011-10-22 20:18:12 +11:00
Nigel Tao
64306c9fd0 html: parse and render comment nodes.
The first additional test case in parse_test.go is:
<!--><div>--<!-->

The second one is unrelated to the comment change, but also passes:
<p><hr></p>

R=andybalholm
CC=golang-dev
https://golang.org/cl/5299047
2011-10-20 11:45:30 +11:00
Nigel Tao
b1fd528db5 html: parse raw text and RCDATA elements, such as <script> and <title>.
Pass tests1.dat, test 26:
#data
<script><div></script></div><title><p></title><p><p>
#document
| <html>
|   <head>
|     <script>
|       "<div>"
|     <title>
|       "<p>"
|   <body>
|     <p>
|     <p>

Thanks to Andy Balholm for driving this change.

R=andybalholm
CC=golang-dev
https://golang.org/cl/5301042
2011-10-19 08:03:30 +11:00
Andrew Balholm
c64e8e327e html: insert implied <p> and </p> tags
(test # 25 in tests1.dat)
#data
<p><b><div></p></b></div>X
#document
| <html>
|   <head>
|   <body>
|     <p>
|       <b>
|     <div>
|       <b>
|
|           <p>
|           "X"

R=nigeltao
CC=golang-dev
https://golang.org/cl/5254060
2011-10-13 12:40:48 +11:00
Nigel Tao
1d0c141d7d html: parse doctype tokens; merge adjacent text nodes.
The test case input is "<!DOCTYPE html><span><button>foo</span>bar".
The correct parse is:
| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <span>
|       <button>
|         "foobar"

R=gri
CC=golang-dev
https://golang.org/cl/4794063
2011-08-01 10:26:46 +10:00
Nigel Tao
5a141064ed html: parse misnested formatting tags according to the HTML5 spec.
This is the "adoption agency" algorithm.

The test case input is "<a><p>X<a>Y</a>Z</p></a>". The correct parse is:
| <html>
|   <head>
|   <body>
|     <a>
|     <p>
|       <a>
|         "X"
|       <a>
|         "Y"
|       "Z"

R=gri
CC=golang-dev
https://golang.org/cl/4771042
2011-07-21 11:20:54 +10:00
Nigel Tao
d360e0213d html: update section references in comments to the latest HTML5 spec.
R=r
CC=golang-dev
https://golang.org/cl/4699048
2011-07-13 16:53:02 +10:00
Yasuhiro Matsumoto
1e6d946594 html: parse start tags that aren't explicitly otherwise dealt with.
R=golang-dev, nigeltao
CC=golang-dev
https://golang.org/cl/4626080
2011-07-06 13:08:52 +10:00
Yasuhiro Matsumoto
054cf72b56 html: fix nesting when parsing a close tag.
R=nigeltao
CC=golang-dev
https://golang.org/cl/4636067
2011-06-30 23:16:33 +10:00
Nigel Tao
fec6ab9726 html: parse "<h1>foo<h2>bar".
R=gri
CC=golang-dev
https://golang.org/cl/3571043
2010-12-15 11:39:56 +11:00
Nigel Tao
71bd053ada html: parse <table><tr><td> tags.
Also, shorten fooInsertionMode to fooIM.

R=gri
CC=golang-dev
https://golang.org/cl/3504042
2010-12-10 12:20:14 +11:00
Nigel Tao
49014c5b12 html: handle unexpected EOF during parsing.
This lets us parse HTML like "<html>foo".

R=gri
CC=golang-dev
https://golang.org/cl/3460043
2010-12-08 08:59:20 +11:00
Nigel Tao
08a47d6f60 html: first cut at a parser.
R=gri
CC=golang-dev
https://golang.org/cl/3355041
2010-12-07 12:02:36 +11:00