  1. <!DOCTYPE html>
  2. <html>
  3. <head>
  4. <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  5. <meta name="viewport" content="width=device-width, initial-scale=1">
  6. <meta name="theme-color" content="#375EAB">
  7. <title>html - The Go Programming Language</title>
  8. <link type="text/css" rel="stylesheet" href="../../../../../lib/godoc/style.css">
  9. <link rel="stylesheet" href="../../../../../lib/godoc/jquery.treeview.css">
  10. <script type="text/javascript">window.initFuncs = [];</script>
  11. </head>
  12. <body>
  13. <div id='lowframe' style="position: fixed; bottom: 0; left: 0; height: 0; width: 100%; border-top: thin solid grey; background-color: white; overflow: auto;">
  14. ...
  15. </div><!-- #lowframe -->
  16. <div id="topbar" class="wide"><div class="container">
  17. <div class="top-heading" id="heading-wide"><a href="http://localhost:6060/">The Go Programming Language</a></div>
  18. <div class="top-heading" id="heading-narrow"><a href="http://localhost:6060/">Go</a></div>
  19. <a href="index.html#" id="menu-button"><span id="menu-button-arrow">&#9661;</span></a>
  20. <form method="GET" action="http://localhost:6060/search">
  21. <div id="menu">
  22. <a href="http://localhost:6060/doc/">Documents</a>
  23. <a href="http://localhost:6060/pkg/">Packages</a>
  24. <a href="http://localhost:6060/project/">The Project</a>
  25. <a href="http://localhost:6060/help/">Help</a>
  26. <a href="http://localhost:6060/blog/">Blog</a>
  27. <input type="text" id="search" name="q" class="inactive" value="Search" placeholder="Search">
  28. </div>
  29. </form>
  30. </div></div>
  31. <div id="page" class="wide">
  32. <div class="container">
  33. <h1>Package html</h1>
  34. <div id="nav"></div>
  35. <!--
  36. Copyright 2009 The Go Authors. All rights reserved.
  37. Use of this source code is governed by a BSD-style
  38. license that can be found in the LICENSE file.
  39. -->
  40. <!--
  41. Note: Static (i.e., not template-generated) href and id
  42. attributes start with "pkg-" to make it impossible for
  43. them to conflict with generated attributes (some of which
  44. correspond to Go identifiers).
  45. -->
  46. <script type='text/javascript'>
  47. document.ANALYSIS_DATA = null;
  48. document.CALLGRAPH = null;
  49. </script>
  50. <div id="short-nav">
  51. <dl>
  52. <dd><code>import "golang.org/x/net/html"</code></dd>
  53. </dl>
  54. <dl>
  55. <dd><a href="index.html#pkg-overview" class="overviewLink">Overview</a></dd>
  56. <dd><a href="index.html#pkg-index" class="indexLink">Index</a></dd>
  57. <dd><a href="index.html#pkg-examples" class="examplesLink">Examples</a></dd>
  58. <dd><a href="index.html#pkg-subdirectories">Subdirectories</a></dd>
  59. </dl>
  60. </div>
  61. <!-- The package's Name is printed as title by the top-level template -->
  62. <div id="pkg-overview" class="toggleVisible">
  63. <div class="collapsed">
  64. <h2 class="toggleButton" title="Click to show Overview section">Overview ▹</h2>
  65. </div>
  66. <div class="expanded">
  67. <h2 class="toggleButton" title="Click to hide Overview section">Overview ▾</h2>
  68. <p>
  69. Package html implements an HTML5-compliant tokenizer and parser.
  70. </p>
  71. <p>
  72. Tokenization is done by creating a Tokenizer for an io.Reader r. It is the
  73. caller&#39;s responsibility to ensure that r provides UTF-8 encoded HTML.
  74. </p>
  75. <pre>z := html.NewTokenizer(r)
  76. </pre>
  77. <p>
  78. Given a Tokenizer z, the HTML is tokenized by repeatedly calling z.Next(),
  79. which parses the next token and returns its type, or an error:
  80. </p>
  81. <pre>for {
  82. tt := z.Next()
  83. if tt == html.ErrorToken {
  84. // ...
  85. return ...
  86. }
  87. // Process the current token.
  88. }
  89. </pre>
  90. <p>
  91. There are two APIs for retrieving the current token. The high-level API is to
  92. call Token; the low-level API is to call Text or TagName / TagAttr. Both APIs
  93. allow optionally calling Raw after Next but before Token, Text, TagName, or
  94. TagAttr. In EBNF notation, the valid call sequence per token is:
  95. </p>
  96. <pre>Next {Raw} [ Token | Text | TagName {TagAttr} ]
  97. </pre>
  98. <p>
  99. Token returns an independent data structure that completely describes a token.
  100. Entities (such as &#34;&amp;lt;&#34;) are unescaped, tag names and attribute keys are
  101. lower-cased, and attributes are collected into a []Attribute. For example:
  102. </p>
  103. <pre>for {
  104. if z.Next() == html.ErrorToken {
  105. // Returning io.EOF indicates success.
  106. return z.Err()
  107. }
  108. emitToken(z.Token())
  109. }
  110. </pre>
  111. <p>
  112. The low-level API performs fewer allocations and copies, but the contents of
  113. the []byte values returned by Text, TagName and TagAttr may change on the next
  114. call to Next. For example, to extract an HTML page&#39;s anchor text:
  115. </p>
  116. <pre>depth := 0
  117. for {
  118. tt := z.Next()
  119. switch tt {
  120. case html.ErrorToken:
  121. return z.Err()
  122. case html.TextToken:
  123. if depth &gt; 0 {
  124. // emitBytes should copy the []byte it receives,
  125. // if it doesn&#39;t process it immediately.
  126. emitBytes(z.Text())
  127. }
  128. case html.StartTagToken, html.EndTagToken:
  129. tn, _ := z.TagName()
  130. if len(tn) == 1 &amp;&amp; tn[0] == &#39;a&#39; {
  131. if tt == html.StartTagToken {
  132. depth++
  133. } else {
  134. depth--
  135. }
  136. }
  137. }
  138. }
  139. </pre>
  140. <p>
  141. Parsing is done by calling Parse with an io.Reader, which returns the root of
  142. the parse tree (the document element) as a *Node. It is the caller&#39;s
  143. responsibility to ensure that the Reader provides UTF-8 encoded HTML. For
  144. example, to process each anchor node in depth-first order:
  145. </p>
  146. <pre>doc, err := html.Parse(r)
  147. if err != nil {
  148. // ...
  149. }
  150. var f func(*html.Node)
  151. f = func(n *html.Node) {
  152. if n.Type == html.ElementNode &amp;&amp; n.Data == &#34;a&#34; {
  153. // Do something with n...
  154. }
  155. for c := n.FirstChild; c != nil; c = c.NextSibling {
  156. f(c)
  157. }
  158. }
  159. f(doc)
  160. </pre>
  161. <p>
  162. The relevant specifications include:
  163. <a href="https://html.spec.whatwg.org/multipage/syntax.html">https://html.spec.whatwg.org/multipage/syntax.html</a> and
  164. <a href="https://html.spec.whatwg.org/multipage/syntax.html#tokenization">https://html.spec.whatwg.org/multipage/syntax.html#tokenization</a>
  165. </p>
  166. </div>
  167. </div>
  168. <div id="pkg-index" class="toggleVisible">
  169. <div class="collapsed">
  170. <h2 class="toggleButton" title="Click to show Index section">Index ▹</h2>
  171. </div>
  172. <div class="expanded">
  173. <h2 class="toggleButton" title="Click to hide Index section">Index ▾</h2>
  174. <!-- Table of contents for API; must be named manual-nav to turn off auto nav. -->
  175. <div id="manual-nav">
  176. <dl>
  177. <dd><a href="index.html#pkg-variables">Variables</a></dd>
  178. <dd><a href="index.html#EscapeString">func EscapeString(s string) string</a></dd>
  179. <dd><a href="index.html#ParseFragment">func ParseFragment(r io.Reader, context *Node) ([]*Node, error)</a></dd>
  180. <dd><a href="index.html#Render">func Render(w io.Writer, n *Node) error</a></dd>
  181. <dd><a href="index.html#UnescapeString">func UnescapeString(s string) string</a></dd>
  182. <dd><a href="index.html#Attribute">type Attribute</a></dd>
  183. <dd><a href="index.html#Node">type Node</a></dd>
  184. <dd>&nbsp; &nbsp; <a href="index.html#Parse">func Parse(r io.Reader) (*Node, error)</a></dd>
  185. <dd>&nbsp; &nbsp; <a href="index.html#Node.AppendChild">func (n *Node) AppendChild(c *Node)</a></dd>
  186. <dd>&nbsp; &nbsp; <a href="index.html#Node.InsertBefore">func (n *Node) InsertBefore(newChild, oldChild *Node)</a></dd>
  187. <dd>&nbsp; &nbsp; <a href="index.html#Node.RemoveChild">func (n *Node) RemoveChild(c *Node)</a></dd>
  188. <dd><a href="index.html#NodeType">type NodeType</a></dd>
  189. <dd><a href="index.html#Token">type Token</a></dd>
  190. <dd>&nbsp; &nbsp; <a href="index.html#Token.String">func (t Token) String() string</a></dd>
  191. <dd><a href="index.html#TokenType">type TokenType</a></dd>
  192. <dd>&nbsp; &nbsp; <a href="index.html#TokenType.String">func (t TokenType) String() string</a></dd>
  193. <dd><a href="index.html#Tokenizer">type Tokenizer</a></dd>
  194. <dd>&nbsp; &nbsp; <a href="index.html#NewTokenizer">func NewTokenizer(r io.Reader) *Tokenizer</a></dd>
  195. <dd>&nbsp; &nbsp; <a href="index.html#NewTokenizerFragment">func NewTokenizerFragment(r io.Reader, contextTag string) *Tokenizer</a></dd>
  196. <dd>&nbsp; &nbsp; <a href="index.html#Tokenizer.AllowCDATA">func (z *Tokenizer) AllowCDATA(allowCDATA bool)</a></dd>
  197. <dd>&nbsp; &nbsp; <a href="index.html#Tokenizer.Buffered">func (z *Tokenizer) Buffered() []byte</a></dd>
  198. <dd>&nbsp; &nbsp; <a href="index.html#Tokenizer.Err">func (z *Tokenizer) Err() error</a></dd>
  199. <dd>&nbsp; &nbsp; <a href="index.html#Tokenizer.Next">func (z *Tokenizer) Next() TokenType</a></dd>
  200. <dd>&nbsp; &nbsp; <a href="index.html#Tokenizer.NextIsNotRawText">func (z *Tokenizer) NextIsNotRawText()</a></dd>
  201. <dd>&nbsp; &nbsp; <a href="index.html#Tokenizer.Raw">func (z *Tokenizer) Raw() []byte</a></dd>
  202. <dd>&nbsp; &nbsp; <a href="index.html#Tokenizer.SetMaxBuf">func (z *Tokenizer) SetMaxBuf(n int)</a></dd>
  203. <dd>&nbsp; &nbsp; <a href="index.html#Tokenizer.TagAttr">func (z *Tokenizer) TagAttr() (key, val []byte, moreAttr bool)</a></dd>
  204. <dd>&nbsp; &nbsp; <a href="index.html#Tokenizer.TagName">func (z *Tokenizer) TagName() (name []byte, hasAttr bool)</a></dd>
  205. <dd>&nbsp; &nbsp; <a href="index.html#Tokenizer.Text">func (z *Tokenizer) Text() []byte</a></dd>
  206. <dd>&nbsp; &nbsp; <a href="index.html#Tokenizer.Token">func (z *Tokenizer) Token() Token</a></dd>
  207. </dl>
  208. </div><!-- #manual-nav -->
  209. <div id="pkg-examples">
  210. <h4>Examples</h4>
  211. <dl>
  212. <dd><a class="exampleLink" href="index.html#example_Parse">Parse</a></dd>
  213. </dl>
  214. </div>
  215. <h4>Package files</h4>
  216. <p>
  217. <span style="font-size:90%">
  218. <a href="http://localhost:6060/src/golang.org/x/net/html/const.go">const.go</a>
  219. <a href="http://localhost:6060/src/golang.org/x/net/html/doc.go">doc.go</a>
  220. <a href="http://localhost:6060/src/golang.org/x/net/html/doctype.go">doctype.go</a>
  221. <a href="http://localhost:6060/src/golang.org/x/net/html/entity.go">entity.go</a>
  222. <a href="http://localhost:6060/src/golang.org/x/net/html/escape.go">escape.go</a>
  223. <a href="http://localhost:6060/src/golang.org/x/net/html/foreign.go">foreign.go</a>
  224. <a href="http://localhost:6060/src/golang.org/x/net/html/node.go">node.go</a>
  225. <a href="http://localhost:6060/src/golang.org/x/net/html/parse.go">parse.go</a>
  226. <a href="http://localhost:6060/src/golang.org/x/net/html/render.go">render.go</a>
  227. <a href="http://localhost:6060/src/golang.org/x/net/html/token.go">token.go</a>
  228. </span>
  229. </p>
  230. </div><!-- .expanded -->
  231. </div><!-- #pkg-index -->
  232. <div id="pkg-callgraph" class="toggle" style="display: none">
  233. <div class="collapsed">
  234. <h2 class="toggleButton" title="Click to show Internal Call Graph section">Internal call graph ▹</h2>
  235. </div> <!-- .collapsed -->
  236. <div class="expanded">
  237. <h2 class="toggleButton" title="Click to hide Internal Call Graph section">Internal call graph ▾</h2>
  238. <p>
  239. In the call graph viewer below, each node
  240. is a function belonging to this package
  241. and its children are the functions it
  242. calls&mdash;perhaps dynamically.
  243. </p>
  244. <p>
  245. The root nodes are the entry points of the
  246. package: functions that may be called from
  247. outside the package.
  248. There may be non-exported or anonymous
  249. functions among them if they are called
  250. dynamically from another package.
  251. </p>
  252. <p>
  253. Click a node to visit that function's source code.
  254. From there you can visit its callers by
  255. clicking its declaring <code>func</code>
  256. token.
  257. </p>
  258. <p>
  259. Functions may be omitted if they were
  260. determined to be unreachable in the
  261. particular programs or tests that were
  262. analyzed.
  263. </p>
  264. <!-- Zero means show all package entry points. -->
  265. <ul style="margin-left: 0.5in" id="callgraph-0" class="treeview"></ul>
  266. </div>
  267. </div> <!-- #pkg-callgraph -->
  268. <h2 id="pkg-variables">Variables</h2>
  269. <pre>var <span id="ErrBufferExceeded">ErrBufferExceeded</span> = <a href="../../../../errors/index.html">errors</a>.<a href="../../../../errors/index.html#New">New</a>(&#34;max buffer exceeded&#34;)</pre>
  270. <p>
  271. ErrBufferExceeded means that the buffering limit was exceeded.
  272. </p>
  273. <h2 id="EscapeString">func <a href="http://localhost:6060/src/golang.org/x/net/html/escape.go?s=5448:5482#L227">EscapeString</a></h2>
  274. <pre>func EscapeString(s <a href="../../../../builtin/index.html#string">string</a>) <a href="../../../../builtin/index.html#string">string</a></pre>
  275. <p>
  276. EscapeString escapes special characters like &#34;&lt;&#34; to become &#34;&amp;lt;&#34;. It
  277. escapes only five such characters: &lt;, &gt;, &amp;, &#39; and &#34;.
  278. UnescapeString(EscapeString(s)) == s always holds, but the converse isn&#39;t
  279. always true.
  280. </p>
  281. <h2 id="ParseFragment">func <a href="http://localhost:6060/src/golang.org/x/net/html/parse.go?s=47851:47914#L2026">ParseFragment</a></h2>
  282. <pre>func ParseFragment(r <a href="../../../../io/index.html">io</a>.<a href="../../../../io/index.html#Reader">Reader</a>, context *<a href="index.html#Node">Node</a>) ([]*<a href="index.html#Node">Node</a>, <a href="../../../../builtin/index.html#error">error</a>)</pre>
  283. <p>
  284. ParseFragment parses a fragment of HTML and returns the nodes that were
  285. found. If the fragment is the InnerHTML for an existing element, pass that
  286. element in context.
  287. </p>
  288. <h2 id="Render">func <a href="http://localhost:6060/src/golang.org/x/net/html/render.go?s=1842:1881#L35">Render</a></h2>
  289. <pre>func Render(w <a href="../../../../io/index.html">io</a>.<a href="../../../../io/index.html#Writer">Writer</a>, n *<a href="index.html#Node">Node</a>) <a href="../../../../builtin/index.html#error">error</a></pre>
  290. <p>
  291. Render renders the parse tree n to the given writer.
  292. </p>
  293. <p>
  294. Rendering is done on a &#39;best effort&#39; basis: calling Parse on the output of
  295. Render will always result in something similar to the original tree, but it
  296. is not necessarily an exact clone unless the original tree was &#39;well-formed&#39;.
  297. &#39;Well-formed&#39; is not easily specified; the HTML5 specification is
  298. complicated.
  299. </p>
  300. <p>
  301. Calling Parse on arbitrary input typically results in a &#39;well-formed&#39; parse
  302. tree. However, it is possible for Parse to yield a &#39;badly-formed&#39; parse tree.
  303. For example, in a &#39;well-formed&#39; parse tree, no &lt;a&gt; element is a child of
  304. another &lt;a&gt; element: parsing &#34;&lt;a&gt;&lt;a&gt;&#34; results in two sibling elements.
  305. Similarly, in a &#39;well-formed&#39; parse tree, no &lt;a&gt; element is a child of a
  306. &lt;table&gt; element: parsing &#34;&lt;p&gt;&lt;table&gt;&lt;a&gt;&#34; results in a &lt;p&gt; with two sibling
  307. children; the &lt;a&gt; is reparented to the &lt;table&gt;&#39;s parent. However, calling
  308. Parse on &#34;&lt;a&gt;&lt;table&gt;&lt;a&gt;&#34; does not return an error, even though the result has an &lt;a&gt;
  309. element with an &lt;a&gt; child and is therefore not &#39;well-formed&#39;.
  310. </p>
  311. <p>
  312. Programmatically constructed trees are typically also &#39;well-formed&#39;, but it
  313. is possible to construct a tree that looks innocuous but, when rendered and
  314. re-parsed, results in a different tree. A simple example is that a solitary
  315. text node would become a tree containing &lt;html&gt;, &lt;head&gt; and &lt;body&gt; elements.
  316. Another example is that the programmatic equivalent of &#34;a&lt;head&gt;b&lt;/head&gt;c&#34;
  317. becomes &#34;&lt;html&gt;&lt;head&gt;&lt;/head&gt;&lt;body&gt;abc&lt;/body&gt;&lt;/html&gt;&#34;.
  318. </p>
  319. <h2 id="UnescapeString">func <a href="http://localhost:6060/src/golang.org/x/net/html/escape.go?s=5911:5947#L241">UnescapeString</a></h2>
  320. <pre>func UnescapeString(s <a href="../../../../builtin/index.html#string">string</a>) <a href="../../../../builtin/index.html#string">string</a></pre>
  321. <p>
  322. UnescapeString unescapes entities like &#34;&amp;lt;&#34; to become &#34;&lt;&#34;. It unescapes a
  323. larger range of entities than EscapeString escapes. For example, &#34;&amp;aacute;&#34;
  324. unescapes to &#34;á&#34;, as do &#34;&amp;#225;&#34; and &#34;&amp;#xE1;&#34;.
  325. UnescapeString(EscapeString(s)) == s always holds, but the converse isn&#39;t
  326. always true.
  327. </p>
  328. <h2 id="Attribute">type <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=1665:1718#L57">Attribute</a></h2>
  329. <pre>type Attribute struct {
  330. Namespace, Key, Val <a href="../../../../builtin/index.html#string">string</a>
  331. }</pre>
  332. <p>
  333. An Attribute is an attribute namespace-key-value triple. Namespace is
  334. non-empty for foreign attributes like xlink, Key is alphabetic (and hence
  335. does not contain escapable characters like &#39;&amp;&#39;, &#39;&lt;&#39; or &#39;&gt;&#39;), and Val is
  336. unescaped (it looks like &#34;a&lt;b&#34; rather than &#34;a&amp;lt;b&#34;).
  337. </p>
  338. <p>
  339. Namespace is only used by the parser, not the tokenizer.
  340. </p>
  341. <h2 id="Node">type <a href="http://localhost:6060/src/golang.org/x/net/html/node.go?s=1230:1414#L28">Node</a></h2>
  342. <pre>type Node struct {
  343. Parent, FirstChild, LastChild, PrevSibling, NextSibling *<a href="index.html#Node">Node</a>
  344. Type <a href="index.html#NodeType">NodeType</a>
  345. DataAtom <a href="atom/index.html">atom</a>.<a href="atom/index.html#Atom">Atom</a>
  346. Data <a href="../../../../builtin/index.html#string">string</a>
  347. Namespace <a href="../../../../builtin/index.html#string">string</a>
  348. Attr []<a href="index.html#Attribute">Attribute</a>
  349. }</pre>
  350. <p>
  351. A Node consists of a NodeType and some Data (tag name for element nodes,
  352. content for text) and is part of a tree of Nodes. Element nodes may also
  353. have a Namespace and contain a slice of Attributes. Data is unescaped, so
  354. that it looks like &#34;a&lt;b&#34; rather than &#34;a&amp;lt;b&#34;. For element nodes, DataAtom
  355. is the atom for Data, or zero if Data is not a known tag name.
  356. </p>
  357. <p>
  358. An empty Namespace implies a &#34;<a href="http://www.w3.org/1999/xhtml">http://www.w3.org/1999/xhtml</a>&#34; namespace.
  359. Similarly, &#34;math&#34; is short for &#34;<a href="http://www.w3.org/1998/Math/MathML">http://www.w3.org/1998/Math/MathML</a>&#34;, and
  360. &#34;svg&#34; is short for &#34;<a href="http://www.w3.org/2000/svg">http://www.w3.org/2000/svg</a>&#34;.
  361. </p>
  362. <h3 id="Parse">func <a href="http://localhost:6060/src/golang.org/x/net/html/parse.go?s=47401:47439#L2006">Parse</a></h3>
  363. <pre>func Parse(r <a href="../../../../io/index.html">io</a>.<a href="../../../../io/index.html#Reader">Reader</a>) (*<a href="index.html#Node">Node</a>, <a href="../../../../builtin/index.html#error">error</a>)</pre>
  364. <p>
  365. Parse returns the parse tree for the HTML from the given Reader.
  366. The input is assumed to be UTF-8 encoded.
  367. </p>
  368. <div id="example_Parse" class="toggle">
  369. <div class="collapsed">
  370. <p class="exampleHeading toggleButton"><span class="text">Example</span></p>
  371. </div>
  372. <div class="expanded">
  373. <p class="exampleHeading toggleButton"><span class="text">Example</span></p>
  374. <p>Code:</p>
  375. <pre class="code">s := `&lt;p&gt;Links:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&#34;foo&#34;&gt;Foo&lt;/a&gt;&lt;li&gt;&lt;a href=&#34;/bar/baz&#34;&gt;BarBaz&lt;/a&gt;&lt;/ul&gt;`
  376. doc, err := html.Parse(strings.NewReader(s))
  377. if err != nil {
  378. log.Fatal(err)
  379. }
  380. var f func(*html.Node)
  381. f = func(n *html.Node) {
  382. if n.Type == html.ElementNode &amp;&amp; n.Data == &#34;a&#34; {
  383. for _, a := range n.Attr {
  384. if a.Key == &#34;href&#34; {
  385. fmt.Println(a.Val)
  386. break
  387. }
  388. }
  389. }
  390. for c := n.FirstChild; c != nil; c = c.NextSibling {
  391. f(c)
  392. }
  393. }
  394. f(doc)
  395. </pre>
  396. <p>Output:</p>
  397. <pre class="output">foo
  398. /bar/baz
  399. </pre>
  400. </div>
  401. </div>
  402. <h3 id="Node.AppendChild">func (*Node) <a href="http://localhost:6060/src/golang.org/x/net/html/node.go?s=2381:2416#L71">AppendChild</a></h3>
  403. <pre>func (n *<a href="index.html#Node">Node</a>) AppendChild(c *<a href="index.html#Node">Node</a>)</pre>
  404. <p>
  405. AppendChild adds a node c as a child of n.
  406. </p>
  407. <p>
  408. It will panic if c already has a parent or siblings.
  409. </p>
  410. <h3 id="Node.InsertBefore">func (*Node) <a href="http://localhost:6060/src/golang.org/x/net/html/node.go?s=1683:1736#L43">InsertBefore</a></h3>
  411. <pre>func (n *<a href="index.html#Node">Node</a>) InsertBefore(newChild, oldChild *<a href="index.html#Node">Node</a>)</pre>
  412. <p>
  413. InsertBefore inserts newChild as a child of n, immediately before oldChild
  414. in the sequence of n&#39;s children. oldChild may be nil, in which case newChild
  415. is appended to the end of n&#39;s children.
  416. </p>
  417. <p>
  418. It will panic if newChild already has a parent or siblings.
  419. </p>
  420. <h3 id="Node.RemoveChild">func (*Node) <a href="http://localhost:6060/src/golang.org/x/net/html/node.go?s=2857:2892#L90">RemoveChild</a></h3>
  421. <pre>func (n *<a href="index.html#Node">Node</a>) RemoveChild(c *<a href="index.html#Node">Node</a>)</pre>
  422. <p>
  423. RemoveChild removes a node c that is a child of n. Afterwards, c will have
  424. no parent and no siblings.
  425. </p>
  426. <p>
  427. It will panic if c&#39;s parent is not n.
  428. </p>
  429. <h2 id="NodeType">type <a href="http://localhost:6060/src/golang.org/x/net/html/node.go?s=253:273#L2">NodeType</a></h2>
  430. <pre>type NodeType <a href="../../../../builtin/index.html#uint32">uint32</a></pre>
  431. <p>
  432. A NodeType is the type of a Node.
  433. </p>
  434. <pre>const (
  435. <span id="ErrorNode">ErrorNode</span> <a href="index.html#NodeType">NodeType</a> = <a href="../../../../builtin/index.html#iota">iota</a>
  436. <span id="TextNode">TextNode</span>
  437. <span id="DocumentNode">DocumentNode</span>
  438. <span id="ElementNode">ElementNode</span>
  439. <span id="CommentNode">CommentNode</span>
  440. <span id="DoctypeNode">DoctypeNode</span>
  441. )</pre>
  442. <h2 id="Token">type <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=2074:2174#L66">Token</a></h2>
  443. <pre>type Token struct {
  444. Type <a href="index.html#TokenType">TokenType</a>
  445. DataAtom <a href="atom/index.html">atom</a>.<a href="atom/index.html#Atom">Atom</a>
  446. Data <a href="../../../../builtin/index.html#string">string</a>
  447. Attr []<a href="index.html#Attribute">Attribute</a>
  448. }</pre>
  449. <p>
  450. A Token consists of a TokenType and some Data (tag name for start and end
  451. tags, content for text, comments and doctypes). A tag Token may also contain
  452. a slice of Attributes. Data is unescaped for all Tokens (it looks like &#34;a&lt;b&#34;
  453. rather than &#34;a&amp;lt;b&#34;). For tag Tokens, DataAtom is the atom for Data, or
  454. zero if Data is not a known tag name.
  455. </p>
  456. <h3 id="Token.String">func (Token) <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=2592:2622#L90">String</a></h3>
  457. <pre>func (t <a href="index.html#Token">Token</a>) String() <a href="../../../../builtin/index.html#string">string</a></pre>
  458. <p>
  459. String returns a string representation of the Token.
  460. </p>
  461. <h2 id="TokenType">type <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=303:324#L8">TokenType</a></h2>
  462. <pre>type TokenType <a href="../../../../builtin/index.html#uint32">uint32</a></pre>
  463. <p>
  464. A TokenType is the type of a Token.
  465. </p>
  466. <pre>const (
  467. <span class="comment">// ErrorToken means that an error occurred during tokenization.</span>
  468. <span id="ErrorToken">ErrorToken</span> <a href="index.html#TokenType">TokenType</a> = <a href="../../../../builtin/index.html#iota">iota</a>
  469. <span class="comment">// TextToken means a text node.</span>
  470. <span id="TextToken">TextToken</span>
  471. <span class="comment">// A StartTagToken looks like &lt;a&gt;.</span>
  472. <span id="StartTagToken">StartTagToken</span>
  473. <span class="comment">// An EndTagToken looks like &lt;/a&gt;.</span>
  474. <span id="EndTagToken">EndTagToken</span>
  475. <span class="comment">// A SelfClosingTagToken looks like &lt;br/&gt;.</span>
  476. <span id="SelfClosingTagToken">SelfClosingTagToken</span>
  477. <span class="comment">// A CommentToken looks like &lt;!--x--&gt;.</span>
  478. <span id="CommentToken">CommentToken</span>
  479. <span class="comment">// A DoctypeToken looks like &lt;!DOCTYPE x&gt;.</span>
  480. <span id="DoctypeToken">DoctypeToken</span>
  481. )</pre>
  482. <h3 id="TokenType.String">func (TokenType) <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=940:974#L31">String</a></h3>
  483. <pre>func (t <a href="index.html#TokenType">TokenType</a>) String() <a href="../../../../builtin/index.html#string">string</a></pre>
  484. <p>
  485. String returns a string representation of the TokenType.
  486. </p>
  487. <h2 id="Tokenizer">type <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=3250:5402#L117">Tokenizer</a></h2>
  488. <pre>type Tokenizer struct {
  489. <span class="comment">// contains filtered or unexported fields</span>
  490. }</pre>
  491. <p>
  492. A Tokenizer returns a stream of HTML Tokens.
  493. </p>
  494. <h3 id="NewTokenizer">func <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=29453:29494#L1185">NewTokenizer</a></h3>
  495. <pre>func NewTokenizer(r <a href="../../../../io/index.html">io</a>.<a href="../../../../io/index.html#Reader">Reader</a>) *<a href="index.html#Tokenizer">Tokenizer</a></pre>
  496. <p>
  497. NewTokenizer returns a new HTML Tokenizer for the given Reader.
  498. The input is assumed to be UTF-8 encoded.
  499. </p>
  500. <h3 id="NewTokenizerFragment">func <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=29900:29968#L1197">NewTokenizerFragment</a></h3>
  501. <pre>func NewTokenizerFragment(r <a href="../../../../io/index.html">io</a>.<a href="../../../../io/index.html#Reader">Reader</a>, contextTag <a href="../../../../builtin/index.html#string">string</a>) *<a href="index.html#Tokenizer">Tokenizer</a></pre>
  502. <p>
  503. NewTokenizerFragment returns a new HTML Tokenizer for the given Reader, for
  504. tokenizing an existing element&#39;s InnerHTML fragment. contextTag is that
  505. element&#39;s tag, such as &#34;div&#34; or &#34;iframe&#34;.
  506. </p>
  507. <p>
  508. For example, how the InnerHTML &#34;a&lt;b&#34; is tokenized depends on whether it is
  509. for a &lt;p&gt; tag or a &lt;script&gt; tag.
  510. </p>
  511. <p>
  512. The input is assumed to be UTF-8 encoded.
  513. </p>
  514. <h3 id="Tokenizer.AllowCDATA">func (*Tokenizer) <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=6364:6411#L178">AllowCDATA</a></h3>
  515. <pre>func (z *<a href="index.html#Tokenizer">Tokenizer</a>) AllowCDATA(allowCDATA <a href="../../../../builtin/index.html#bool">bool</a>)</pre>
  516. <p>
  517. AllowCDATA sets whether or not the tokenizer recognizes &lt;![CDATA[foo]]&gt; as
  518. the text &#34;foo&#34;. The default value is false, which means to recognize it as
  519. a bogus comment &#34;&lt;!-- [CDATA[foo]] --&gt;&#34; instead.
  520. </p>
  521. <p>
  522. Strictly speaking, an HTML5 compliant tokenizer should allow CDATA if and
  523. only if tokenizing foreign content, such as MathML and SVG. However,
  524. tracking foreign-contentness is difficult to do purely in the tokenizer,
  525. as opposed to the parser, due to HTML integration points: an &lt;svg&gt; element
  526. can contain a &lt;foreignObject&gt; that is foreign-to-SVG but not foreign-to-
  527. HTML. For strict compliance with the HTML5 tokenization algorithm, it is the
  528. responsibility of the user of a tokenizer to call AllowCDATA as appropriate.
  529. In practice, if using the tokenizer without caring whether MathML or SVG
  530. CDATA is text or comments, such as tokenizing HTML to find all the anchor
  531. text, it is acceptable to ignore this responsibility.
  532. </p>
  533. <h3 id="Tokenizer.Buffered">func (*Tokenizer) <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=10222:10259#L280">Buffered</a></h3>
  534. <pre>func (z *<a href="index.html#Tokenizer">Tokenizer</a>) Buffered() []<a href="../../../../builtin/index.html#byte">byte</a></pre>
  535. <p>
  536. Buffered returns a slice containing data buffered but not yet tokenized.
  537. </p>
  538. <h3 id="Tokenizer.Err">func (*Tokenizer) <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=8212:8243#L212">Err</a></h3>
  539. <pre>func (z *<a href="index.html#Tokenizer">Tokenizer</a>) Err() <a href="../../../../builtin/index.html#error">error</a></pre>
  540. <p>
  541. Err returns the error associated with the most recent ErrorToken token.
  542. This is typically io.EOF, meaning the end of tokenization.
  543. </p>
  544. <h3 id="Tokenizer.Next">func (*Tokenizer) <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=23371:23407#L940">Next</a></h3>
  545. <pre>func (z *<a href="index.html#Tokenizer">Tokenizer</a>) Next() <a href="index.html#TokenType">TokenType</a></pre>
  546. <p>
  547. Next scans the next token and returns its type.
  548. </p>
  549. <h3 id="Tokenizer.NextIsNotRawText">func (*Tokenizer) <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=8016:8054#L206">NextIsNotRawText</a></h3>
  550. <pre>func (z *<a href="index.html#Tokenizer">Tokenizer</a>) NextIsNotRawText()</pre>
  551. <p>
  552. NextIsNotRawText instructs the tokenizer that the next token should not be
  553. considered as &#39;raw text&#39;. Some elements, such as script and title elements,
  554. normally require the next token after the opening tag to be &#39;raw text&#39; that
  555. has no child elements. For example, tokenizing &#34;&lt;title&gt;a&lt;b&gt;c&lt;/b&gt;d&lt;/title&gt;&#34;
  556. yields a start tag token for &#34;&lt;title&gt;&#34;, a text token for &#34;a&lt;b&gt;c&lt;/b&gt;d&#34;, and
  557. an end tag token for &#34;&lt;/title&gt;&#34;. There are no distinct start tag or end tag
  558. tokens for the &#34;&lt;b&gt;&#34; and &#34;&lt;/b&gt;&#34;.
  559. </p>
  560. <p>
  561. This tokenizer implementation will generally look for raw text at the right
  562. times. Strictly speaking, an HTML5 compliant tokenizer should not look for
  563. raw text if in foreign content: &lt;title&gt; generally needs raw text, but a
  564. &lt;title&gt; inside an &lt;svg&gt; does not. Another example is that a &lt;textarea&gt;
  565. generally needs raw text, but a &lt;textarea&gt; is not allowed as an immediate
  566. child of a &lt;select&gt;; in normal parsing, a &lt;textarea&gt; implies &lt;/select&gt;, but
  567. one cannot close the implicit element when parsing a &lt;select&gt;&#39;s InnerHTML.
  568. Similarly to AllowCDATA, tracking the correct moment to override raw-text-ness
  569. is difficult to do purely in the tokenizer, as opposed to the parser.
  570. For strict compliance with the HTML5 tokenization algorithm, it is the
  571. responsibility of the user of a tokenizer to call NextIsNotRawText as
  572. appropriate. In practice, like AllowCDATA, it is acceptable to ignore this
  573. responsibility for basic usage.
  574. </p>
  575. <p>
  576. Note that this &#39;raw text&#39; concept is different from the one offered by the
  577. Tokenizer.Raw method.
  578. </p>
  579. <h3 id="Tokenizer.Raw">func (*Tokenizer) <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=26094:26126#L1060">Raw</a></h3>
  580. <pre>func (z *<a href="index.html#Tokenizer">Tokenizer</a>) Raw() []<a href="../../../../builtin/index.html#byte">byte</a></pre>
  581. <p>
  582. Raw returns the unmodified text of the current token. Calling Next, Token,
  583. Text, TagName or TagAttr may change the contents of the returned slice.
  584. </p>
  585. <h3 id="Tokenizer.SetMaxBuf">func (*Tokenizer) <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=29285:29321#L1179">SetMaxBuf</a></h3>
  586. <pre>func (z *<a href="index.html#Tokenizer">Tokenizer</a>) SetMaxBuf(n <a href="../../../../builtin/index.html#int">int</a>)</pre>
  587. <p>
  588. SetMaxBuf sets a limit on the amount of data buffered during tokenization.
  589. A value of 0 means unlimited.
  590. </p>
  591. <h3 id="Tokenizer.TagAttr">func (*Tokenizer) <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=28118:28180#L1140">TagAttr</a></h3>
  592. <pre>func (z *<a href="index.html#Tokenizer">Tokenizer</a>) TagAttr() (key, val []<a href="../../../../builtin/index.html#byte">byte</a>, moreAttr <a href="../../../../builtin/index.html#bool">bool</a>)</pre>
  593. <p>
  594. TagAttr returns the lower-cased key and unescaped value of the next unparsed
  595. attribute for the current tag token and whether there are more attributes.
  596. The contents of the returned slices may change on the next call to Next.
  597. </p>
  598. <h3 id="Tokenizer.TagName">func (*Tokenizer) <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=27548:27605#L1124">TagName</a></h3>
  599. <pre>func (z *<a href="index.html#Tokenizer">Tokenizer</a>) TagName() (name []<a href="../../../../builtin/index.html#byte">byte</a>, hasAttr <a href="../../../../builtin/index.html#bool">bool</a>)</pre>
  600. <p>
  601. TagName returns the lower-cased name of a tag token (the `img` out of
  602. `&lt;IMG SRC=&#34;foo&#34;&gt;`) and whether the tag has attributes.
  603. The contents of the returned slice may change on the next call to Next.
  604. </p>
  605. <h3 id="Tokenizer.Text">func (*Tokenizer) <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=26930:26963#L1103">Text</a></h3>
  606. <pre>func (z *<a href="index.html#Tokenizer">Tokenizer</a>) Text() []<a href="../../../../builtin/index.html#byte">byte</a></pre>
  607. <p>
  608. Text returns the unescaped text of a text, comment or doctype token. The
  609. contents of the returned slice may change on the next call to Next.
  610. </p>
  611. <h3 id="Tokenizer.Token">func (*Tokenizer) <a href="http://localhost:6060/src/golang.org/x/net/html/token.go?s=28639:28672#L1156">Token</a></h3>
  612. <pre>func (z *<a href="index.html#Tokenizer">Tokenizer</a>) Token() <a href="index.html#Token">Token</a></pre>
  613. <p>
  614. Token returns the current Token, that is, the one most recently scanned by
  615. Next. The result&#39;s Data and Attr values remain valid after subsequent Next calls.
  616. </p>
  617. <h2 id="pkg-subdirectories">Subdirectories</h2>
  618. <div class="pkg-dir">
  619. <table>
  620. <tr>
  621. <th class="pkg-name">Name</th>
  622. <th class="pkg-synopsis">Synopsis</th>
  623. </tr>
  624. <tr>
  625. <td colspan="2"><a href="../index.html">..</a></td>
  626. </tr>
  627. <tr>
  628. <td class="pkg-name" style="padding-left: 0px;">
  629. <a href="atom/index.html">atom</a>
  630. </td>
  631. <td class="pkg-synopsis">
  632. Package atom provides integer codes (also known as atoms) for a fixed set of frequently occurring HTML strings: tag names and attribute keys such as &#34;p&#34; and &#34;id&#34;.
  633. </td>
  634. </tr>
  635. <tr>
  636. <td class="pkg-name" style="padding-left: 0px;">
  637. <a href="charset/index.html">charset</a>
  638. </td>
  639. <td class="pkg-synopsis">
  640. Package charset provides common text encodings for HTML documents.
  641. </td>
  642. </tr>
  643. </table>
  644. </div>
  645. <div id="footer">
  646. Build version go1.6.<br>
  647. Except as <a href="https://developers.google.com/site-policies#restrictions">noted</a>,
  648. the content of this page is licensed under the
  649. Creative Commons Attribution 3.0 License,
  650. and code is licensed under a <a href="http://localhost:6060/LICENSE">BSD license</a>.<br>
  651. <a href="http://localhost:6060/doc/tos.html">Terms of Service</a> |
  652. <a href="http://www.google.com/intl/en/policies/privacy/">Privacy Policy</a>
  653. </div>
  654. </div><!-- .container -->
  655. </div><!-- #page -->
  656. <!-- TODO(adonovan): load these from <head> using "defer" attribute? -->
  657. <script type="text/javascript" src="../../../../../lib/godoc/jquery.js"></script>
  658. <script type="text/javascript" src="../../../../../lib/godoc/jquery.treeview.js"></script>
  659. <script type="text/javascript" src="../../../../../lib/godoc/jquery.treeview.edit.js"></script>
  660. <script type="text/javascript" src="../../../../../lib/godoc/godocs.js"></script>
  661. </body>
  662. </html>