<!DOCTYPE html> <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1"> <meta name="theme-color" content="#375EAB">
<title>unicode - The Go Programming Language</title>
<link type="text/css" rel="stylesheet" href="../../../../../../lib/godoc/style.css">
<link rel="stylesheet" href="../../../../../../lib/godoc/jquery.treeview.css"> <script type="text/javascript">window.initFuncs = [];</script> </head> <body>
<div id='lowframe' style="position: fixed; bottom: 0; left: 0; height: 0; width: 100%; border-top: thin solid grey; background-color: white; overflow: auto;"> ... </div><!-- #lowframe -->
<div id="topbar" class="wide"><div class="container"> <div class="top-heading" id="heading-wide"><a href="http://localhost:6060/">The Go Programming Language</a></div> <div class="top-heading" id="heading-narrow"><a href="http://localhost:6060/">Go</a></div> <a href="index.html#" id="menu-button"><span id="menu-button-arrow">▽</span></a> <form method="GET" action="http://localhost:6060/search"> <div id="menu"> <a href="http://localhost:6060/doc/">Documents</a> <a href="http://localhost:6060/pkg/">Packages</a> <a href="http://localhost:6060/project/">The Project</a> <a href="http://localhost:6060/help/">Help</a> <a href="http://localhost:6060/blog/">Blog</a>
<input type="text" id="search" name="q" class="inactive" value="Search" placeholder="Search"> </div> </form>
</div></div>
<div id="page" class="wide"> <div class="container">
<h1>Package unicode</h1>
<div id="nav"></div>
<!--
Copyright 2009 The Go Authors. All rights reserved. Use of this source code is governed by a BSD-style license that can be found in the LICENSE file. --> <!--
Note: Static (i.e., not template-generated) href and id attributes start with "pkg-" to make it impossible for them to conflict with generated attributes (some of which correspond to Go identifiers). -->
<script type='text/javascript'> document.ANALYSIS_DATA = null; document.CALLGRAPH = null; </script>
<div id="short-nav"> <dl> <dd><code>import "golang.org/x/text/encoding/unicode"</code></dd> </dl> <dl> <dd><a href="index.html#pkg-overview" class="overviewLink">Overview</a></dd> <dd><a href="index.html#pkg-index" class="indexLink">Index</a></dd> <dd><a href="index.html#pkg-subdirectories">Subdirectories</a></dd> </dl> </div> <!-- The package's Name is printed as title by the top-level template --> <div id="pkg-overview" class="toggleVisible"> <div class="collapsed"> <h2 class="toggleButton" title="Click to show Overview section">Overview ▹</h2> </div> <div class="expanded"> <h2 class="toggleButton" title="Click to hide Overview section">Overview ▾</h2> <p> Package unicode provides Unicode encodings such as UTF-16. </p>
</div> </div>
<div id="pkg-index" class="toggleVisible"> <div class="collapsed"> <h2 class="toggleButton" title="Click to show Index section">Index ▹</h2> </div> <div class="expanded"> <h2 class="toggleButton" title="Click to hide Index section">Index ▾</h2>
<!-- Table of contents for API; must be named manual-nav to turn off auto nav. --> <div id="manual-nav"> <dl> <dd><a href="index.html#pkg-variables">Variables</a></dd> <dd><a href="index.html#BOMOverride">func BOMOverride(fallback transform.Transformer) transform.Transformer</a></dd> <dd><a href="index.html#UTF16">func UTF16(e Endianness, b BOMPolicy) encoding.Encoding</a></dd> <dd><a href="index.html#BOMPolicy">type BOMPolicy</a></dd> <dd><a href="index.html#Endianness">type Endianness</a></dd> </dl> </div><!-- #manual-nav -->
<h4>Package files</h4> <p> <span style="font-size:90%"> <a href="http://localhost:6060/src/golang.org/x/text/encoding/unicode/override.go">override.go</a> <a href="http://localhost:6060/src/golang.org/x/text/encoding/unicode/unicode.go">unicode.go</a> </span> </p> </div><!-- .expanded --> </div><!-- #pkg-index -->
<div id="pkg-callgraph" class="toggle" style="display: none"> <div class="collapsed"> <h2 class="toggleButton" title="Click to show Internal Call Graph section">Internal call graph ▹</h2> </div> <!-- .expanded --> <div class="expanded"> <h2 class="toggleButton" title="Click to hide Internal Call Graph section">Internal call graph ▾</h2> <p> In the call graph viewer below, each node is a function belonging to this package and its children are the functions it calls—perhaps dynamically. </p> <p> The root nodes are the entry points of the package: functions that may be called from outside the package. There may be non-exported or anonymous functions among them if they are called dynamically from another package. </p> <p> Click a node to visit that function's source code. From there you can visit its callers by clicking its declaring <code>func</code> token. </p> <p> Functions may be omitted if they were determined to be unreachable in the particular programs or tests that were analyzed. </p> <!-- Zero means show all package entry points. --> <ul style="margin-left: 0.5in" id="callgraph-0" class="treeview"></ul> </div> </div> <!-- #pkg-callgraph -->
<h2 id="pkg-variables">Variables</h2> <pre>var <span id="All">All</span> = []<a href="../index.html">encoding</a>.<a href="../index.html#Encoding">Encoding</a>{ <a href="index.html#UTF8">UTF8</a>, <a href="index.html#UTF16">UTF16</a>(<a href="index.html#BigEndian">BigEndian</a>, <a href="index.html#UseBOM">UseBOM</a>), <a href="index.html#UTF16">UTF16</a>(<a href="index.html#BigEndian">BigEndian</a>, <a href="index.html#IgnoreBOM">IgnoreBOM</a>), <a href="index.html#UTF16">UTF16</a>(<a href="index.html#LittleEndian">LittleEndian</a>, <a href="index.html#IgnoreBOM">IgnoreBOM</a>), }</pre> <p> All lists a configuration for each IANA-defined UTF-16 variant. </p>
<pre>var <span id="ErrMissingBOM">ErrMissingBOM</span> = <a href="../../../../../errors/index.html">errors</a>.<a href="../../../../../errors/index.html#New">New</a>("encoding: missing byte order mark")</pre> <p> ErrMissingBOM means that decoding UTF-16 input with ExpectBOM did not find a starting byte order mark. </p>
<pre>var <span id="UTF8">UTF8</span> <a href="../index.html">encoding</a>.<a href="../index.html#Encoding">Encoding</a> = utf8enc</pre> <p> UTF8 is the UTF-8 encoding. </p>
<h2 id="BOMOverride">func <a href="http://localhost:6060/src/golang.org/x/text/encoding/unicode/override.go?s=1109:1179#L17">BOMOverride</a></h2> <pre>func BOMOverride(fallback <a href="../../transform/index.html">transform</a>.<a href="../../transform/index.html#Transformer">Transformer</a>) <a href="../../transform/index.html">transform</a>.<a href="../../transform/index.html#Transformer">Transformer</a></pre> <p> BOMOverride returns a new decoder transformer that is identical to fallback, except that the presence of a Byte Order Mark at the start of the input causes it to switch to the corresponding Unicode decoding. It will only consider BOMs for UTF-8, UTF-16BE, and UTF-16LE. </p> <p> This differs from using ExpectBOM in two ways: a BOM can switch the decoding to UTF-8, not just to a UTF-16 variant, and the fallback can be any encoding scheme. </p> <p> This technique is recommended by the W3C for use in HTML 5: "For compatibility with deployed content, the byte order mark (also known as BOM) is considered more authoritative than anything else." <a href="http://www.w3.org/TR/encoding/#specification-hooks">http://www.w3.org/TR/encoding/#specification-hooks</a> </p> <p> BOMOverride is mostly intended for use cases where the first characters of input in the fallback encoding are known not to be a BOM, for example, valid HTML and most encodings. </p>
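<p> The BOM check described above can be illustrated with a small standard-library sketch. This is not the package's implementation: the helper name <code>sniffBOM</code> and its return values are hypothetical, and the sketch only shows which leading bytes select which of the three Unicode decodings. </p>

```go
package main

import (
	"bytes"
	"fmt"
)

// sniffBOM is a hypothetical helper. It reports which Unicode encoding a
// leading byte order mark selects and how many bytes the mark occupies.
// An empty name means no BOM was found, in which case a caller would keep
// using its fallback decoder.
func sniffBOM(b []byte) (name string, size int) {
	switch {
	case bytes.HasPrefix(b, []byte{0xEF, 0xBB, 0xBF}):
		return "UTF-8", 3
	case bytes.HasPrefix(b, []byte{0xFE, 0xFF}):
		return "UTF-16BE", 2
	case bytes.HasPrefix(b, []byte{0xFF, 0xFE}):
		return "UTF-16LE", 2
	}
	return "", 0
}

func main() {
	inputs := [][]byte{
		{0xEF, 0xBB, 0xBF, 'h', 'i'}, // UTF-8 BOM, then "hi"
		{0xFE, 0xFF, 0x00, 'h'},      // big-endian BOM, then "h"
		{0xFF, 0xFE, 'h', 0x00},      // little-endian BOM, then "h"
		[]byte("plain text"),         // no BOM: stay with the fallback
	}
	for _, in := range inputs {
		name, size := sniffBOM(in)
		fmt.Printf("%q %d\n", name, size)
	}
}
```

<p> Input that cannot start with any of these byte sequences, such as valid HTML in most legacy encodings, always takes the final branch, which is why the fallback decoder is safe in that setting. </p>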
<h2 id="UTF16">func <a href="http://localhost:6060/src/golang.org/x/text/encoding/unicode/unicode.go?s=5459:5514#L142">UTF16</a></h2> <pre>func UTF16(e <a href="index.html#Endianness">Endianness</a>, b <a href="index.html#BOMPolicy">BOMPolicy</a>) <a href="../index.html">encoding</a>.<a href="../index.html#Encoding">Encoding</a></pre> <p> UTF16 returns a UTF-16 Encoding for the given default endianness and byte order mark (BOM) policy. </p> <p> When decoding from UTF-16 to UTF-8, if the BOMPolicy is IgnoreBOM, then BOMs U+FEFF and noncharacters U+FFFE in the input stream do not affect the endianness used for decoding; they are instead output as their standard UTF-8 encodings: "\xef\xbb\xbf" and "\xef\xbf\xbe". If the BOMPolicy is UseBOM or ExpectBOM, a starting BOM is not written to the UTF-8 output. Instead, it overrides the default endianness e for the remainder of the transformation. Any subsequent BOMs U+FEFF or noncharacters U+FFFE do not affect the endianness used, and are instead output as their standard UTF-8 encodings. For UseBOM, if there is no starting BOM, decoding proceeds with the default endianness e. For ExpectBOM, a missing starting BOM makes the transformation return early with an ErrMissingBOM error. </p> <p> When encoding from UTF-8 to UTF-16, a BOM will be inserted at the start of the output if the BOMPolicy is UseBOM or ExpectBOM. Otherwise, a BOM will not be inserted. The UTF-8 input does not need to contain a BOM. </p> <p> There is no concept of a 'native' endianness. If the UTF-16 data is produced and consumed in a greater context that implies a certain endianness, use IgnoreBOM. Otherwise, use ExpectBOM and always produce and consume a BOM. </p> <p> In the language of <a href="http://www.unicode.org/faq/utf_bom.html#bom10">http://www.unicode.org/faq/utf_bom.html#bom10</a>, IgnoreBOM corresponds to "Where the precise type of the data stream is known... the BOM should not be used" and ExpectBOM corresponds to "A particular protocol... may require use of the BOM". </p>
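<p> The decoding rules above can be sketched with the standard library alone. The following is an illustration under stated assumptions, not this package's code: the type <code>policy</code>, its constants, <code>errMissingBOM</code>, and <code>decodeUTF16</code> are hypothetical stand-ins for BOMPolicy, its values, ErrMissingBOM, and a decoder, and a trailing odd byte is simply dropped. </p>

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"unicode/utf16"
)

// policy is a hypothetical stand-in for the three documented BOM policies.
type policy int

const (
	ignoreBOM policy = iota
	useBOM
	expectBOM
)

// errMissingBOM is a hypothetical stand-in for ErrMissingBOM.
var errMissingBOM = errors.New("missing byte order mark")

// decodeUTF16 converts UTF-16 bytes to a UTF-8 string. littleEndian selects
// the default byte order, which a starting BOM may override unless the
// policy is ignoreBOM. A trailing odd byte is dropped in this sketch.
func decodeUTF16(b []byte, littleEndian bool, p policy) (string, error) {
	var order binary.ByteOrder = binary.BigEndian
	if littleEndian {
		order = binary.LittleEndian
	}
	if p != ignoreBOM {
		switch {
		case len(b) >= 2 && b[0] == 0xFE && b[1] == 0xFF:
			order, b = binary.BigEndian, b[2:] // consume the BOM
		case len(b) >= 2 && b[0] == 0xFF && b[1] == 0xFE:
			order, b = binary.LittleEndian, b[2:] // consume the BOM
		case p == expectBOM:
			return "", errMissingBOM
		}
	}
	// Later U+FEFF or U+FFFE code units are ordinary data from here on.
	var units []uint16
	for len(b) >= 2 {
		units = append(units, order.Uint16(b))
		b = b[2:]
	}
	return string(utf16.Decode(units)), nil
}

func main() {
	in := []byte{0xFF, 0xFE, 'h', 0x00, 'i', 0x00} // little-endian BOM, then "hi"
	for _, p := range []policy{ignoreBOM, useBOM, expectBOM} {
		s, err := decodeUTF16(in, false, p)
		fmt.Printf("%q %v\n", s, err)
	}
}
```

<p> With a big-endian default, only the two BOM-aware policies recover "hi" from this input; under the ignoring policy the first two bytes are decoded as data in the default byte order. </p>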
<h2 id="BOMPolicy">type <a href="http://localhost:6060/src/golang.org/x/text/encoding/unicode/unicode.go?s=6708:6728#L173">BOMPolicy</a></h2> <pre>type BOMPolicy <a href="../../../../../builtin/index.html#uint8">uint8</a></pre> <p> BOMPolicy is a UTF-16 encoding's byte order mark policy. </p>
<pre>const (
<span class="comment">// IgnoreBOM means to ignore any byte order marks.</span> <span id="IgnoreBOM">IgnoreBOM</span> <a href="index.html#BOMPolicy">BOMPolicy</a> = 0
<span class="comment">// UseBOM means that the UTF-16 form may start with a byte order mark, which</span> <span class="comment">// will be used to override the default encoding.</span> <span id="UseBOM">UseBOM</span> <a href="index.html#BOMPolicy">BOMPolicy</a> = writeBOM | acceptBOM
<span class="comment">// ExpectBOM means that the UTF-16 form must start with a byte order mark,</span> <span class="comment">// which will be used to override the default encoding.</span> <span id="ExpectBOM">ExpectBOM</span> <a href="index.html#BOMPolicy">BOMPolicy</a> = writeBOM | acceptBOM | requireBOM )</pre>
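<p> The constant declarations above combine unexported flags with bitwise OR. A minimal sketch of that pattern follows; the numeric flag values and the <code>has</code> method are assumptions made for illustration, since this page does not show the package's own values. </p>

```go
package main

import "fmt"

// bomPolicy is a hypothetical stand-in for BOMPolicy.
type bomPolicy uint8

// The flag values below are assumed for illustration only.
const (
	writeBOM   bomPolicy = 0x01 // emit a BOM when encoding
	acceptBOM  bomPolicy = 0x02 // let a starting BOM override the default
	requireBOM bomPolicy = 0x04 // fail when decoding input without a BOM
)

const (
	ignoreBOM bomPolicy = 0
	useBOM              = writeBOM | acceptBOM
	expectBOM           = writeBOM | acceptBOM | requireBOM
)

// has reports whether every bit of flag is set in p.
func (p bomPolicy) has(flag bomPolicy) bool { return p&flag == flag }

func main() {
	policies := []bomPolicy{ignoreBOM, useBOM, expectBOM}
	for _, p := range policies {
		fmt.Println(p.has(writeBOM), p.has(acceptBOM), p.has(requireBOM))
	}
}
```

<p> Read this way, ExpectBOM is UseBOM plus one extra requirement, which matches the documented behavior: both write and accept a BOM, but only ExpectBOM rejects input that lacks one. </p>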
<h2 id="Endianness">type <a href="http://localhost:6060/src/golang.org/x/text/encoding/unicode/unicode.go?s=8326:8346#L212">Endianness</a></h2> <pre>type Endianness <a href="../../../../../builtin/index.html#bool">bool</a></pre> <p> Endianness is a UTF-16 encoding's default endianness. </p>
<pre>const ( <span class="comment">// BigEndian is UTF-16BE.</span> <span id="BigEndian">BigEndian</span> <a href="index.html#Endianness">Endianness</a> = <a href="../../../../../builtin/index.html#false">false</a> <span class="comment">// LittleEndian is UTF-16LE.</span> <span id="LittleEndian">LittleEndian</span> <a href="index.html#Endianness">Endianness</a> = <a href="../../../../../builtin/index.html#true">true</a> )</pre>
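<p> To make the two byte orders concrete, here is a standard-library sketch of the encoding direction with an optional leading BOM. The function <code>encodeUTF16</code> and its parameters are hypothetical and are not part of this package. </p>

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// encodeUTF16 is a hypothetical helper. It encodes s as UTF-16 in the
// requested byte order and, if withBOM is set, prefixes the byte order
// mark U+FEFF.
func encodeUTF16(s string, littleEndian, withBOM bool) []byte {
	var order binary.ByteOrder = binary.BigEndian
	if littleEndian {
		order = binary.LittleEndian
	}
	units := utf16.Encode([]rune(s))
	if withBOM {
		units = append([]uint16{0xFEFF}, units...)
	}
	out := make([]byte, 2*len(units))
	for i, u := range units {
		order.PutUint16(out[2*i:], u)
	}
	return out
}

func main() {
	fmt.Printf("% x\n", encodeUTF16("hi", false, true)) // fe ff 00 68 00 69
	fmt.Printf("% x\n", encodeUTF16("hi", true, true))  // ff fe 68 00 69 00
	fmt.Printf("% x\n", encodeUTF16("hi", true, false)) // 68 00 69 00
}
```

<p> The BOM is the same code unit in both cases; only the order of its two bytes differs, which is what lets a decoder infer the endianness from the first two bytes. </p>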
<h2 id="pkg-subdirectories">Subdirectories</h2>
<div class="pkg-dir"> <table> <tr> <th class="pkg-name">Name</th> <th class="pkg-synopsis">Synopsis</th> </tr>
<tr> <td colspan="2"><a href="../index.html">..</a></td> </tr>
<tr> <td class="pkg-name" style="padding-left: 0px;"> <a href="utf32/index.html">utf32</a> </td> <td class="pkg-synopsis"> Package utf32 provides the UTF-32 Unicode encoding. </td> </tr> </table> </div>
<div id="footer"> Build version go1.6.<br> Except as <a href="https://developers.google.com/site-policies#restrictions">noted</a>, the content of this page is licensed under the Creative Commons Attribution 3.0 License, and code is licensed under a <a href="http://localhost:6060/LICENSE">BSD license</a>.<br> <a href="http://localhost:6060/doc/tos.html">Terms of Service</a> | <a href="http://www.google.com/intl/en/policies/privacy/">Privacy Policy</a> </div>
</div><!-- .container --> </div><!-- #page -->
<!-- TODO(adonovan): load these from <head> using "defer" attribute? --> <script type="text/javascript" src="../../../../../../lib/godoc/jquery.js"></script> <script type="text/javascript" src="../../../../../../lib/godoc/jquery.treeview.js"></script> <script type="text/javascript" src="../../../../../../lib/godoc/jquery.treeview.edit.js"></script>
<script type="text/javascript" src="../../../../../../lib/godoc/godocs.js"></script>
</body> </html>