URI::Escape::XS(3)    User Contributed Perl Documentation   URI::Escape::XS(3)

NAME

       URI::Escape::XS - Drop-In replacement for URI::Escape

VERSION

       $Id: XS.pm,v 0.14 2016/06/09 11:09:14 dankogai Exp $

SYNOPSIS

         # use it instead of URI::Escape
         use URI::Escape::XS qw/uri_escape uri_unescape/;
         $safe = uri_escape("10% is enough\n");
         $verysafe = uri_escape("foo", "\0-\377");
         $str  = uri_unescape($safe);

         # or use encodeURIComponent and decodeURIComponent
         use URI::Escape::XS;
         $safe = encodeURIComponent("10% is enough\n");
         $str  = decodeURIComponent("10%25%20is%20enough%0A");

         # if you have Net::LibIDN or Net::IDN::Encode installed
         $safe = encodeURIComponentIDN("http://ドメイン名例.jp/dan/");
         $str  = decodeURIComponentIDN("http:%2F%2Fxn--eckwd4c7cu47r2wf.jp%2Fdan%2F");

EXPORT

   by default
       "encodeURIComponent" and "decodeURIComponent"

       "encodeURIComponentIDN" and "decodeURIComponentIDN" if either
       Net::LibIDN or Net::IDN::Encode is available

   on demand
       "uri_escape" and "uri_unescape"

FUNCTIONS

   encodeURIComponent
       Does what JavaScript's encodeURIComponent does.

         $uri = encodeURIComponent("http://www.example.com/");
         # http%3A%2F%2Fwww.example.com%2F

       Note that you cannot customize the characters to escape.  If you need
       to do so, use "uri_escape".

   decodeURIComponent
       Does what JavaScript's decodeURIComponent does.

         $str = decodeURIComponent("http%3A%2F%2Fwww.example.com%2F");
         # http://www.example.com/

       It decodes not only %HH sequences but also %uHHHH sequences, with
       surrogate pairs correctly decoded.

         $str = decodeURIComponent("%uD869%uDEB2%u5F3E%u0061");
         # \x{2A6B2}\x{5F3E}a

       This function UNCONDITIONALLY returns the decoded string with the utf8
       flag off.  To get a utf8-decoded string, use Encode:

         decode_utf8(decodeURIComponent($uri));

       This is the correct behavior because you cannot tell whether the
       decoded bytes are actually UTF-8 or some other encoding, such as
       ISO-8859-1 or Shift_JIS.

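       The decode_utf8 step described for decodeURIComponent can be sketched
       as follows.  This is only an illustration: it assumes the escaped
       bytes really are UTF-8, which the caller has to know from context, and
       the variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);
use URI::Escape::XS qw(decodeURIComponent);

# Percent-encoded UTF-8 for three CJK characters (nine bytes in total).
my $escaped = "%E5%B0%8F%E9%A3%BC%E5%BC%BE";

# decodeURIComponent returns raw bytes with the utf8 flag off.
my $bytes = decodeURIComponent($escaped);

# Assumption: the bytes are UTF-8.  decode_utf8 turns them into characters.
my $chars = decode_utf8($bytes);

printf "%d bytes, %d characters\n", length($bytes), length($chars);
```

       If the bytes were in another encoding, such as Shift_JIS, the last
       step would instead use Encode::decode with that encoding's name.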
   encodeURIComponentIDN
       Same as "encodeURIComponent" except that the host part is encoded in
       Punycode.  Either Net::LibIDN or Net::IDN::Encode is required to use
       this function.

       URIs with Internationalized Domain Names require two encodings:
       Punycode for the host part and URI escaping for the rest.

       Currently only FULL URIs with "http:" or "https:" are supported.

   decodeURIComponentIDN
       Same as "decodeURIComponent" except that the host part is decoded from
       Punycode.  Either Net::LibIDN or Net::IDN::Encode is required to use
       this function.

   uri_escape
       Does exactly the same as URI::Escape::uri_escape() except when a
       utf8-flagged string is fed.

       URI::Escape::uri_escape() croaks and urges you to use
       "uri_escape_utf8()", but that is pointless because a URI itself has no
       such thing as a utf8 flag.  The function in this module ALWAYS TREATS
       the string as a byte sequence.  That way you can safely use this
       function without worrying about utf8 flags.

       Note that this function is NOT EXPORTED by default.  That way you can
       use URI::Escape and URI::Escape::XS simultaneously.

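       Because uri_escape is not exported by default, both modules can be
       loaded side by side.  The sketch below is one way to do that, assuming
       URI::Escape is also installed; it calls this module's version by its
       fully qualified name, and the variable names are illustrative only.

```perl
use strict;
use warnings;
use URI::Escape;        # exports uri_escape and uri_unescape by default
use URI::Escape::XS;    # exports encodeURIComponent and decodeURIComponent

my $input = "10% is enough\n";

# Unqualified name: URI::Escape's pure-Perl implementation.
my $pp = uri_escape($input);

# Fully qualified name: this module's XS implementation.
my $xs = URI::Escape::XS::uri_escape($input);

print "URI::Escape:     $pp\n";
print "URI::Escape::XS: $xs\n";
```

       For a plain byte string like this one, both calls should yield the
       same escaped text.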
   uri_unescape
       Does exactly the same as URI::Escape::uri_unescape() except when %uHHHH
       is fed.

       URI::Escape::uri_unescape() simply ignores %uHHHH sequences, while the
       function in this module decodes them into the corresponding UTF-8
       byte sequences.

       Like uri_escape, this function is NOT EXPORTED by default.

   Note on the %uHHHH sequence
       With this module the resulting strings never have the utf8 flag on.  So
       if you want to decode them to Perl character strings, you have to
       decode explicitly via Encode.  Remember: URIs have always been byte
       sequences, not UTF-8 characters.

       If the %uHHHH sequence had become standard, you could have safely told
       whether a given URI is in Unicode.  But more fortunately than
       unfortunately, the RFC proposal was rejected, so you cannot tell which
       encoding is used just by looking at the URI.

       <http://en.wikipedia.org/wiki/Percent-encoding#Non-standard_implementations>

       I said fortunately because %uHHHH can be nasty for non-BMP characters.
       Since each %uHHHH can hold only one 16-bit value, a character at
       U+10000 or above needs a surrogate pair to represent it.

       In spite of that, there are a significant number of URIs with %uHHHH
       escapes.  Therefore this module supports them for decoding only.
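       The surrogate-pair arithmetic mentioned above can be sketched in plain
       Perl.  This illustrates the standard UTF-16 rule only and is not a
       copy of this module's XS code; combine_surrogates is a hypothetical
       helper name.

```perl
use strict;
use warnings;

# Combine a UTF-16 surrogate pair into a single code point.
# combine_surrogates is a hypothetical helper, not part of URI::Escape::XS.
sub combine_surrogates {
    my ($hi, $lo) = @_;
    die "not a high surrogate" unless $hi >= 0xD800 && $hi <= 0xDBFF;
    die "not a low surrogate"  unless $lo >= 0xDC00 && $lo <= 0xDFFF;
    # Each half contributes 10 bits; the result is offset by 0x10000.
    return 0x10000 + (($hi - 0xD800) << 10) + ($lo - 0xDC00);
}

# %uD869%uDEB2 carries the two halves of one non-BMP character.
my $code_point = combine_surrogates(0xD869, 0xDEB2);
printf "U+%X\n", $code_point;    # prints U+2A6B2
```

       This matches the decodeURIComponent example, where "%uD869%uDEB2"
       decodes to \x{2A6B2}.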

SPEED

       Since this module uses XS, it is really fast except for
       uri_escape("noop").

       The regexp used in URI::Escape is really fast when nothing matches, but
       it slows down significantly when it has to replace characters.

   BENCHMARK
       On a MacBook Pro 2GHz, Perl 5.8.8.

        http://www.google.co.jp/search?q=%E5%B0%8F%E9%A3%BC%E5%BC%BE
        ============================================================
        Unescape it
        -----------
                     Rate     U::E U::E::XS
        U::E      58526/s       --     -88%
        U::E::XS 486968/s     732%       --

        Escape it back
        --------------
                     Rate     U::E U::E::XS
        U::E      30046/s       --     -78%
        U::E::XS 136992/s     356%       --

        www.example.com
        ===============
        Unescape it
        -----------
                     Rate     U::E U::E::XS
        U::E     821972/s       --      -4%
        U::E::XS 854732/s       4%       --

        Escape it back
        --------------
                     Rate U::E::XS     U::E
        U::E::XS 522969/s       --      -7%
        U::E     565112/s       8%       --
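       A comparison of this kind could be reproduced with the core Benchmark
       module.  The sketch below is not the author's benchmark script; it
       assumes URI::Escape is installed, reuses the first URI from the table
       above, and measures only the unescape direction.  Rates will differ by
       machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use URI::Escape ();        # empty import lists: call both modules
use URI::Escape::XS ();    # by fully qualified name

my $escaped = "http://www.google.co.jp/search?q=%E5%B0%8F%E9%A3%BC%E5%BC%BE";

# Run each candidate for at least one CPU second, then print a table
# with a Rate column and pairwise percentage differences.
cmpthese(-1, {
    'U::E'     => sub { URI::Escape::uri_unescape($escaped) },
    'U::E::XS' => sub { URI::Escape::XS::uri_unescape($escaped) },
});
```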

AUTHOR

       Dan Kogai, "<dankogai+cpan at gmail.com>"

BUGS

       Please report any bugs or feature requests to "bug-uri-escape-xs at
       rt.cpan.org", or through the web interface at
       <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=URI-Escape-XS>.  I will
       be notified, and then you'll automatically be notified of progress on
       your bug as I make changes.

SUPPORT

       You can find documentation for this module with the perldoc command.

           perldoc URI::Escape::XS

       You can also look for information at:

       •   AnnoCPAN: Annotated CPAN documentation

           <http://annocpan.org/dist/URI-Escape-XS>

       •   CPAN Ratings

           <http://cpanratings.perl.org/d/URI-Escape-XS>

       •   RT: CPAN's request tracker

           <http://rt.cpan.org/NoAuth/Bugs.html?Dist=URI-Escape-XS>

       •   Search CPAN

           <http://search.cpan.org/dist/URI-Escape-XS>

ACKNOWLEDGEMENTS

       Gisle Aas for URI::Escape

       Koichi Taniguchi for URI::Escape::JavaScript

       Thomas Jacob for Net::LibIDN

       Claus Färber for Net::IDN::Encode

COPYRIGHT & LICENSE

       Copyright 2007-2014 Dan Kogai, all rights reserved.

       This program is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.34.0                      2022-01-21                URI::Escape::XS(3)