Net::IDN::Encode(3)   User Contributed Perl Documentation   Net::IDN::Encode(3)

Net::IDN::Encode - Internationalizing Domain Names in Applications (IDNA)

    use Net::IDN::Encode ':all';
    my $a = domain_to_ascii("müller.example.org");
    my $e = email_to_ascii("POSTMASTER@例。テスト");
    my $u = domain_to_unicode('EXAMPLE.XN--11B5BS3A9AJ6G');

This module provides an easy-to-use interface for encoding and decoding
Internationalized Domain Names (IDNs).

IDNs use characters drawn from a large repertoire (Unicode), but IDNA
allows the non-ASCII characters to be represented using only the ASCII
characters already allowed in so-called host names today
(letter-digit-hyphen, "/[A-Z0-9-]/i").

Use this module if you just want to convert domain names (or email
addresses), using whatever IDNA standard is the best choice at the
moment.

You should be familiar with Unicode support in perl, as this module
expects correctly encoded input. See perlunitut, perluniintro and
perlunicode for details.

To convert labels correctly between Unicode and ASCII, each character
in the label must be present in the Unicode version supported by your
perl. Consequently, this module will refuse to convert labels with new
Unicode characters on older perl versions (see "AllowUnassigned"
below).
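
To find out which Unicode version a given perl supports, the core
Unicode::UCD module can be queried. This is a minimal sketch that is
independent of Net::IDN::Encode; the variable name is only illustrative.

```perl
use strict;
use warnings;
use Unicode::UCD ();

# Version of the Unicode database bundled with this perl (three
# dot-separated numbers).
my $unicode_version = Unicode::UCD::UnicodeVersion();

print "perl $^V supports Unicode $unicode_version\n";
```

Labels containing characters newer than the reported version are the
ones this module is expected to reject by default.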

By default, this module does not export any subroutines. You may use
the ":all" tag to import everything. You can also use regular
expressions such as "/^to_/" or "/^email_/" to select some of the
functions; see Exporter for details.
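
The import styles described above might look as follows. This is a
sketch that assumes the six functions documented here are the only
exportable names matching the patterns.

```perl
use strict;
use warnings;

# Import only the label-level functions (to_ascii, to_unicode) by
# pattern; pattern matching on import lists is an Exporter feature.
use Net::IDN::Encode qw(/^to_/);

# Alternatives (commented out so that only one style is active):
# use Net::IDN::Encode ':all';                      # everything
# use Net::IDN::Encode qw(domain_to_ascii);         # one function by name

# A plain ASCII label is passed through.
print to_ascii('example'), "\n";
```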

The following functions are available:

to_ascii( $label, %param )
    Converts a single label $label to ASCII. Will throw an exception on
    invalid input. However, if $label is already a valid ASCII domain
    label (including most NON-LDH labels such as those used for SRV
    records and fake A-labels), this function will never fail; it
    returns $label as-is if conversion would fail.

    This function takes the following optional parameters (%param):

    AllowUnassigned
        (boolean) If set to a true value, code points that are
        unassigned in the Unicode version supported by your perl are
        allowed. This is an extension over UTS #46.

        While this increases the number of labels that can be converted
        successfully (especially on older perls) and may thus maximize
        compatibility with domain names created under future versions
        of Unicode, it also introduces the risk of incorrect
        conversions. Characters added in later versions of Unicode
        might have properties that affect the conversion; if these
        properties are not known to your version of perl, you might
        end up with an incorrect conversion.

        The default is false.

    UseSTD3ASCIIRules
        (boolean) If set to a true value, checks the label for
        compliance with STD 3 (RFC 1123) syntax for host name parts.
        The exact checks done depend on the IDNA standard used.
        Usually, you will want to set this to true.

        Please note that UseSTD3ASCIIRules only affects the conversion
        between ASCII labels (A-labels) and Unicode labels (U-labels).
        Labels that are in ASCII may still be passed through as-is.

        For historical reasons, the default is false (unlike
        "domain_to_ascii").

    TransitionalProcessing
        (boolean) If set to true, the conversion will be compatible
        with IDNA2003. This only affects four characters: 'ß' (U+00DF),
        'ς' (U+03C2), ZWJ (U+200D) and ZWNJ (U+200C). Usually, you will
        want to set this to false.

        The default is false.

    This function does not handle strings that consist of multiple
    labels (such as domain names). Use "domain_to_ascii" instead.
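
The parameters described above could be combined as in the following
sketch. The labels 'faß' and 'müller' are illustrative inputs, and the
eval wrapper is just one way to catch the exception mentioned in the
text.

```perl
use strict;
use warnings;
use utf8;                                # source contains non-ASCII literals
use Net::IDN::Encode qw(to_ascii);

binmode STDOUT, ':encoding(UTF-8)';

# Default (non-transitional) processing keeps 'ß' as a distinct character.
my $strict = to_ascii('faß', UseSTD3ASCIIRules => 1);

# IDNA2003-compatible processing maps 'ß' to 'ss' before encoding.
my $compat = to_ascii('faß',
    UseSTD3ASCIIRules      => 1,
    TransitionalProcessing => 1,
);

# to_ascii throws on invalid input, so guard calls on untrusted data.
my $input  = 'müller';
my $alabel = eval { to_ascii($input, UseSTD3ASCIIRules => 1) };
if (defined $alabel) {
    print "$input => $alabel\n";
} else {
    warn "cannot convert label: $@";
}
print "non-transitional: $strict, transitional: $compat\n";
```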

to_unicode( $label, %param )
    Converts a single label $label to Unicode. Will throw an exception
    on invalid input. However, if $label is an ASCII label (including
    most NON-LDH labels such as those used for SRV records), this
    function will not fail; as a last resort, it returns $label as-is
    (i.e. pass-through) if conversion would fail.

    This function takes the following optional parameters (%param),
    with the same defaults as "to_ascii":

    AllowUnassigned
    UseSTD3ASCIIRules
        See "to_ascii" above. Please note that there is no need for
        "TransitionalProcessing" for "to_unicode".

    This function does not handle strings that consist of multiple
    labels (such as domain names). Use "domain_to_unicode" instead.
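
A short sketch of decoding a single label follows. The label '_sip' is
a hypothetical SRV-style input chosen to illustrate the pass-through
behaviour described above.

```perl
use strict;
use warnings;
use Net::IDN::Encode qw(to_unicode);

binmode STDOUT, ':encoding(UTF-8)';

# Decode an A-label into its Unicode form (a U-label).
my $ulabel = to_unicode('xn--mller-kva');

# An ASCII NON-LDH label; per the text it should be returned as-is
# with the default parameters.
my $srv = to_unicode('_sip');

print "$ulabel\n$srv\n";
```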

domain_to_ascii( $domain, %param )
    Converts all labels of the hostname $domain (with labels separated
    by dots) to ASCII (using "to_ascii"). Will throw an exception on
    invalid input.

    This function takes the following optional parameters (%param):

    AllowUnassigned
    TransitionalProcessing
        See "to_ascii" above.

    UseSTD3ASCIIRules
        (boolean) If set to a true value, checks the label for
        compliance with STD 3 (RFC 1123) syntax for host name parts.

        The default is true (unlike "to_ascii").

    This function will convert all dots to ASCII, i.e. to U+002E (full
    stop). The following characters are recognized as dots: U+002E
    (full stop), U+3002 (ideographic full stop), U+FF0E (fullwidth full
    stop), U+FF61 (halfwidth ideographic full stop).
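
The dot handling described above can be illustrated with a domain that
uses U+3002 as its label separator. This is a sketch; the domain name
is an illustrative input.

```perl
use strict;
use warnings;
use utf8;                                # source contains non-ASCII literals
use Net::IDN::Encode qw(domain_to_ascii);

# Labels are separated by U+3002 (ideographic full stop) here.
my $ascii = domain_to_ascii("müller\x{3002}example\x{3002}org");

# The result is pure ASCII: each label is converted and every
# recognized dot becomes U+002E.
print "$ascii\n";
```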

domain_to_unicode( $domain, %param )
    Converts all labels of the hostname $domain (with labels separated
    by dots) to Unicode. Will throw an exception on invalid input.

    This function takes the following optional parameters (%param),
    with the same defaults as "domain_to_ascii":

    AllowUnassigned
    UseSTD3ASCIIRules
        See "domain_to_ascii" above. Please note that there is no
        "TransitionalProcessing" for "domain_to_unicode".

    This function will preserve the original version of dots. The
    following characters are recognized as dots: U+002E (full stop),
    U+3002 (ideographic full stop), U+FF0E (fullwidth full stop),
    U+FF61 (halfwidth ideographic full stop).
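
To make the idea of recognizing and preserving dots concrete, the
following core-Perl sketch splits a domain on the four listed code
points while keeping the separators. It is not the module's actual
implementation; the helper name split_labels is hypothetical.

```perl
use strict;
use warnings;

# The four code points the text lists as label separators.
my $DOT = qr/[\x{002E}\x{3002}\x{FF0E}\x{FF61}]/;

# A capturing group makes split() return the separators as well, so
# the original dots can be kept when the pieces are joined again.
sub split_labels {
    my ($domain) = @_;
    return split /($DOT)/, $domain, -1;
}

my $domain = "www\x{FF0E}example\x{3002}org";
my @parts  = split_labels($domain);

# @parts alternates labels and separators; joining restores the input.
print scalar(@parts), " parts\n";
```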

email_to_ascii( $email, %param )
    Converts the domain part (right hand side, separated by an at sign)
    of an RFC 2821/2822 email address to ASCII, using
    "domain_to_ascii". May throw an exception on invalid input.

    It takes the same parameters as "domain_to_ascii".

    This function currently does not handle internationalization of the
    local-part (left hand side). Future versions of this module might
    implement an ASCII conversion for the local-part, should one be
    standardized.

    This function will convert the at sign to ASCII, i.e. to U+0040
    (commercial at), as well as label separators. The following
    characters are recognized as at signs: U+0040 (commercial at),
    U+FE6B (small commercial at) and U+FF20 (fullwidth commercial at).
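
As a sketch of the at-sign handling described above, the address below
uses U+FF20 instead of '@'. The address itself is an illustrative
input.

```perl
use strict;
use warnings;
use utf8;                                # source contains non-ASCII literals
use Net::IDN::Encode qw(email_to_ascii);

# U+FF20 (fullwidth commercial at) separates local-part and domain.
my $email = email_to_ascii("postmaster\x{FF20}müller.example.org");

# Per the text, the at sign becomes U+0040 and the domain part is
# converted with domain_to_ascii.
print "$email\n";
```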

email_to_unicode( $email, %param )
    Converts the domain part (right hand side, separated by an at sign)
    of an RFC 2821/2822 email address to Unicode, using
    "domain_to_unicode". May throw an exception on invalid input.

    It takes the same parameters as "domain_to_unicode".

    This function currently does not handle internationalization of the
    local-part (left hand side). Future versions of this module might
    implement a conversion from ASCII for the local-part, should one be
    standardized.

    This function will preserve the original version of at signs (and
    label separators). The following characters are recognized as at
    signs: U+0040 (commercial at), U+FE6B (small commercial at) and
    U+FF20 (fullwidth commercial at).

Claus Färber <CFAERBER@cpan.org>

Copyright 2007-2014 Claus Färber.

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

Net::IDN::Punycode, Net::IDN::UTS46, Net::IDN::IDNA2003,
Net::IDN::IDNA2008, UTS #46 (<http://www.unicode.org/reports/tr46/>),
RFC 5890 (<http://tools.ietf.org/html/rfc5890>).

perl v5.36.0                      2023-01-20               Net::IDN::Encode(3)