Encode::IMAPUTF7(3) User Contributed Perl Documentation Encode::IMAPUTF7(3)

Encode::IMAPUTF7 - modification of UTF-7 encoding for IMAP
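
    use utf8;               # the source below contains a non-ASCII literal
    use Encode qw/encode decode/;
    use Encode::IMAPUTF7;   # registers the 'IMAP-UTF-7' encoding

    print encode('IMAP-UTF-7', 'Répertoire');
    print decode('IMAP-UTF-7', 'R&AOk-pertoire');
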
IMAP mailbox names are encoded in a modified UTF-7 when they contain
international characters outside the printable ASCII range. The
modified UTF-7 encoding is defined in RFC 2060 (section 5.1.3).

There is another CPAN module with the same purpose, Unicode::IMAPUtf7.
However, it works correctly only with strings whose encoded form does
not contain a plus sign. For example, the Cyrillic string
\x{043f}\x{0440}\x{0435}\x{0434}\x{043b}\x{043e}\x{0433} is represented
in UTF-7 as +BD8EQAQ1BDQEOwQ+BDM-. Note the second plus sign, four
characters before the end. Unicode::IMAPUtf7 encodes the above string
as +BD8EQAQ1BDQEOwQ&BDM-, which is not valid modified UTF-7 (the
ampersand and the plus are swapped). The current module solves this
problem; it is a slightly modified Encode::Unicode::UTF7 and has
nothing in common with Unicode::IMAPUtf7.
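
The role of the second plus sign can be checked by hand. The following
is a minimal sketch using only core modules (Encode and MIME::Base64);
it does not reproduce the internals of either CPAN module, and the
variable names are illustrative. It builds the BASE64 payload of the
Cyrillic string and wraps it once with the UTF-7 shift character and
once with the modified UTF-7 shift character.

```perl
use strict;
use warnings;
use Encode qw/encode/;
use MIME::Base64 qw/encode_base64/;

# The Cyrillic string from the paragraph above.
my $name = "\x{043f}\x{0440}\x{0435}\x{0434}\x{043b}\x{043e}\x{0433}";

# BASE64 of the UTF-16BE bytes, without line breaks or "=" padding.
my $payload = encode_base64(encode('UTF-16BE', $name), '');
$payload =~ s/=+\z//;

# Plain UTF-7 shifts in with "+" and out with "-".
my $utf7 = "+$payload-";

# Modified UTF-7 shifts in with "&" and replaces "/" with "," in the
# payload. A "+" inside the payload is data and must stay untouched.
(my $modified = $payload) =~ tr{/}{,};
my $imap_utf7 = "&$modified-";

print "$utf7\n";       # +BD8EQAQ1BDQEOwQ+BDM-
print "$imap_utf7\n";  # &BD8EQAQ1BDQEOwQ+BDM-
```

Only the leading shift character differs between the two results; the
plus sign inside the payload is the same in both.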

By convention, international mailbox names are specified using a
modified version of the UTF-7 encoding described in [UTF-7]. The
purpose of these modifications is to correct the following problems
with UTF-7:

1) UTF-7 uses the "+" character for shifting; this conflicts with
   the common use of "+" in mailbox names, in particular USENET
   newsgroup names.

2) UTF-7's encoding is BASE64 which uses the "/" character; this
   conflicts with the use of "/" as a popular hierarchy delimiter.

3) UTF-7 prohibits the unencoded usage of "\"; this conflicts with
   the use of "\" as a popular hierarchy delimiter.

4) UTF-7 prohibits the unencoded usage of "~"; this conflicts with
   the use of "~" in some servers as a home directory indicator.

5) UTF-7 permits multiple alternate forms to represent the same
   string; in particular, printable US-ASCII characters can be
   represented in encoded form.

In modified UTF-7, printable US-ASCII characters except for "&"
represent themselves; that is, characters with octet values 0x20-0x25
and 0x27-0x7e. The character "&" (0x26) is represented by the
two-octet sequence "&-".

All other characters (octet values 0x00-0x1f, 0x7f-0xff, and all
Unicode 16-bit octets) are represented in modified BASE64, with a
further modification from [UTF-7] that "," is used instead of "/".
Modified BASE64 MUST NOT be used to represent any printing US-ASCII
character which can represent itself.

"&" is used to shift to modified BASE64 and "-" to shift back to
US-ASCII. All names start in US-ASCII, and MUST end in US-ASCII (that
is, a name that ends with a Unicode 16-bit octet MUST end with a "-").
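
The encoding rules above can be sketched in a few lines of Perl. This
is an illustration under stated assumptions, not the code of
Encode::IMAPUTF7: the helper name imap_utf7_encode_sketch is
hypothetical, and only the core modules Encode and MIME::Base64 are
used.

```perl
use strict;
use warnings;
use Encode qw/encode/;
use MIME::Base64 qw/encode_base64/;

# Hypothetical helper, not part of Encode::IMAPUTF7.
sub imap_utf7_encode_sketch {
    my ($name) = @_;
    my $out = '';
    pos($name) = 0;
    while (pos($name) < length $name) {
        if ($name =~ /\G([\x20-\x25\x27-\x7e]+)/gc) {
            # Printable US-ASCII except "&" represents itself.
            $out .= $1;
        }
        elsif ($name =~ /\G&/gc) {
            # "&" is written as the two-octet sequence "&-".
            $out .= '&-';
        }
        elsif ($name =~ /\G([^\x20-\x7e]+)/gc) {
            # Everything else: UTF-16BE, BASE64 without padding,
            # "," instead of "/", wrapped in "&" ... "-".
            my $run = $1;
            my $b64 = encode_base64(encode('UTF-16BE', $run), '');
            $b64 =~ s/=+\z//;
            $b64 =~ tr{/}{,};
            $out .= "&$b64-";
        }
    }
    return $out;
}

print imap_utf7_encode_sketch("R\x{e9}pertoire"), "\n";  # R&AOk-pertoire
print imap_utf7_encode_sketch("Q&A"), "\n";              # Q&-A
```

Because each run of non-ASCII characters is closed with "-", a name
that ends in such a run also ends with "-", as the last rule requires.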

For example, here is a mailbox name which mixes English, Japanese, and
Chinese text: ~peter/mail/&ZeVnLIqe-/&U,BTFw-
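
The example name can be decoded by reversing the same rules. The
following is a minimal sketch, again with a hypothetical helper name
(imap_utf7_decode_sketch) and only core modules; it is not the decoder
shipped with Encode::IMAPUTF7 and it does little validation.

```perl
use strict;
use warnings;
use Encode qw/decode/;
use MIME::Base64 qw/decode_base64/;

# Hypothetical helper, not part of Encode::IMAPUTF7.
sub imap_utf7_decode_sketch {
    my ($encoded) = @_;
    my $out = '';
    pos($encoded) = 0;
    while (pos($encoded) < length $encoded) {
        if ($encoded =~ /\G([^&]+)/gc) {
            # US-ASCII text represents itself.
            $out .= $1;
        }
        elsif ($encoded =~ /\G&-/gc) {
            # "&-" stands for a literal "&".
            $out .= '&';
        }
        elsif ($encoded =~ /\G&([A-Za-z0-9+,]+)-/gc) {
            # Undo the "," substitution, restore the "=" padding,
            # then decode BASE64 and UTF-16BE.
            (my $b64 = $1) =~ tr{,}{/};
            $b64 .= '=' x ((4 - length($b64) % 4) % 4);
            my $bytes = decode_base64($b64);
            $out .= decode('UTF-16BE', $bytes);
        }
        else {
            die 'malformed modified UTF-7 at offset ' . pos($encoded) . "\n";
        }
    }
    return $out;
}

my $name = imap_utf7_decode_sketch('~peter/mail/&ZeVnLIqe-/&U,BTFw-');

# List the code points of the non-ASCII characters in the result.
print join(' ', map  { sprintf 'U+%04X', ord($_) }
                grep { ord($_) > 0x7e } split //, $name), "\n";
# U+65E5 U+672C U+8A9E U+53F0 U+5317
```

The first encoded segment holds three characters (the Japanese part)
and the second holds two (the Chinese part); the "," in the second
segment is the modified BASE64 replacement for "/".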

Please report any requests, suggestions or bugs via the RT bug-tracking
system at http://rt.cpan.org/ or email to
bug-Encode-IMAPUTF7@rt.cpan.org.

http://rt.cpan.org/NoAuth/Bugs.html?Dist=Encode-IMAPUTF7 is the RT
queue for Encode::IMAPUTF7. Please check to see if your bug has
already been reported.

Copyright 2005 Sava Chankov

Sava Chankov, sava@cpan.org

This software may be freely copied and distributed under the same terms
and conditions as Perl.

Peter Makholm <peter@makholm.net>, current maintainer

Sava Chankov <sava@cpan.org>, original author

perl(1), Encode.

perl v5.30.1 2020-01-29 Encode::IMAPUTF7(3)