PASSPHRASE-ENCODING(7)              OpenSSL             PASSPHRASE-ENCODING(7)

NAME
passphrase-encoding - How diverse parts of OpenSSL treat pass phrase
character encoding

DESCRIPTION
In a modern world with all sorts of character encodings, the treatment
of pass phrases has become increasingly complex. This manual page
attempts to give an overview of how this problem is currently
addressed in different parts of the OpenSSL library.

The general case
As a general rule, the OpenSSL library doesn't treat pass phrases in
any special way, and trusts the application or user to choose a
suitable character set and stick to it throughout the lifetime of the
affected objects. This means that an object that was encrypted using a
pass phrase encoded in ISO-8859-1 needs to be decrypted using a pass
phrase encoded in ISO-8859-1. Using the wrong encoding is expected to
cause a decryption failure.
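As a small illustration of why the encoding matters, the following
Python sketch encodes the same pass phrase in ISO-8859-1 and in UTF-8
and derives a key from each. PBKDF2 from hashlib is only a stand-in
for whatever key derivation a given object format uses; the salt, the
iteration count and the helper name derive_key are assumptions made
for this example and are not taken from OpenSSL.

```python
import hashlib


def derive_key(passphrase_bytes: bytes) -> bytes:
    # Stand-in key derivation with hypothetical parameters; it only
    # demonstrates that the input is a byte string, not a character string.
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase_bytes, b"example-salt", 1000, dklen=32
    )


passphrase = "na\u00efve"  # "naive" written with U+00EF

latin1_bytes = passphrase.encode("iso-8859-1")  # 5 bytes
utf8_bytes = passphrase.encode("utf-8")         # 6 bytes

# Same characters, different bytes, hence different derived keys.
keys_differ = derive_key(latin1_bytes) != derive_key(utf8_bytes)
```

Because the derived keys differ, an object encrypted with one byte
sequence cannot be decrypted with the other, even though both were
typed as the same pass phrase.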

PKCS#12
PKCS#12 is a bit different regarding pass phrase encoding. The
standard stipulates that the pass phrase shall be encoded as an ASN.1
BMPString, which consists of the code points of the basic multilingual
plane, encoded in big endian (UCS-2 BE).

OpenSSL tries to adapt to this requirement in one of the following
manners:

1. Treats the received pass phrase as UTF-8 encoded and tries to
   re-encode it to UTF-16 (which is the same as UCS-2 for characters
   U+0000 to U+D7FF and U+E000 to U+FFFF, but becomes an expansion for
   any other character), or failing that, proceeds with step 2.

2. Assumes that the pass phrase is encoded in ASCII or ISO-8859-1 and
   opportunistically prepends each byte with a zero byte to obtain the
   UCS-2 encoding of the characters, which it stores as a BMPString.

   Note that since there is no check of your locale, this may produce
   UCS-2 / UTF-16 characters that do not correspond to the original
   pass phrase characters for other character sets, such as any
   ISO-8859-X encoding other than ISO-8859-1 (or for Windows, CP 1252
   with exception for the extra "graphical" characters in the
   0x80-0x9F range).

OpenSSL versions older than 1.1.0 do variant 2 only, which is why
OpenSSL still does this: to be able to read files produced with older
versions.
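The two steps above can be sketched roughly as follows in Python. This
is an illustration of the described behaviour, not OpenSSL's actual
code: the function name to_bmpstring is hypothetical, Python's UTF-8
decoder may accept or reject some edge cases differently, and the
two-byte zero terminator that follows the characters is left out.

```python
def to_bmpstring(passphrase: bytes) -> bytes:
    """Approximate the PKCS#12 pass phrase conversion described above."""
    try:
        # Step 1: treat the input as UTF-8 and re-encode it as UTF-16 BE.
        return passphrase.decode("utf-8").encode("utf-16-be")
    except UnicodeDecodeError:
        # Step 2: assume ASCII / ISO-8859-1 and prepend a zero byte to
        # every input byte.
        return b"".join(b"\x00" + bytes([byte]) for byte in passphrase)


valid_utf8 = to_bmpstring(b"\xc3\xaf")  # decodes as U+00EF
fallback = to_bmpstring(b"\xef")        # not valid UTF-8, step 2 applies
```

Both calls yield the same two bytes, 0x00 0xEF, which shows how a
UTF-8 input and an ISO-8859-1 input for the same character converge
on one BMPString when the guess is right.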

It should be noted that this approach isn't entirely fault free.

A pass phrase encoded in ISO-8859-2 could very well contain a sequence
such as 0xC3 0xAF (the two characters "LATIN CAPITAL LETTER A WITH
BREVE" and "LATIN CAPITAL LETTER Z WITH DOT ABOVE" in ISO-8859-2
encoding), which would be misinterpreted as the perfectly valid UTF-8
encoding of code point U+00EF (LATIN SMALL LETTER I WITH DIAERESIS) if
the pass phrase doesn't contain anything that would be invalid UTF-8.
A pass phrase that contains this kind of byte sequence will give a
different outcome in OpenSSL 1.1.0 and newer than in OpenSSL older
than 1.1.0:

    0x00 0xC3 0x00 0xAF                    # OpenSSL older than 1.1.0
    0x00 0xEF                              # OpenSSL 1.1.0 and newer

By the same token, anything encoded in UTF-8 that was given to OpenSSL
older than 1.1.0 was misinterpreted as a sequence of ISO-8859-1
characters.
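The ambiguity can be reproduced with Python's standard codecs. The
sketch below only replays the byte-level arithmetic from the example
above; the variable names are chosen for illustration.

```python
import unicodedata

raw = bytes([0xC3, 0xAF])

# What the user meant: two ISO-8859-2 characters.
intended = raw.decode("iso-8859-2")
intended_names = [unicodedata.name(ch) for ch in intended]

# What a UTF-8 decoder sees: a single, perfectly valid code point.
as_utf8 = raw.decode("utf-8")
utf8_name = unicodedata.name(as_utf8)

# BMPString contents (without terminator) under the two behaviours.
older_than_110 = raw.decode("iso-8859-1").encode("utf-16-be")
from_110_on = as_utf8.encode("utf-16-be")
```

Neither result encodes the two characters the user actually typed,
which in UCS-2 BE would be 0x01 0x02 0x01 0x7B.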

OSSL_STORE
ossl_store(7) acts as a general interface to access all kinds of
objects, potentially protected with a pass phrase, a PIN or something
else. This API stipulates that pass phrases should be UTF-8 encoded,
and that any other pass phrase encoding may give undefined results.
This API relies on the application to ensure UTF-8 encoding, and
doesn't check that this is the case, so what it gets, it will also pass
to the underlying loader.
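Since the check is left to the application, one option is to validate
the bytes before handing them over. The Python sketch below shows such
an application-side check; require_utf8 is a hypothetical helper, not
part of any OpenSSL API. Note that passing the check does not prove
that the pass phrase was meant as UTF-8, only that it can be read as
such.

```python
def require_utf8(passphrase: bytes) -> bytes:
    """Return the pass phrase unchanged if it is well-formed UTF-8."""
    try:
        passphrase.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("pass phrase is not valid UTF-8") from exc
    return passphrase


checked = require_utf8(b"na\xc3\xafve")  # accepted

try:
    require_utf8(b"na\xefve")            # ISO-8859-1 bytes, rejected
    rejected = False
except ValueError:
    rejected = True
```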

RECOMMENDATIONS
This section assumes that you know what pass phrase was used for
encryption, but that it may have been encoded in a different character
encoding than the one used by your current input method. For example,
the pass phrase may have been used at a time when your default encoding
was ISO-8859-1 (so that "naïve" resulted in the byte sequence 0x6E 0x61
0xEF 0x76 0x65), and you're now in an environment where your default
encoding is UTF-8 (so that "naïve" results in the byte sequence 0x6E
0x61 0xC3 0xAF 0x76 0x65). Whenever it's mentioned that you should use
a certain character encoding, it should be understood that you either
change the input method to use the mentioned encoding when you type in
your pass phrase, or use some suitable tool to convert your pass phrase
from your default encoding to the target encoding.
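As one example of such a conversion, the Python sketch below
transcodes a pass phrase between two named encodings. The helper name
transcode is hypothetical; the encoding names are the ones Python's
codecs module accepts.

```python
def transcode(passphrase: bytes, source: str, target: str) -> bytes:
    # Decode from the encoding the bytes are currently in, then encode
    # into the encoding that was used when the object was created.
    return passphrase.decode(source).encode(target)


typed_now = bytes([0x6E, 0x61, 0xC3, 0xAF, 0x76, 0x65])  # UTF-8 input
as_latin1 = transcode(typed_now, "utf-8", "iso-8859-1")
```

The conversion raises UnicodeEncodeError if the pass phrase contains a
character that the target encoding cannot represent, in which case the
target encoding cannot have been the one originally used.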

Also note that the sub-sections below discuss human readable pass
phrases. This is particularly relevant for PKCS#12 objects, where
human readable pass phrases are assumed. For other objects, it's as
legitimate to use any byte sequence (such as a sequence of bytes from
`/dev/urandom` that's been saved away), which makes any character
encoding discussion irrelevant; in such cases, simply use the same byte
sequence as it is.

Creating new objects
For creating new pass phrase protected objects, make sure the pass
phrase is encoded using UTF-8. This is default on most modern Unixes,
but may involve an effort on other platforms. Specifically for
Windows, setting the environment variable "OPENSSL_WIN32_UTF8" will
have anything entered on [Windows] console prompt converted to UTF-8
(command line and separately prompted pass phrases alike).

Opening existing objects
For opening pass phrase protected objects where you know what character
encoding was used for the encryption pass phrase, make sure to use the
same encoding again.

For opening pass phrase protected objects where the character encoding
that was used is unknown, or where the producing application is
unknown, try one of the following:

1. Try the pass phrase that you have as it is in the character
   encoding of your environment. It's possible that its byte sequence
   is exactly right.

2. Convert the pass phrase to UTF-8 and try with the result.
   Specifically with PKCS#12, this should open up any object that was
   created according to the specification.

3. Do a naïve (i.e. purely mathematical) ISO-8859-1 to UTF-8
   conversion and try with the result. This differs from the previous
   attempt because ISO-8859-1 maps directly to U+0000 to U+00FF, which
   other non-UTF-8 character sets do not.

   This also takes care of the case when a UTF-8 encoded string was
   used with OpenSSL older than 1.1.0 (for example, "ï", which is
   0xC3 0xAF when encoded in UTF-8, would become 0xC3 0x83 0xC2 0xAF
   when re-encoded in the naïve manner. The conversion to BMPString
   would then yield 0x00 0xC3 0x00 0xAF 0x00 0x00, the
   erroneous/non-compliant encoding used by OpenSSL older than 1.1.0).
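The three attempts can be generated mechanically. The following Python
sketch builds the candidate byte sequences in the order listed above;
candidate_passphrases and its parameters are hypothetical names, and
the caller is assumed to know the encoding of the current environment.

```python
def candidate_passphrases(passphrase: str, env_encoding: str) -> list:
    """Return the byte sequences to try, in order, without duplicates."""
    as_is = passphrase.encode(env_encoding)            # attempt 1
    utf8 = passphrase.encode("utf-8")                  # attempt 2
    # Attempt 3: map every byte 0xNN of the as-is form to U+00NN, then
    # encode the result as UTF-8.
    naive = as_is.decode("iso-8859-1").encode("utf-8")

    candidates = []
    for candidate in (as_is, utf8, naive):
        if candidate not in candidates:
            candidates.append(candidate)
    return candidates


tries = candidate_passphrases("\u00ef", "utf-8")
# BMPString characters that attempt 3 leads to (terminator not included).
legacy_bmp = tries[-1].decode("utf-8").encode("utf-16-be")
```

In a UTF-8 environment attempts 1 and 2 coincide, so only two
candidates remain; the last one reproduces the byte sequence 0xC3 0x83
0xC2 0xAF from the example above.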

SEE ALSO
evp(7), ossl_store(7), EVP_BytesToKey(3), EVP_DecryptInit(3),
PEM_do_header(3), PKCS12_parse(3), PKCS12_newpass(3),
d2i_PKCS8PrivateKey_bio(3)

COPYRIGHT
Copyright 2018-2020 The OpenSSL Project Authors. All Rights Reserved.

Licensed under the OpenSSL license (the "License"). You may not use
this file except in compliance with the License. You can obtain a copy
in the file LICENSE in the source distribution or at
<https://www.openssl.org/source/license.html>.

1.1.1l                         2021-09-15             PASSPHRASE-ENCODING(7)