curl_url_get(3)                     libcurl                    curl_url_get(3)

curl_url_get - extract a part from a URL

    #include <curl/curl.h>

    CURLUcode curl_url_get(CURLU *url,
                           CURLUPart what,
                           char **part,
                           unsigned int flags)

Given the url handle of an already parsed URL, this function lets the
user extract individual pieces from it.

The what argument should be the particular part to extract (see list
below) and part points to a 'char *' to get updated to point to a newly
allocated string with the contents.

The URL API has no particular maximum length for URL fields. In the
real world, excessively long fields in URLs will cause problems even if
this API accepts them. This function can return very large ones.

The flags argument is a bitmask with individual features.

The returned part pointer must be freed with curl_free(3) after use.

The flags argument is zero, one or more bits set in a bitmask.

CURLU_DEFAULT_PORT
       If the handle has no port stored, this option will make
       curl_url_get(3) return the default port for the used scheme.

CURLU_DEFAULT_SCHEME
       If the handle has no scheme stored, this option will make
       curl_url_get(3) return the default scheme instead of an error.

CURLU_NO_DEFAULT_PORT
       Instructs curl_url_get(3) to not return a port number if it
       matches the default port for the scheme.

CURLU_URLDECODE
       Asks curl_url_get(3) to URL decode the contents before returning
       it. It will not attempt to decode the scheme, the port number or
       the full URL.

       The query component will also get plus-to-space conversion as a
       bonus when this bit is set.

       Note that this URL decoding is charset unaware and you will get
       a zero terminated string back with data that could be intended
       for a particular encoding.

       If there are any byte values lower than 32 in the decoded string,
       the get operation will return an error instead.

CURLU_URLENCODE
       If set, will make curl_url_get(3) URL encode the host name part
       when a full URL is retrieved. If not set (default), libcurl
       returns the URL with the host name "raw" to support IDN names to
       appear as-is. IDN host names typically use non-ASCII bytes that
       would otherwise be percent-encoded.

       Note that even when not asking for URL encoding, the '%' (byte
       37) will be URL encoded to make sure the host name remains
       valid.

CURLU_PUNYCODE
       If set and CURLU_URLENCODE is not set, and asked to retrieve the
       CURLUPART_HOST or CURLUPART_URL parts, libcurl returns the host
       name in its punycode version if it contains any non-ASCII octets
       (and is an IDN name).

       If libcurl is built without IDN capabilities, using this bit
       will make curl_url_get(3) return CURLUE_LACKS_IDN if the host
       name contains anything outside the ASCII range.

       (Added in curl 7.88.0)

CURLUPART_URL
       When asked to return the full URL, curl_url_get(3) will return a
       normalized and possibly cleaned up version of what was
       previously parsed.

       We advise using the CURLU_PUNYCODE option to get the URL as
       "normalized" as possible since IDN allows host names to be
       written in many different ways that still end up the same
       punycode version.

CURLUPART_SCHEME
       Scheme cannot be URL decoded on get.

CURLUPART_USER

CURLUPART_PASSWORD

CURLUPART_OPTIONS
       The options field is an optional field that might follow the
       password in the userinfo part. It is only recognized/used when
       parsing URLs for the following schemes: pop3, smtp and imap. The
       URL API still allows users to set and get this field
       independently of scheme when not parsing full URLs.

CURLUPART_HOST
       The host name. If it is an IPv6 numeric address, the zone id
       will not be part of it but is provided separately in
       CURLUPART_ZONEID. IPv6 numerical addresses are returned within
       brackets ([]).

       IPv6 names are normalized when set, which should make them as
       short as possible while maintaining correct syntax.

CURLUPART_ZONEID
       If the host name is a numeric IPv6 address, this field might
       also be set.

CURLUPART_PORT
       A port cannot be URL decoded on get. This number is returned in
       a string just like all other parts. That string is guaranteed to
       hold a valid port number in ASCII using base 10.

CURLUPART_PATH
       The part will be '/' even if no path is supplied in the URL. A
       URL path always starts with a slash.

CURLUPART_QUERY
       The initial question mark that denotes the beginning of the
       query part is a delimiter only. It is not part of the query
       contents.

       A not-present query will lead part to be set to NULL. A zero-
       length query will lead part to be set to a zero-length string.

       The query part will also get pluses converted to space when
       asked to URL decode on get with the CURLU_URLDECODE bit.

CURLUPART_FRAGMENT
       The initial hash sign that denotes the beginning of the fragment
       is a delimiter only. It is not part of the fragment contents.

    #include <stdio.h>
    #include <curl/curl.h>

    int main(void)
    {
      CURLUcode rc;
      CURLU *url = curl_url();   /* returns NULL if out of memory */
      if(!url)
        return 1;
      rc = curl_url_set(url, CURLUPART_URL, "https://example.com", 0);
      if(!rc) {
        char *scheme;
        rc = curl_url_get(url, CURLUPART_SCHEME, &scheme, 0);
        if(!rc) {
          printf("the scheme is %s\n", scheme);
          curl_free(scheme);     /* the returned part must be freed */
        }
      }
      curl_url_cleanup(url);     /* free the handle even if parsing failed */
      return (int)rc;
    }

Added in 7.62.0. CURLUPART_ZONEID was added in 7.65.0.

Returns a CURLUcode error value, which is CURLUE_OK (0) if everything
went fine. See the libcurl-errors(3) man page for the full list with
descriptions.

If this function returns an error, no URL part is returned.

curl_url_cleanup(3), curl_url(3), curl_url_set(3), curl_url_dup(3),
curl_url_strerror(3), CURLOPT_CURLU(3)

libcurl 8.2.1                    April 26, 2023                curl_url_get(3)