curl_url_get(3)                 libcurl Manual                 curl_url_get(3)

NAME

       curl_url_get - extract a part from a URL

SYNOPSIS

       #include <curl/curl.h>

       CURLUcode curl_url_get(CURLU *url,
                              CURLUPart what,
                              char **part,
                              unsigned int flags);

DESCRIPTION

       Given the url handle of an already parsed URL, this function lets the
       user extract individual pieces from it.

       The what argument should be the particular part to extract (see list
       below) and part points to a 'char *' that is updated to point to a
       newly allocated string holding the contents.

       The URL API has no particular maximum length for URL fields. In the
       real world, excessively long fields in URLs will cause problems even
       if this API accepts them. This function can return very large ones.

       The flags argument is a bitmask with individual features.

       The returned part pointer must be freed with curl_free(3) after use.

FLAGS

       The flags argument is zero, one or more bits set in a bitmask.

       CURLU_DEFAULT_PORT
              If the handle has no port stored, this option makes
              curl_url_get(3) return the default port for the used scheme.

       CURLU_DEFAULT_SCHEME
              If the handle has no scheme stored, this option makes
              curl_url_get(3) return the default scheme instead of an error.

       CURLU_NO_DEFAULT_PORT
              Instructs curl_url_get(3) to not return a port number if it
              matches the default port for the scheme.

       CURLU_URLDECODE
              Asks curl_url_get(3) to URL decode the contents before
              returning it. It will not attempt to decode the scheme, the
              port number or the full URL.

              The query component also gets plus-to-space conversion when
              this bit is set.

              Note that this URL decoding is charset unaware and you will get
              a zero terminated string back with data that could be intended
              for a particular encoding.

              If there are any byte values lower than 32 in the decoded
              string, the get operation returns an error instead.

       CURLU_URLENCODE
              If set, makes curl_url_get(3) URL encode the host name part
              when a full URL is retrieved. If not set (default), libcurl
              returns the URL with the host name "raw" so that IDN names
              appear as-is. IDN host names typically use non-ASCII bytes that
              would otherwise be percent-encoded.

              Note that even when not asking for URL encoding, the '%' (byte
              37) will be URL encoded to make sure the host name remains
              valid.

       CURLU_PUNYCODE
              If set and CURLU_URLENCODE is not set, and asked to retrieve
              the CURLUPART_HOST or CURLUPART_URL parts, libcurl returns the
              host name in its punycode version if it contains any non-ASCII
              octets (and is an IDN name).

              If libcurl is built without IDN capabilities, using this bit
              makes curl_url_get(3) return CURLUE_LACKS_IDN if the host name
              contains anything outside the ASCII range.

              (Added in curl 7.88.0)

PARTS

       CURLUPART_URL
              When asked to return the full URL, curl_url_get(3) will return
              a normalized and possibly cleaned up version of what was
              previously parsed.

              We advise using the CURLU_PUNYCODE option to get the URL as
              "normalized" as possible since IDN allows host names to be
              written in many different ways that still end up the same
              punycode version.

       CURLUPART_SCHEME
              Scheme cannot be URL decoded on get.

       CURLUPART_USER

       CURLUPART_PASSWORD

       CURLUPART_OPTIONS
              The options field is an optional field that might follow the
              password in the userinfo part. It is only recognized/used when
              parsing URLs for the following schemes: pop3, smtp and imap.
              The URL API still allows users to set and get this field
              independently of scheme when not parsing full URLs.

       CURLUPART_HOST
              The host name. If it is an IPv6 numeric address, the zone id
              will not be part of it but is provided separately in
              CURLUPART_ZONEID. IPv6 numerical addresses are returned within
              brackets ([]).

              IPv6 names are normalized when set, which should make them as
              short as possible while maintaining correct syntax.

       CURLUPART_ZONEID
              If the host name is a numeric IPv6 address, this field might
              also be set.

       CURLUPART_PORT
              A port cannot be URL decoded on get. This number is returned in
              a string just like all other parts. That string is guaranteed
              to hold a valid port number in ASCII using base 10.

       CURLUPART_PATH
              The part will be '/' even if no path is supplied in the URL. A
              URL path always starts with a slash.

       CURLUPART_QUERY
              The initial question mark that denotes the beginning of the
              query part is a delimiter only. It is not part of the query
              contents.

              A not-present query will lead part to be set to NULL. A zero-
              length query will lead part to be set to a zero-length string.

              The query part will also get pluses converted to space when
              asked to URL decode on get with the CURLU_URLDECODE bit.

       CURLUPART_FRAGMENT
              The initial hash sign that denotes the beginning of the
              fragment is a delimiter only. It is not part of the fragment
              contents.

EXAMPLE

         CURLUcode rc;
         CURLU *url = curl_url();
         rc = curl_url_set(url, CURLUPART_URL, "https://example.com", 0);
         if(!rc) {
           char *scheme;
           rc = curl_url_get(url, CURLUPART_SCHEME, &scheme, 0);
           if(!rc) {
             printf("the scheme is %s\n", scheme);
             curl_free(scheme); /* the returned string is ours to free */
           }
         }
         /* release the handle whether or not parsing succeeded */
         curl_url_cleanup(url);

AVAILABILITY

       Added in 7.62.0. CURLUPART_ZONEID was added in 7.65.0.

RETURN VALUE

       Returns a CURLUcode error value, which is CURLUE_OK (0) if everything
       went fine. See the libcurl-errors(3) man page for the full list with
       descriptions.

       If this function returns an error, no URL part is returned.

SEE ALSO

       curl_url_cleanup(3), curl_url(3), curl_url_set(3), curl_url_dup(3),
       curl_url_strerror(3), CURLOPT_CURLU(3)

libcurl 8.0.1                   March 08, 2023                 curl_url_get(3)