libcurl-tutorial(3)          libcurl programming          libcurl-tutorial(3)



libcurl-tutorial - libcurl programming tutorial
7
9 This document attempts to describe the general principles and some
10 basic approaches to consider when programming with libcurl. The text
11 will focus mainly on the C interface but might apply fairly well on
12 other interfaces as well as they usually follow the C one pretty
13 closely.
14
15 This document will refer to 'the user' as the person writing the source
16 code that uses libcurl. That would probably be you or someone in your
17 position. What will be generally referred to as 'the program' will be
18 the collected source code that you write that is using libcurl for
19 transfers. The program is outside libcurl and libcurl is outside of the
20 program.
21
22 To get more details on all options and functions described herein,
23 please refer to their respective man pages.
24
25
27 There are many different ways to build C programs. This chapter will
28 assume a UNIX-style build process. If you use a different build system,
29 you can still read this to get general information that may apply to
30 your environment as well.
31
32 Compiling the Program
Your compiler needs to know where the libcurl headers are located.
Therefore you must set your compiler's include path to point to the
directory where you installed them. The 'curl-config'[3] tool can be
used to get this information:

    $ curl-config --cflags
39
40
41 Linking the Program with libcurl
Having compiled the program, you need to link your object files to
create a single executable. For that to succeed, you need to link with
libcurl and possibly also with other libraries that libcurl itself
depends on, such as the OpenSSL libraries; even some standard OS
libraries may be needed on the command line. To figure out which flags
to use, once again the 'curl-config' tool comes to the rescue:

    $ curl-config --libs
51
52
53 SSL or Not
libcurl can be built and customized in many ways. One of the things
that varies between libraries and builds is the support for SSL-based
transfers, like HTTPS and FTPS. If a supported SSL library was detected
properly at build-time, libcurl will be built with SSL support. To
figure out if an installed libcurl has been built with SSL support
enabled, use 'curl-config' like this:

    $ curl-config --feature

If SSL is supported, the keyword 'SSL' will be written to stdout,
possibly together with a few other features that could be either on or
off for different libcurl builds.
67
68 See also the "Features libcurl Provides" further down.
69
70 autoconf macro
When you write your configure script to detect libcurl and set up
variables accordingly, we offer a prewritten macro that probably does
everything you need in this area. See the docs/libcurl/libcurl.m4
file - it includes docs on how to use it.
76
77
The people behind libcurl have put considerable effort into making
libcurl work on a large number of different operating systems and
environments.
82
83 You program libcurl the same way on all platforms that libcurl runs on.
84 There are only very few minor considerations that differ. If you just
85 make sure to write your code portable enough, you may very well create
86 yourself a very portable program. libcurl shouldn't stop you from that.
87
88
90 The program must initialize some of the libcurl functionality globally.
91 That means it should be done exactly once, no matter how many times you
92 intend to use the library. Once for your program's entire life time.
93 This is done using
94
95 curl_global_init()
96
97 and it takes one parameter which is a bit pattern that tells libcurl
98 what to initialize. Using CURL_GLOBAL_ALL will make it initialize all
99 known internal sub modules, and might be a good default option. The
100 current two bits that are specified are:
101
102 CURL_GLOBAL_WIN32
which only does anything on Windows machines. When used on a Windows
machine, it'll make libcurl initialize the win32 socket stuff. Without
having that initialized properly, your program cannot use sockets
properly. You should only do this once for each application, so if
your program already does this or if another library in use does it,
you should not tell libcurl to do this as well.
110
111 CURL_GLOBAL_SSL
112 which only does anything on libcurls compiled and built
113 SSL-enabled. On these systems, this will make libcurl
114 initialize the SSL library properly for this application.
115 This only needs to be done once for each application so
116 if your program or another library already does this,
117 this bit should not be needed.
118
libcurl has a default protection mechanism that detects if
curl_global_init(3) hasn't been called by the time curl_easy_perform(3)
is called and if that is the case, libcurl runs the function itself
with a guessed bit pattern. Please note that relying solely on this
fallback is considered poor practice.
124
125 When the program no longer uses libcurl, it should call
126 curl_global_cleanup(3), which is the opposite of the init call. It will
127 then do the reversed operations to cleanup the resources the
128 curl_global_init(3) call initialized.
129
130 Repeated calls to curl_global_init(3) and curl_global_cleanup(3) should
131 be avoided. They should only be called once each.
132
133
135 It is considered best-practice to determine libcurl features at run-
136 time rather than at build-time (if possible of course). By calling
137 curl_version_info(3) and checking out the details of the returned
138 struct, your program can figure out exactly what the currently running
139 libcurl supports.
140
141
143 libcurl first introduced the so called easy interface. All operations
144 in the easy interface are prefixed with 'curl_easy'.
145
146 Recent libcurl versions also offer the multi interface. More about that
147 interface, what it is targeted for and how to use it is detailed in a
148 separate chapter further down. You still need to understand the easy
149 interface first, so please continue reading for better understanding.
150
151 To use the easy interface, you must first create yourself an easy han‐
152 dle. You need one handle for each easy session you want to perform.
153 Basically, you should use one handle for every thread you plan to use
154 for transferring. You must never share the same handle in multiple
155 threads.
156
157 Get an easy handle with
158
159 easyhandle = curl_easy_init();
160
It returns an easy handle. Using that you proceed to the next step:
setting up your preferred actions. A handle is just a logical entity
for the upcoming transfer or series of transfers.

You set properties and options for this handle using
curl_easy_setopt(3). They control how the subsequent transfer or
transfers will be made. Options remain set in the handle until set
again to something different. Consequently, multiple requests using the
same handle will use the same options.
170
171 Many of the options you set in libcurl are "strings", pointers to data
172 terminated with a zero byte. When you set strings with
173 curl_easy_setopt(3), libcurl makes its own copy so that they don't need
174 to be kept around in your application after being set[4].
175
176 One of the most basic properties to set in the handle is the URL. You
177 set your preferred URL to transfer with CURLOPT_URL in a manner similar
178 to:
179
    curl_easy_setopt(easyhandle, CURLOPT_URL, "http://domain.com/");
181
182 Let's assume for a while that you want to receive data as the URL iden‐
183 tifies a remote resource you want to get here. Since you write a sort
184 of application that needs this transfer, I assume that you would like
185 to get the data passed to you directly instead of simply getting it
186 passed to stdout. So, you write your own function that matches this
187 prototype:
188
    size_t write_data(void *buffer, size_t size, size_t nmemb,
                      void *userp);

You tell libcurl to pass all data to this function by calling a
function similar to this:

    curl_easy_setopt(easyhandle, CURLOPT_WRITEFUNCTION, write_data);
196
197 You can control what data your callback function gets in the fourth
198 argument by setting another property:
199
200 curl_easy_setopt(easyhandle, CURLOPT_WRITEDATA, &internal_struct);
201
202 Using that property, you can easily pass local data between your appli‐
203 cation and the function that gets invoked by libcurl. libcurl itself
204 won't touch the data you pass with CURLOPT_WRITEDATA.
205
206 libcurl offers its own default internal callback that will take care of
207 the data if you don't set the callback with CURLOPT_WRITEFUNCTION. It
208 will then simply output the received data to stdout. You can have the
209 default callback write the data to a different file handle by passing a
210 'FILE *' to a file opened for writing with the CURLOPT_WRITEDATA
211 option.
212
Now, we need to take a step back and take a deep breath. Here's one of
those rare platform-dependent nitpicks. Did you spot it? On some
platforms[2], libcurl won't be able to operate on files opened by the
program. Thus, if you use the default callback and pass in an open file
with CURLOPT_WRITEDATA, it will crash. You should therefore avoid this
to make your program run fine virtually everywhere.
219
220 (CURLOPT_WRITEDATA was formerly known as CURLOPT_FILE. Both names still
221 work and do the same thing).
222
If you're using libcurl as a win32 DLL, you MUST use
CURLOPT_WRITEFUNCTION if you set CURLOPT_WRITEDATA - or you will
experience crashes.
225
226 There are of course many more options you can set, and we'll get back
227 to a few of them later. Let's instead continue to the actual transfer:
228
229 success = curl_easy_perform(easyhandle);
230
231 curl_easy_perform(3) will connect to the remote site, do the necessary
232 commands and receive the transfer. Whenever it receives data, it calls
233 the callback function we previously set. The function may get one byte
234 at a time, or it may get many kilobytes at once. libcurl delivers as
235 much as possible as often as possible. Your callback function should
236 return the number of bytes it "took care of". If that is not the exact
237 same amount of bytes that was passed to it, libcurl will abort the
238 operation and return with an error code.
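
One possible shape for such a write callback is sketched below: it
appends every chunk to a growing memory buffer and reports how many
bytes it took care of. The struct memory type is an assumption made for
this example; only the callback prototype comes from the text above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; initialize to {NULL, 0} before use and
   free(data) when done. */
struct memory {
  char *data;
  size_t len;
};

/* Matches the write callback prototype. Appends each chunk to the
   buffer and keeps it zero-terminated. */
size_t write_data(void *buffer, size_t size, size_t nmemb, void *userp)
{
  size_t bytes = size * nmemb;
  struct memory *mem = (struct memory *)userp;
  char *grown = realloc(mem->data, mem->len + bytes + 1);

  if(!grown)
    return 0; /* fewer bytes than passed in: libcurl aborts the transfer */

  memcpy(grown + mem->len, buffer, bytes);
  mem->data = grown;
  mem->len += bytes;
  mem->data[mem->len] = '\0';
  return bytes; /* the number of bytes "taken care of" */
}
```

To use it, the program would pass write_data with CURLOPT_WRITEFUNCTION
and the address of a struct memory with CURLOPT_WRITEDATA before calling
curl_easy_perform(3).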
239
240 When the transfer is complete, the function returns a return code that
241 informs you if it succeeded in its mission or not. If a return code
242 isn't enough for you, you can use the CURLOPT_ERRORBUFFER to point
243 libcurl to a buffer of yours where it'll store a human readable error
244 message as well.
245
246 If you then want to transfer another file, the handle is ready to be
247 used again. Mind you, it is even preferred that you re-use an existing
248 handle if you intend to make another transfer. libcurl will then
249 attempt to re-use the previous connection.
250
251 For some protocols, downloading a file can involve a complicated
252 process of logging in, setting the transfer mode, changing the current
253 directory and finally transferring the file data. libcurl takes care of
254 all that complication for you. Given simply the URL to a file, libcurl
255 will take care of all the details needed to get the file moved from one
256 machine to another.
257
258
260 The first basic rule is that you must never share a libcurl handle (be
261 it easy or multi or whatever) between multiple threads. Only use one
262 handle in one thread at a time.
263
264 libcurl is completely thread safe, except for two issues: signals and
265 SSL/TLS handlers. Signals are used for timing out name resolves (during
266 DNS lookup) - when built without c-ares support and not on Windows.
267
268 If you are accessing HTTPS or FTPS URLs in a multi-threaded manner, you
269 are then of course using the underlying SSL library multi-threaded and
270 those libs might have their own requirements on this issue. Basically,
271 you need to provide one or two functions to allow it to function prop‐
272 erly. For all details, see this:
273
274 OpenSSL
275
276 http://www.openssl.org/docs/crypto/threads.html#DESCRIPTION
277
278 GnuTLS
279
http://www.gnu.org/software/gnutls/manual/html_node/Multi_002dthreaded-applications.html
282
283 NSS
284
285 is claimed to be thread-safe already without anything required.
286
287 yassl
288
289 Required actions unknown.
290
291 When using multiple threads you should set the CURLOPT_NOSIGNAL option
292 to 1 for all handles. Everything will or might work fine except that
293 timeouts are not honored during the DNS lookup - which you can work
294 around by building libcurl with c-ares support. c-ares is a library
295 that provides asynchronous name resolves. On some platforms, libcurl
296 simply will not function properly multi-threaded unless this option is
297 set.
298
299 Also, note that CURLOPT_DNS_USE_GLOBAL_CACHE is not thread-safe.
300
301
303 There will always be times when the transfer fails for some reason. You
304 might have set the wrong libcurl option or misunderstood what the
305 libcurl option actually does, or the remote server might return non-
306 standard replies that confuse the library which then confuses your pro‐
307 gram.
308
There's one golden rule when these things occur: set the
CURLOPT_VERBOSE option to 1. It'll cause the library to spew out the
entire protocol details it sends, some internal info and some received
protocol data as well (especially when using FTP). If you're using
HTTP, including the headers in the received output is also a clever way
to get a better understanding of why the server behaves the way it
does. Include headers in the normal body output by setting
CURLOPT_HEADER to 1.
316
317 Of course, there are bugs left. We need to know about them to be able
318 to fix them, so we're quite dependent on your bug reports! When you do
319 report suspected bugs in libcurl, please include as many details as you
320 possibly can: a protocol dump that CURLOPT_VERBOSE produces, library
321 version, as much as possible of your code that uses libcurl, operating
322 system name and version, compiler name and version etc.
323
If CURLOPT_VERBOSE is not enough, you can increase the level of debug
data your application receives by using CURLOPT_DEBUGFUNCTION.
326
327 Getting some in-depth knowledge about the protocols involved is never
328 wrong, and if you're trying to do funny things, you might very well
329 understand libcurl and how to use it better if you study the appropri‐
330 ate RFC documents at least briefly.
331
332
334 libcurl tries to keep a protocol independent approach to most trans‐
335 fers, thus uploading to a remote FTP site is very similar to uploading
336 data to a HTTP server with a PUT request.
337
Of course, first you either create an easy handle or you re-use an
existing one. Then you set the URL to operate on just like before. This
is the remote URL that we will now upload to.
341
342 Since we write an application, we most likely want libcurl to get the
343 upload data by asking us for it. To make it do that, we set the read
344 callback and the custom pointer libcurl will pass to our read callback.
345 The read callback should have a prototype similar to:
346
    size_t function(char *bufptr, size_t size, size_t nitems,
                    void *userp);
349
350 Where bufptr is the pointer to a buffer we fill in with data to upload
351 and size*nitems is the size of the buffer and therefore also the maxi‐
352 mum amount of data we can return to libcurl in this call. The 'userp'
353 pointer is the custom pointer we set to point to a struct of ours to
354 pass private data between the application and the callback.
355
356 curl_easy_setopt(easyhandle, CURLOPT_READFUNCTION, read_function);
357
358 curl_easy_setopt(easyhandle, CURLOPT_READDATA, &filedata);
359
360 Tell libcurl that we want to upload:
361
362 curl_easy_setopt(easyhandle, CURLOPT_UPLOAD, 1L);
363
364 A few protocols won't behave properly when uploads are done without any
365 prior knowledge of the expected file size. So, set the upload file size
366 using the CURLOPT_INFILESIZE_LARGE for all known file sizes like
367 this[1]:
368
    /* in this example, file_size must be a curl_off_t variable */
    curl_easy_setopt(easyhandle, CURLOPT_INFILESIZE_LARGE, file_size);
371
372 When you call curl_easy_perform(3) this time, it'll perform all the
373 necessary operations and when it has invoked the upload it'll call your
374 supplied callback to get the data to upload. The program should return
375 as much data as possible in every invoke, as that is likely to make the
376 upload perform as fast as possible. The callback should return the num‐
377 ber of bytes it wrote in the buffer. Returning 0 will signal the end of
378 the upload.
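
A read callback along these lines could look like the sketch below,
which feeds libcurl from a block of memory. The struct upload_source
type is an assumption made for this example; a real program might just
as well read from a file.

```c
#include <string.h>

/* Hypothetical upload source: a memory region and how much of it is
   still left to send. */
struct upload_source {
  const char *data;
  size_t remaining;
};

/* Matches the read callback prototype. Copies as much as fits into
   libcurl's buffer and returns the number of bytes written to it. */
size_t read_function(char *bufptr, size_t size, size_t nitems, void *userp)
{
  struct upload_source *src = (struct upload_source *)userp;
  size_t room = size * nitems;
  size_t chunk = src->remaining < room ? src->remaining : room;

  memcpy(bufptr, src->data, chunk);
  src->data += chunk;
  src->remaining -= chunk;
  return chunk; /* 0 once everything is sent, which ends the upload */
}
```

The address of a filled-in struct upload_source would be passed with
CURLOPT_READDATA, and its initial 'remaining' value is also what the
program would report with CURLOPT_INFILESIZE_LARGE.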
379
380
382 Many protocols use or even require that user name and password are pro‐
383 vided to be able to download or upload the data of your choice. libcurl
384 offers several ways to specify them.
385
386 Most protocols support that you specify the name and password in the
387 URL itself. libcurl will detect this and use them accordingly. This is
388 written like this:
389
390 protocol://user:password@example.com/path/
391
392 If you need any odd letters in your user name or password, you should
393 enter them URL encoded, as %XX where XX is a two-digit hexadecimal num‐
394 ber.
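
To illustrate what the %XX form means, here is a small stand-alone
sketch of percent-encoding. The function percent_encode is a
hypothetical helper written for this example, and its choice of which
characters to leave untouched (letters, digits and "-._~") is an
assumption. libcurl also ships curl_easy_escape(3), which is likely the
better choice in a real program.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: percent-encode `in` into `out`, which has room
   for `outlen` bytes. Letters, digits and "-._~" are copied as-is;
   every other byte becomes %XX. Returns 0 on success and -1 if `out`
   is too small. */
int percent_encode(const char *in, char *out, size_t outlen)
{
  size_t used = 0;

  for(; *in; in++) {
    unsigned char c = (unsigned char)*in;

    if(isalnum(c) || strchr("-._~", c)) {
      if(used + 2 > outlen)   /* one char plus the terminator */
        return -1;
      out[used++] = (char)c;
    }
    else {
      if(used + 4 > outlen)   /* "%XX" plus the terminator */
        return -1;
      snprintf(out + used, 4, "%%%02X", c);
      used += 3;
    }
  }
  if(used + 1 > outlen)
    return -1;
  out[used] = '\0';
  return 0;
}
```

With this sketch, a password such as "p@ss:w/rd" would be written as
"p%40ss%3Aw%2Frd" when embedded in a URL.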
395
396 libcurl also provides options to set various passwords. The user name
397 and password as shown embedded in the URL can instead get set with the
398 CURLOPT_USERPWD option. The argument passed to libcurl should be a char
399 * to a string in the format "user:password". In a manner like this:
400
401 curl_easy_setopt(easyhandle, CURLOPT_USERPWD, "myname:thesecret");
402
Another case where a name and password might be needed is for users who
need to authenticate themselves to a proxy they use. libcurl offers
another option for this, CURLOPT_PROXYUSERPWD. It is used much like the
CURLOPT_USERPWD option:

    curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD,
                     "myname:thesecret");
410
411 There's a long time UNIX "standard" way of storing ftp user names and
412 passwords, namely in the $HOME/.netrc file. The file should be made
413 private so that only the user may read it (see also the "Security Con‐
414 siderations" chapter), as it might contain the password in plain text.
415 libcurl has the ability to use this file to figure out what set of user
416 name and password to use for a particular host. As an extension to the
417 normal functionality, libcurl also supports this file for non-FTP pro‐
418 tocols such as HTTP. To make curl use this file, use the CURLOPT_NETRC
419 option:
420
421 curl_easy_setopt(easyhandle, CURLOPT_NETRC, 1L);
422
And a very basic example of what such a .netrc file may look like:

    machine myhost.mydomain.com
    login userlogin
    password secretword
428
429 All these examples have been cases where the password has been
430 optional, or at least you could leave it out and have libcurl attempt
431 to do its job without it. There are times when the password isn't
432 optional, like when you're using an SSL private key for secure trans‐
433 fers.
434
435 To pass the known private key password to libcurl:
436
437 curl_easy_setopt(easyhandle, CURLOPT_KEYPASSWD, "keypassword");
438
439
441 The previous chapter showed how to set user name and password for get‐
442 ting URLs that require authentication. When using the HTTP protocol,
443 there are many different ways a client can provide those credentials to
444 the server and you can control which way libcurl will (attempt to) use
445 them. The default HTTP authentication method is called 'Basic', which
446 is sending the name and password in clear-text in the HTTP request,
447 base64-encoded. This is insecure.
448
449 At the time of this writing, libcurl can be built to use: Basic,
450 Digest, NTLM, Negotiate, GSS-Negotiate and SPNEGO. You can tell libcurl
451 which one to use with CURLOPT_HTTPAUTH as in:
452
453 curl_easy_setopt(easyhandle, CURLOPT_HTTPAUTH, CURLAUTH_DIGEST);
454
455 And when you send authentication to a proxy, you can also set authenti‐
456 cation type the same way but instead with CURLOPT_PROXYAUTH:
457
458 curl_easy_setopt(easyhandle, CURLOPT_PROXYAUTH, CURLAUTH_NTLM);
459
460 Both these options allow you to set multiple types (by ORing them
461 together), to make libcurl pick the most secure one out of the types
462 the server/proxy claims to support. This method does however add a
463 round-trip since libcurl must first ask the server what it supports:
464
465 curl_easy_setopt(easyhandle, CURLOPT_HTTPAUTH,
466 CURLAUTH_DIGEST|CURLAUTH_BASIC);
467
468 For convenience, you can use the 'CURLAUTH_ANY' define (instead of a
469 list with specific types) which allows libcurl to use whatever method
470 it wants.
471
472 When asking for multiple types, libcurl will pick the available one it
473 considers "best" in its own internal order of preference.
474
475
We get many questions regarding how to issue HTTP POSTs with libcurl
the proper way. This chapter will thus include examples using both
versions of HTTP POST that libcurl supports.

The first version is the simple POST, the most common version, that
most HTML pages using the <form> tag use. We provide a pointer to the
data and tell libcurl to post it all to the remote site:

    char *data="name=daniel&project=curl";
    curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDS, data);
    curl_easy_setopt(easyhandle, CURLOPT_URL, "http://posthere.com/");

    curl_easy_perform(easyhandle); /* post away! */
490
491 Simple enough, huh? Since you set the POST options with the CUR‐
492 LOPT_POSTFIELDS, this automatically switches the handle to use POST in
493 the upcoming request.
494
Ok, so what if you want to post binary data that also requires you to
set the Content-Type: header of the post? Well, binary posts prevent
libcurl from being able to do strlen() on the data to figure out the
size, so we must tell libcurl the size of the post data. Setting
headers in libcurl requests is done in a generic way, by building a
list of our own headers and then passing that list to libcurl.

    struct curl_slist *headers=NULL;
    headers = curl_slist_append(headers, "Content-Type: text/xml");

    /* post binary data */
    curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDS, binaryptr);

    /* set the size of the postfields data */
    curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDSIZE, 23L);

    /* pass our list of custom made headers */
    curl_easy_setopt(easyhandle, CURLOPT_HTTPHEADER, headers);

    curl_easy_perform(easyhandle); /* post away! */

    curl_slist_free_all(headers); /* free the header list */
517
518 While the simple examples above cover the majority of all cases where
519 HTTP POST operations are required, they don't do multi-part formposts.
520 Multi-part formposts were introduced as a better way to post (possibly
521 large) binary data and were first documented in the RFC1867 (updated in
522 RFC2388). They're called multi-part because they're built by a chain of
523 parts, each part being a single unit of data. Each part has its own
524 name and contents. You can in fact create and post a multi-part form‐
525 post with the regular libcurl POST support described above, but that
526 would require that you build a formpost yourself and provide to
527 libcurl. To make that easier, libcurl provides curl_formadd(3). Using
528 this function, you add parts to the form. When you're done adding
529 parts, you post the whole form.
530
531 The following example sets two simple text parts with plain textual
532 contents, and then a file with binary contents and uploads the whole
533 thing.
534
    struct curl_httppost *post=NULL;
    struct curl_httppost *last=NULL;
    curl_formadd(&post, &last,
                 CURLFORM_COPYNAME, "name",
                 CURLFORM_COPYCONTENTS, "daniel", CURLFORM_END);
    curl_formadd(&post, &last,
                 CURLFORM_COPYNAME, "project",
                 CURLFORM_COPYCONTENTS, "curl", CURLFORM_END);
    curl_formadd(&post, &last,
                 CURLFORM_COPYNAME, "logotype-image",
                 CURLFORM_FILECONTENT, "curl.png", CURLFORM_END);

    /* Set the form info */
    curl_easy_setopt(easyhandle, CURLOPT_HTTPPOST, post);

    curl_easy_perform(easyhandle); /* post away! */

    /* free the post data again */
    curl_formfree(post);
554
Multipart formposts are chains of parts using MIME-style separators and
headers. This means that each of these separate parts gets a few
headers set that describe the individual content-type, size etc. To
let your application hand-craft this formpost even further, libcurl
allows you to supply your own set of custom headers to such an
individual form part. You can of course supply headers to as many parts
as you like, but this little example will show how you set headers to
one specific part when you add that to the post handle:

    struct curl_slist *headers=NULL;
    headers = curl_slist_append(headers, "Content-Type: text/xml");

    curl_formadd(&post, &last,
                 CURLFORM_COPYNAME, "logotype-image",
                 CURLFORM_FILECONTENT, "curl.xml",
                 CURLFORM_CONTENTHEADER, headers,
                 CURLFORM_END);

    curl_easy_perform(easyhandle); /* post away! */

    curl_formfree(post); /* free post */
    curl_slist_free_all(headers); /* free custom header list */
577
Since all options on an easyhandle are "sticky", they remain the same
until changed, even after you call curl_easy_perform(3). You may
therefore need to tell curl to go back to a plain GET request if you
intend to do one as your next request. You force an easyhandle to go
back to GET by using the CURLOPT_HTTPGET option:

    curl_easy_setopt(easyhandle, CURLOPT_HTTPGET, 1L);
585
586 Just setting CURLOPT_POSTFIELDS to "" or NULL will *not* stop libcurl
587 from doing a POST. It will just make it POST without any data to send!
588
589
For historical and traditional reasons, libcurl has a built-in progress
meter that can be switched on to display transfer progress in your
terminal.

Switch on the progress meter by, oddly enough, setting
CURLOPT_NOPROGRESS to zero. This option is set to 1 by default.

For most applications however, the built-in progress meter is useless
and what instead is interesting is the ability to specify a progress
callback. The function pointer you pass to libcurl will then be called
at irregular intervals with information about the current transfer.

Set the progress callback by using CURLOPT_PROGRESSFUNCTION, and pass a
pointer to a function that matches this prototype:

    int progress_callback(void *clientp,
                          double dltotal,
                          double dlnow,
                          double ultotal,
                          double ulnow);
611
612 If any of the input arguments is unknown, a 0 will be passed. The first
613 argument, the 'clientp' is the pointer you pass to libcurl with CUR‐
614 LOPT_PROGRESSDATA. libcurl won't touch it.
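
A progress callback matching that prototype might be sketched as
follows. The struct progress_state type and the choice to print a
percentage to stderr are assumptions made for this example. Note that a
non-zero return value from the callback makes libcurl abort the
transfer, so this sketch always returns 0.

```c
#include <stdio.h>

/* Hypothetical state passed with CURLOPT_PROGRESSDATA; set
   last_percent to -1 before the transfer starts. */
struct progress_state {
  int last_percent;
};

int progress_callback(void *clientp, double dltotal, double dlnow,
                      double ultotal, double ulnow)
{
  struct progress_state *state = (struct progress_state *)clientp;

  (void)ultotal; /* uploads are ignored in this sketch */
  (void)ulnow;

  if(dltotal > 0) { /* 0 means the total is not known */
    int percent = (int)(dlnow * 100.0 / dltotal);

    if(percent != state->last_percent) {
      state->last_percent = percent;
      fprintf(stderr, "download: %d%%\n", percent);
    }
  }
  return 0; /* keep going */
}
```

The callback only takes effect when CURLOPT_NOPROGRESS is set to zero,
in addition to setting CURLOPT_PROGRESSFUNCTION and
CURLOPT_PROGRESSDATA.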
615
616
618 There's basically only one thing to keep in mind when using C++ instead
619 of C when interfacing libcurl:
620
621 The callbacks CANNOT be non-static class member functions
622
623 Example C++ code:
624
    #include <cstddef>

    class AClass {
    public:
      static size_t write_data(void *ptr, size_t size, size_t nmemb,
                               void *ourpointer)
      {
        /* do what you want with the data */
        (void)ptr;
        (void)ourpointer;
        return size * nmemb; /* tell libcurl every byte was handled */
      }
    };
632
633
635 What "proxy" means according to Merriam-Webster: "a person authorized
636 to act for another" but also "the agency, function, or office of a
637 deputy who acts as a substitute for another".
638
639 Proxies are exceedingly common these days. Companies often only offer
640 Internet access to employees through their proxies. Network clients or
641 user-agents ask the proxy for documents, the proxy does the actual
642 request and then it returns them.
643
644 libcurl supports SOCKS and HTTP proxies. When a given URL is wanted,
645 libcurl will ask the proxy for it instead of trying to connect to the
646 actual host identified in the URL.
647
648 If you're using a SOCKS proxy, you may find that libcurl doesn't quite
649 support all operations through it.
650
For HTTP proxies: the fact that the proxy is a HTTP proxy puts certain
restrictions on what can actually happen. A requested URL that might
not be a HTTP URL will still be passed to the HTTP proxy to deliver
back to libcurl. This happens transparently, and an application may not
need to know. I say "may", because at times it is very important to
understand that all operations over a HTTP proxy use the HTTP protocol.
For example, you can't invoke your own custom FTP commands or even
proper FTP directory listings.
659
660
661 Proxy Options
662
To tell libcurl to use a proxy at a given port number:

    curl_easy_setopt(easyhandle, CURLOPT_PROXY, "proxy-host.com:8080");

Some proxies require user authentication before allowing a request, and
you pass that information similar to this:

    curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD, "user:password");

If you want to, you can specify the host name only in the CURLOPT_PROXY
option, and set the port number separately with CURLOPT_PROXYPORT.

Tell libcurl what kind of proxy it is with CURLOPT_PROXYTYPE (if not,
it will default to assume a HTTP proxy):

    curl_easy_setopt(easyhandle, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS4);
683
684
685 Environment Variables
686
687 libcurl automatically checks and uses a set of environment vari‐
688 ables to know what proxies to use for certain protocols. The
689 names of the variables are following an ancient de facto stan‐
690 dard and are built up as "[protocol]_proxy" (note the lower cas‐
691 ing). Which makes the variable 'http_proxy' checked for a name
692 of a proxy to use when the input URL is HTTP. Following the same
693 rule, the variable named 'ftp_proxy' is checked for FTP URLs.
694 Again, the proxies are always HTTP proxies, the different names
695 of the variables simply allows different HTTP proxies to be
696 used.
697
698 The proxy environment variable contents should be in the format
699 "[protocol://][user:password@]machine[:port]". Where the proto‐
700 col:// part is simply ignored if present (so http://proxy and
701 bluerk://proxy will do the same) and the optional port number
702 specifies on which port the proxy operates on the host. If not
703 specified, the internal default port number will be used and
704 that is most likely *not* the one you would like it to be.
705
706 There are two special environment variables. 'all_proxy' is what
707 sets proxy for any URL in case the protocol specific variable
708 wasn't set, and 'no_proxy' defines a list of hosts that should
709 not use a proxy even though a variable may say so. If 'no_proxy'
710 is a plain asterisk ("*") it matches all hosts.
711
712 To explicitly disable libcurl's checking for and using the proxy
713 environment variables, set the proxy name to "" - an empty
714 string - with CURLOPT_PROXY.
715
716 SSL and Proxies
717
718 SSL is for secure point-to-point connections. This involves
719 strong encryption and similar things, which effectively makes it
720 impossible for a proxy to operate as a "man in between" which
721 the proxy's task is, as previously discussed. Instead, the only
722 way to have SSL work over a HTTP proxy is to ask the proxy to
723 tunnel trough everything without being able to check or fiddle
724 with the traffic.
725
Opening an SSL connection over an HTTP proxy is therefore a
matter of asking the proxy for a straight connection to the
target host on a specified port. This is done with the HTTP
request CONNECT ("please mr proxy, connect me to that remote
host").
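
To make the CONNECT step concrete, the following sketch formats
roughly what a client sends to the proxy to open a tunnel. The
function name format_connect_request is hypothetical, and the request
libcurl actually sends may use a different HTTP version and carry
further headers (proxy authentication, for instance).

```c
#include <stdio.h>

/* Format a minimal CONNECT request for host:port into buf.
 * Returns the number of bytes written, or -1 if buf is too small.
 * (Illustrative helper, not part of libcurl.) */
int format_connect_request(char *buf, size_t buflen,
                           const char *host, unsigned int port)
{
  int n = snprintf(buf, buflen,
                   "CONNECT %s:%u HTTP/1.1\r\n"
                   "Host: %s:%u\r\n"
                   "\r\n",
                   host, port, host, port);

  if(n < 0 || (size_t)n >= buflen)
    return -1;
  return n;
}
```

Once the proxy answers with a success status, the bytes exchanged on
that connection are passed along untouched, which is why the proxy
can no longer inspect or cache anything.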
730
Because of the nature of this operation, where the proxy has no
idea what kind of data is passed in and out through this tunnel,
this breaks some of the very few advantages that come from using
a proxy, such as caching. Many organizations prevent this kind of
tunneling to destination port numbers other than 443 (which is
the default HTTPS port number).
737
738
739 Tunneling Through Proxy
As explained above, tunneling is required for SSL to work, and
proxies often restrict it to the operation intended for SSL:
HTTPS.
742
743 This is however not the only time proxy-tunneling might offer
744 benefits to you or your application.
745
As tunneling opens a direct connection from your application to
the remote machine, it suddenly also re-introduces the ability
to do non-HTTP operations over an HTTP proxy. You can in fact
use things such as FTP upload or FTP custom commands this way.
750
751 Again, this is often prevented by the administrators of proxies
752 and is rarely allowed.
753
754 Tell libcurl to use proxy tunneling like this:
755
756 curl_easy_setopt(easyhandle, CURLOPT_HTTPPROXYTUNNEL, 1L);
757
758 In fact, there might even be times when you want to do plain
759 HTTP operations using a tunnel like this, as it then enables you
760 to operate on the remote server instead of asking the proxy to
761 do so. libcurl will not stand in the way for such innovative
762 actions either!
763
764
765 Proxy Auto-Config
766
767 Netscape first came up with this. It is basically a web page
768 (usually using a .pac extension) with a Javascript that when
769 executed by the browser with the requested URL as input, returns
770 information to the browser on how to connect to the URL. The
771 returned information might be "DIRECT" (which means no proxy
772 should be used), "PROXY host:port" (to tell the browser where
773 the proxy for this particular URL is) or "SOCKS host:port" (to
774 direct the browser to a SOCKS proxy).
775
libcurl has no means to interpret or evaluate Javascript and
thus it doesn't support this. If you get yourself in a position
where you face this nasty invention, the following advice has
been mentioned and used in the past:
780
781 - Depending on the Javascript complexity, write up a script that
782 translates it to another language and execute that.
783
784 - Read the Javascript code and rewrite the same logic in another
785 language.
786
787 - Implement a Javascript interpreter; people have successfully
788 used the Mozilla Javascript engine in the past.
789
790 - Ask your admins to stop this, for a static proxy setup or sim‐
791 ilar.
792
793
795 Re-cycling the same easy handle several times when doing multiple
796 requests is the way to go.
797
798 After each single curl_easy_perform(3) operation, libcurl will keep the
799 connection alive and open. A subsequent request using the same easy
800 handle to the same host might just be able to use the already open con‐
801 nection! This reduces network impact a lot.
802
Even if the connection is dropped, later SSL connections to the same
host will benefit from libcurl's session ID cache, which drastically
reduces re-connection time.

FTP connections that are kept alive save a lot of time, as the command-
response round-trips are skipped. You also don't risk being refused
when logging in again, which can happen on the many FTP servers that
only allow N persons to be logged in at the same time.
811
812 libcurl caches DNS name resolving results, to make lookups of a previ‐
813 ously looked up name a lot faster.
814
815 Other interesting details that improve performance for subsequent
816 requests may also be added in the future.
817
818 Each easy handle will attempt to keep the last few connections alive
819 for a while in case they are to be used again. You can set the size of
820 this "cache" with the CURLOPT_MAXCONNECTS option. Default is 5. There
821 is very seldom any point in changing this value, and if you think of
822 changing this it is often just a matter of thinking again.
823
To force your upcoming request to not use an already existing
connection (libcurl will even close one first if there happens to be
one alive to the same host you're about to operate on), set
CURLOPT_FRESH_CONNECT to 1. In a similar spirit, you can prevent the
connection used by the upcoming request from "lying" around and
possibly getting re-used afterwards by setting CURLOPT_FORBID_REUSE
to 1.
830
831
833 When you use libcurl to do HTTP requests, it'll pass along a series of
834 headers automatically. It might be good for you to know and understand
835 these. You can replace or remove them by using the CURLOPT_HTTPHEADER
836 option.
837
838
839 Host This header is required by HTTP 1.1 and even many 1.0 servers
840 and should be the name of the server we want to talk to. This
841 includes the port number if anything but default.
842
843
844 Pragma "no-cache". Tells a possible proxy to not grab a copy from the
845 cache but to fetch a fresh one.
846
847
848 Accept "*/*".
849
850
851 Expect When doing POST requests, libcurl sets this header to "100-con‐
852 tinue" to ask the server for an "OK" message before it proceeds
853 with sending the data part of the post. If the POSTed data
854 amount is deemed "small", libcurl will not use this header.
855
856
858 There is an ongoing development today where more and more protocols are
859 built upon HTTP for transport. This has obvious benefits as HTTP is a
860 tested and reliable protocol that is widely deployed and has excellent
861 proxy-support.
862
863 When you use one of these protocols, and even when doing other kinds of
864 programming you may need to change the traditional HTTP (or FTP or...)
865 manners. You may need to change words, headers or various data.
866
867 libcurl is your friend here too.
868
869
870 CUSTOMREQUEST
871 If just changing the actual HTTP request keyword is what you
872 want, like when GET, HEAD or POST is not good enough for you,
873 CURLOPT_CUSTOMREQUEST is there for you. It is very simple to
874 use:
875
    curl_easy_setopt(easyhandle, CURLOPT_CUSTOMREQUEST, "MYOWNREQUEST");
878
879 When using the custom request, you change the request keyword of
880 the actual request you are performing. Thus, by default you make
881 a GET request but you can also make a POST operation (as
882 described before) and then replace the POST keyword if you want
883 to. You're the boss.
884
885
886 Modify Headers
887 HTTP-like protocols pass a series of headers to the server when
888 doing the request, and you're free to pass any amount of extra
889 headers that you think fit. Adding headers is this easy:
890
891 struct curl_slist *headers=NULL; /* init to NULL is important */
892
893 headers = curl_slist_append(headers, "Hey-server-hey: how are you?");
894 headers = curl_slist_append(headers, "X-silly-content: yes");
895
896 /* pass our list of custom made headers */
897 curl_easy_setopt(easyhandle, CURLOPT_HTTPHEADER, headers);
898
899 curl_easy_perform(easyhandle); /* transfer http */
900
901 curl_slist_free_all(headers); /* free the header list */
902
903 ... and if you think some of the internally generated headers,
904 such as Accept: or Host: don't contain the data you want them to
905 contain, you can replace them by simply setting them too:
906
907 headers = curl_slist_append(headers, "Accept: Agent-007");
908 headers = curl_slist_append(headers, "Host: munged.host.line");
909
910
911 Delete Headers
912 If you replace an existing header with one with no contents, you
913 will prevent the header from being sent. For instance, if you
914 want to completely prevent the "Accept:" header from being sent,
915 you can disable it with code similar to this:
916
917 headers = curl_slist_append(headers, "Accept:");
918
919 Both replacing and canceling internal headers should be done
920 with careful consideration and you should be aware that you may
921 violate the HTTP protocol when doing so.
922
923
924 Enforcing chunked transfer-encoding
925
926 By making sure a request uses the custom header "Transfer-Encod‐
927 ing: chunked" when doing a non-GET HTTP operation, libcurl will
928 switch over to "chunked" upload, even though the size of the
929 data to upload might be known. By default, libcurl usually
930 switches over to chunked upload automatically if the upload data
931 size is unknown.
932
933
934 HTTP Version
935
All HTTP requests include the version number to tell the server
which version we support. libcurl speaks HTTP 1.1 by default.
Some very old servers don't like getting 1.1-requests and when
dealing with stubborn old things like that, you can tell libcurl
to use 1.0 instead by doing something like this:
941
942 curl_easy_setopt(easyhandle, CURLOPT_HTTP_VERSION,
943 CURL_HTTP_VERSION_1_0);
944
945
946 FTP Custom Commands
947
Not all protocols are HTTP-like, and thus the above may not help
you when you want to make, for example, your FTP transfers
behave differently.

Sending custom commands to an FTP server means that you need to
send the commands exactly as the FTP server expects them (RFC 959
is a good guide here), and you can only use commands that work
on the control-connection alone. All kinds of commands that
require data interchange and thus need a data-connection must be
left to libcurl's own judgement. Also be aware that libcurl will
do its very best to change directory to the target directory
before doing any transfer, so if you change directory (with CWD
or similar) you might confuse libcurl and then it might not
attempt to transfer the file in the correct remote directory.
962
963 A little example that deletes a given file before an operation:
964
965 headers = curl_slist_append(headers, "DELE file-to-remove");
966
967 /* pass the list of custom commands to the handle */
968 curl_easy_setopt(easyhandle, CURLOPT_QUOTE, headers);
969
970 curl_easy_perform(easyhandle); /* transfer ftp data! */
971
972 curl_slist_free_all(headers); /* free the header list */
973
If you instead want this operation (or chain of operations) to
happen _after_ the data transfer took place, use the
curl_easy_setopt(3) option CURLOPT_POSTQUOTE in the exact same
way.

The custom FTP commands will be issued to the server in the same
order they are added to the list, and if a command gets an error
code returned back from the server, no more commands will be
issued and libcurl will bail out with an error code
(CURLE_QUOTE_ERROR). Note that if you use CURLOPT_QUOTE to send
commands before a transfer, no transfer will actually take place
when a quote command has failed.
986
987 If you set the CURLOPT_HEADER to 1, you will tell libcurl to get
988 information about the target file and output "headers" about it.
989 The headers will be in "HTTP-style", looking like they do in
990 HTTP.
991
992 The option to enable headers or to run custom FTP commands may
993 be useful to combine with CURLOPT_NOBODY. If this option is set,
994 no actual file content transfer will be performed.
995
996
997 FTP Custom CUSTOMREQUEST
If you do want to list the contents of an FTP directory using
your own defined FTP command, CURLOPT_CUSTOMREQUEST will do just
1000 that. "NLST" is the default one for listing directories but
1001 you're free to pass in your idea of a good alternative.
1002
1003
1005 In the HTTP sense, a cookie is a name with an associated value. A
1006 server sends the name and value to the client, and expects it to get
1007 sent back on every subsequent request to the server that matches the
1008 particular conditions set. The conditions include that the domain name
1009 and path match and that the cookie hasn't become too old.
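
The three conditions (domain, path, age) can be written down as a
small predicate. This is a deliberately simplified sketch, not
libcurl's cookie engine: the struct cookie layout and the function
names are invented for the example, host names are assumed to be
lower case already, and the path test is a plain prefix comparison.

```c
#include <string.h>
#include <time.h>

/* Hypothetical, stripped-down cookie record for illustration. */
struct cookie {
  const char *domain;   /* e.g. "example.com" */
  const char *path;     /* e.g. "/account" */
  time_t expires;       /* 0 means a session cookie with no expiry */
};

/* Return 1 if host equals domain or is a subdomain of it. */
static int domain_matches(const char *host, const char *domain)
{
  size_t hlen = strlen(host);
  size_t dlen = strlen(domain);

  if(hlen == dlen)
    return strcmp(host, domain) == 0;
  if(hlen > dlen)
    return host[hlen - dlen - 1] == '.' &&
           strcmp(host + hlen - dlen, domain) == 0;
  return 0;
}

/* Return 1 if the cookie should be sent with a request for
 * host + path at time now. */
int cookie_applies(const struct cookie *c, const char *host,
                   const char *path, time_t now)
{
  if(c->expires != 0 && c->expires <= now)
    return 0;                           /* the cookie is too old */
  if(!domain_matches(host, c->domain))
    return 0;
  /* simplification: treat the cookie path as a plain prefix */
  return strncmp(path, c->path, strlen(c->path)) == 0;
}
```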
1010
In real-world cases, servers send new cookies to replace existing ones
to update them. Servers use cookies to "track" users and to keep
"sessions".
1014
1015 Cookies are sent from server to clients with the header Set-Cookie: and
1016 they're sent from clients to servers with the Cookie: header.
1017
To just send whatever cookie you want to a server, you can use
CURLOPT_COOKIE to set a cookie string like this:

    curl_easy_setopt(easyhandle, CURLOPT_COOKIE, "name1=var1; name2=var2;");
1023
1024 In many cases, that is not enough. You might want to dynamically save
1025 whatever cookies the remote server passes to you, and make sure those
1026 cookies are then used accordingly on later requests.
1027
1028 One way to do this, is to save all headers you receive in a plain file
1029 and when you make a request, you tell libcurl to read the previous
1030 headers to figure out which cookies to use. Set the header file to read
1031 cookies from with CURLOPT_COOKIEFILE.
1032
1033 The CURLOPT_COOKIEFILE option also automatically enables the cookie
1034 parser in libcurl. Until the cookie parser is enabled, libcurl will not
1035 parse or understand incoming cookies and they will just be ignored.
1036 However, when the parser is enabled the cookies will be understood and
1037 the cookies will be kept in memory and used properly in subsequent
1038 requests when the same handle is used. Many times this is enough, and
1039 you may not have to save the cookies to disk at all. Note that the file
1040 you specify to CURLOPT_COOKIEFILE doesn't have to exist to enable the
1041 parser, so a common way to just enable the parser and not read any
1042 cookies is to use the name of a file you know doesn't exist.
1043
1044 If you would rather use existing cookies that you've previously
1045 received with your Netscape or Mozilla browsers, you can make libcurl
1046 use that cookie file as input. The CURLOPT_COOKIEFILE is used for that
1047 too, as libcurl will automatically find out what kind of file it is and
1048 act accordingly.
1049
Perhaps the most advanced cookie operation libcurl offers is saving
the entire internal cookie state back into a Netscape/Mozilla formatted
cookie file. We call that the cookie-jar. When you set a file name with
CURLOPT_COOKIEJAR, that file will be created and all received cookies
will be stored in it when curl_easy_cleanup(3) is called. This enables
cookies to get passed on properly between multiple handles without any
information getting lost.
1057
1058
1060 FTP transfers use a second TCP/IP connection for the data transfer.
1061 This is usually a fact you can forget and ignore but at times this fact
1062 will come back to haunt you. libcurl offers several different ways to
1063 customize how the second connection is being made.
1064
1065 libcurl can either connect to the server a second time or tell the
1066 server to connect back to it. The first option is the default and it is
1067 also what works best for all the people behind firewalls, NATs or IP-
1068 masquerading setups. libcurl then tells the server to open up a new
1069 port and wait for a second connection. This is by default attempted
1070 with EPSV first, and if that doesn't work it tries PASV instead. (EPSV
1071 is an extension to the original FTP spec and does not exist nor work on
1072 all FTP servers.)
1073
1074 You can prevent libcurl from first trying the EPSV command by setting
1075 CURLOPT_FTP_USE_EPSV to zero.
1076
In some cases, you will prefer to have the server connect back to you
for the second connection. This might be when the server is behind a
firewall and only allows connections on a single port. libcurl then
informs the remote server which IP address and port number to connect
to. This is done with the CURLOPT_FTPPORT option. If you set it to "-",
libcurl will use your system's "default IP address". If you want to use
a particular IP, you can set the full IP address, a host name to
resolve to an IP address or even a local network interface name that
libcurl will get the IP address from.
1086
1087 When doing the "PORT" approach, libcurl will attempt to use the EPRT
1088 and the LPRT before trying PORT, as they work with more protocols. You
1089 can disable this behavior by setting CURLOPT_FTP_USE_EPRT to zero.
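
For orientation, this is how an IPv4 address and port end up encoded
in a classic PORT command as specified by RFC 959: four address bytes
followed by the high and low byte of the port, all in decimal. The
helper name format_port_command is made up for this sketch, the
trailing CRLF is omitted, and libcurl builds the real command itself
when CURLOPT_FTPPORT is set.

```c
#include <stdio.h>

/* Format "PORT h1,h2,h3,h4,p1,p2" for an IPv4 address and TCP port.
 * Returns the command length, or -1 on a bad port or short buffer.
 * (Illustrative helper, not part of libcurl.) */
int format_port_command(char *buf, size_t buflen,
                        const unsigned char ip[4], unsigned int port)
{
  int n;

  if(port > 65535)
    return -1;
  n = snprintf(buf, buflen, "PORT %u,%u,%u,%u,%u,%u",
               (unsigned int)ip[0], (unsigned int)ip[1],
               (unsigned int)ip[2], (unsigned int)ip[3],
               port >> 8, port & 0xff);
  if(n < 0 || (size_t)n >= buflen)
    return -1;
  return n;
}
```

Because the command can only carry an IPv4 address, the extended
variants are tried first, as described above.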
1090
1091
Some protocols provide "headers", meta-data separated from the normal
data. These headers are by default not included in the normal data
stream, but you can make them appear in the data stream by setting
CURLOPT_HEADER to 1.

What might be even more useful is libcurl's ability to separate the
headers from the data and thus make the callbacks differ. You can for
example set a different pointer to pass to the ordinary write callback
by setting CURLOPT_WRITEHEADER.
1102
1103 Or, you can set an entirely separate function to receive the headers,
1104 by using CURLOPT_HEADERFUNCTION.
1105
1106 The headers are passed to the callback function one by one, and you can
1107 depend on that fact. It makes it easier for you to add custom header
1108 parsers etc.
1109
1110 "Headers" for FTP transfers equal all the FTP server responses. They
1111 aren't actually true headers, but in this case we pretend they are! ;-)
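
A header callback could look something like the sketch below. It only
counts the lines it is given and notes when the empty line that ends
an HTTP header block arrives. The names header_stats and on_header
are invented for this example. The function uses only standard C
types, so it can be tried out without libcurl; the comment at the end
shows how it would be hooked up.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bookkeeping struct for this example. */
struct header_stats {
  size_t lines;   /* header lines received so far */
  int saw_end;    /* set once the empty line ending the headers arrives */
};

/* Has the shape libcurl expects for CURLOPT_HEADERFUNCTION. Each
 * call delivers one header line, which is not NUL-terminated. */
size_t on_header(char *buffer, size_t size, size_t nitems, void *userdata)
{
  struct header_stats *stats = userdata;
  size_t len = size * nitems;

  if(len == 2 && memcmp(buffer, "\r\n", 2) == 0)
    stats->saw_end = 1;
  else
    stats->lines++;
  return len;     /* returning anything else signals an error */
}

/* Hooking it up (needs libcurl, shown for orientation only):
 *   struct header_stats stats = {0, 0};
 *   curl_easy_setopt(easyhandle, CURLOPT_HEADERFUNCTION, on_header);
 *   curl_easy_setopt(easyhandle, CURLOPT_WRITEHEADER, &stats);
 */
```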
1112
1113
1115 [ curl_easy_getinfo ]
1116
1117
The libcurl project takes security seriously. The library is written
with caution and precautions are taken to mitigate many kinds of risks
encountered while operating with potentially malicious servers on the
Internet. It is a powerful library, however, which allows application
writers to make trade-offs between ease of writing and exposure to
potentially risky operations. If used the right way, you can use
libcurl to transfer data pretty safely.
1126
1127 Many applications are used in closed networks where users and servers
1128 can be trusted, but many others are used on arbitrary servers and are
1129 fed input from potentially untrusted users. Following is a discussion
1130 about some risks in the ways in which applications commonly use libcurl
1131 and potential mitigations of those risks. It is by no means comprehen‐
1132 sive, but shows classes of attacks that robust applications should con‐
1133 sider. The Common Weakness Enumeration project at http://cwe.mitre.org/
1134 is a good reference for many of these and similar types of weaknesses
1135 of which application writers should be aware.
1136
1137
1138 Command Lines
If you use a command line tool (such as curl) that uses libcurl,
and you give options to the tool on the command line, those
options can very likely get read by other users of your system
when they use 'ps' or other tools to list currently running
processes.

To avoid this problem, never feed sensitive things to programs
using command line options. Write them to a protected file and
use the -K option to make curl read them from there.
1148
1149
1150 .netrc .netrc is a pretty handy file/feature that allows you to login
1151 quickly and automatically to frequently visited sites. The file
1152 contains passwords in clear text and is a real security risk. In
1153 some cases, your .netrc is also stored in a home directory that
1154 is NFS mounted or used on another network based file system, so
1155 the clear text password will fly through your network every time
1156 anyone reads that file!
1157
1158 To avoid this problem, don't use .netrc files and never store
1159 passwords in plain text anywhere.
1160
1161
1162 Clear Text Passwords
Many of the protocols libcurl supports send name and password
unencrypted as clear text (HTTP Basic authentication, FTP,
TELNET etc). It is very easy for anyone on your network or a
network near yours to just fire up a network analyzer tool and
eavesdrop on your passwords. Don't let the fact that HTTP Basic
uses base64 encoded passwords fool you. They may not look
readable at a first glance, but they are very easily
"deciphered" by anyone within seconds.
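
To show how little protection base64 offers, here is a small decoder
in plain C. It is a minimal sketch written for this text (the name
base64_decode is not a libcurl function) and it does no more
validation than the example needs. The sample string in the usage
note is a made-up credential.

```c
#include <stddef.h>
#include <string.h>

/* Decode standard base64 from in into out. Decoding stops at the
 * first '=' padding character. Returns the decoded length, or -1 on
 * invalid input or if out is too small. */
int base64_decode(const char *in, unsigned char *out, size_t outlen)
{
  static const char alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  unsigned long acc = 0;   /* bit accumulator */
  int bits = 0;            /* number of unread bits in acc */
  size_t n = 0;

  for(; *in && *in != '='; in++) {
    const char *p = strchr(alphabet, *in);
    if(!p)
      return -1;
    acc = (acc << 6) | (unsigned long)(p - alphabet);
    bits += 6;
    if(bits >= 8) {
      bits -= 8;
      if(n >= outlen)
        return -1;
      out[n++] = (unsigned char)((acc >> bits) & 0xff);
    }
  }
  return (int)n;
}
```

Feeding it "dXNlcjpwYXNz", as could be captured from a Basic
Authorization header, yields the nine bytes "user:pass". No key or
secret is involved at any point.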
1171
1172 To avoid this problem, use HTTP authentication methods or other
1173 protocols that don't let snoopers see your password: HTTP with
1174 Digest, NTLM or GSS authentication, HTTPS, FTPS, SCP, SFTP and
1175 FTP-Kerberos are a few examples.
1176
1177
1178 Redirects
The CURLOPT_FOLLOWLOCATION option automatically follows HTTP
redirects sent by a remote server. These redirects can refer to
any kind of URL, not just HTTP. A redirect to a file: URL would
cause libcurl to read (or write) arbitrary files from the local
filesystem. If the application returns the data back to the user
(as would happen in some kinds of CGI scripts), an attacker
could leverage this to read otherwise forbidden data (e.g.
file://localhost/etc/passwd).
1187
1188 If authentication credentials are stored in the ~/.netrc file,
1189 or Kerberos is in use, any other URL type (not just file:) that
1190 requires authentication is also at risk. A redirect such as
1191 ftp://some-internal-server/private-file would then return data
1192 even when the server is password protected.
1193
1194 In the same way, if an unencrypted SSH private key has been con‐
1195 figured for the user running the libcurl application, SCP: or
1196 SFTP: URLs could access password or private-key protected
1197 resources, e.g. sftp://user@some-internal-server/etc/passwd
1198
1199 The CURLOPT_REDIR_PROTOCOLS and CURLOPT_NETRC options can be
1200 used to mitigate against this kind of attack.
1201
A redirect can also specify a location available only on the
machine running libcurl, including servers hidden behind a
firewall from the attacker, e.g. http://127.0.0.1/ or
http://intranet/delete-stuff.cgi?delete=all or
tftp://bootp-server/pc-config-data

Apps can mitigate against this by disabling
CURLOPT_FOLLOWLOCATION and handling redirects themselves,
sanitizing URLs as necessary. Alternatively, an app could leave
CURLOPT_FOLLOWLOCATION enabled but set CURLOPT_REDIR_PROTOCOLS
and install a CURLOPT_OPENSOCKETFUNCTION callback function in
which addresses are sanitized before use.
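
An application that handles redirects on its own needs some check of
the target before following it. The sketch below accepts only http://
and https:// targets. The function names are hypothetical, and a real
check would usually also look at the host part. When libcurl follows
redirects for you, CURLOPT_REDIR_PROTOCOLS plays the corresponding
role, as shown in the closing comment.

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive test that url starts with prefix. */
static int has_prefix_nocase(const char *url, const char *prefix)
{
  for(; *prefix; url++, prefix++) {
    if(tolower((unsigned char)*url) != tolower((unsigned char)*prefix))
      return 0;
  }
  return 1;
}

/* Return 1 only for http:// and https:// redirect targets.
 * (Illustrative helper, not part of libcurl.) */
int redirect_target_allowed(const char *url)
{
  static const char *const allowed[] = { "http://", "https://" };
  size_t i;

  for(i = 0; i < sizeof(allowed) / sizeof(allowed[0]); i++) {
    if(has_prefix_nocase(url, allowed[i]))
      return 1;
  }
  return 0;
}

/* The libcurl-side equivalent (needs libcurl, for orientation only):
 *   curl_easy_setopt(easyhandle, CURLOPT_REDIR_PROTOCOLS,
 *                    CURLPROTO_HTTP | CURLPROTO_HTTPS);
 */
```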
1214
1215
1216 Private Resources
A user who can control the DNS server of a domain being passed
in within a URL can change the address of the host to a local,
private address which the libcurl application will then use.
For example, the innocuous URL http://fuzzybunnies.example.com/
could actually resolve to the IP address of a server behind a
firewall, such as 127.0.0.1 or 10.1.2.3. Apps can mitigate
against this by setting a CURLOPT_OPENSOCKETFUNCTION and
checking the address before a connection.
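
The address check itself might be as simple as the following sketch,
which flags loopback and the RFC 1918 private IPv4 ranges. The
function name ipv4_is_internal is made up for this example, the list
is not exhaustive (link-local and IPv6 addresses are not covered),
and the address is assumed to be in host byte order, so an address
taken from a sockaddr inside an open-socket callback would need
ntohl() first.

```c
#include <stdint.h>

/* addr is an IPv4 address in host byte order, e.g. 0x7f000001 for
 * 127.0.0.1. Returns 1 for loopback and RFC 1918 private ranges.
 * (Illustrative helper, not part of libcurl.) */
int ipv4_is_internal(uint32_t addr)
{
  if((addr & 0xff000000u) == 0x7f000000u)   /* 127.0.0.0/8 */
    return 1;
  if((addr & 0xff000000u) == 0x0a000000u)   /* 10.0.0.0/8 */
    return 1;
  if((addr & 0xfff00000u) == 0xac100000u)   /* 172.16.0.0/12 */
    return 1;
  if((addr & 0xffff0000u) == 0xc0a80000u)   /* 192.168.0.0/16 */
    return 1;
  return 0;
}
```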
1225
1226 All the malicious scenarios regarding redirected URLs apply just
1227 as well to non-redirected URLs, if the user is allowed to spec‐
1228 ify an arbitrary URL that could point to a private resource. For
1229 example, a web app providing a translation service might happily
1230 translate file://localhost/etc/passwd and display the result.
1231 Apps can mitigate against this with the CURLOPT_PROTOCOLS option
1232 as well as by similar mitigation techniques for redirections.
1233
1234 A malicious FTP server could in response to the PASV command
1235 return an IP address and port number for a server local to the
1236 app running libcurl but behind a firewall. Apps can mitigate
1237 against this by using the CURLOPT_FTP_SKIP_PASV_IP option or
1238 CURLOPT_FTPPORT.
1239
1240
1241 Uploads
1242 When uploading, a redirect can cause a local (or remote) file to
1243 be overwritten. Apps must not allow any unsanitized URL to be
1244 passed in for uploads. Also, CURLOPT_FOLLOWLOCATION should not
1245 be used on uploads. Instead, the app should handle redirects
1246 itself, sanitizing each URL first.
1247
1248
1249 Authentication
Use of CURLOPT_UNRESTRICTED_AUTH could cause authentication
information to be sent to an unknown second server. Apps can
mitigate against this by disabling CURLOPT_FOLLOWLOCATION and
handling redirects themselves, sanitizing where necessary.
1254
1255 Use of the CURLAUTH_ANY option to CURLOPT_HTTPAUTH could result
1256 in user name and password being sent in clear text to an HTTP
1257 server. Instead, use CURLAUTH_ANYSAFE which ensures that the
1258 password is encrypted over the network, or else fail the
1259 request.
1260
1261 Use of the CURLUSESSL_TRY option to CURLOPT_USE_SSL could result
1262 in user name and password being sent in clear text to an FTP
1263 server. Instead, use CURLUSESSL_CONTROL to ensure that an
1264 encrypted connection is used or else fail the request.
1265
1266
1267 Cookies
If cookies are enabled and cached, then a user could craft a URL
which performs some malicious action to a site whose
authentication is already stored in a cookie (e.g.
http://mail.example.com/delete-stuff.cgi?delete=all). Apps can
mitigate against this by disabling cookies or clearing them
between requests.
1273
1274
1275 Dangerous URLs
1276 SCP URLs can contain raw commands within the scp: URL, which is
1277 a side effect of how the SCP protocol is designed. e.g.
1278 scp://user:pass@host/a;date >/tmp/test; Apps must not allow
1279 unsanitized SCP: URLs to be passed in for downloads.
1280
1281
1282 Denial of Service
1283 A malicious server could cause libcurl to effectively hang by
1284 sending a trickle of data through, or even no data at all but
1285 just keeping the TCP connection open. This could result in a
1286 denial-of-service attack. The CURLOPT_TIMEOUT and/or CUR‐
1287 LOPT_LOW_SPEED_LIMIT options can be used to mitigate against
1288 this.
1289
1290 A malicious server could cause libcurl to effectively hang by
1291 starting to send data, then severing the connection without
1292 cleanly closing the TCP connection. The app could install a
1293 CURLOPT_SOCKOPTFUNCTION callback function and set the TCP
1294 SO_KEEPALIVE option to mitigate against this. Setting one of
1295 the timeout options would also work against this attack.
1296
1297 A malicious server could cause libcurl to download an infinite
1298 amount of data, potentially causing all of memory or disk to be
1299 filled. Setting the CURLOPT_MAXFILESIZE_LARGE option is not suf‐
1300 ficient to guard against this. Instead, the app should monitor
1301 the amount of data received within the write or progress call‐
1302 back and abort once the limit is reached.
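
A write callback that enforces such a limit could be sketched like
this. The struct download_budget and the function capped_write are
names invented for the example; the callback relies on the rule that
returning a byte count different from the one passed in makes libcurl
abort the transfer.

```c
#include <stddef.h>

/* Hypothetical bookkeeping struct for this example. */
struct download_budget {
  size_t received;   /* bytes accepted so far */
  size_t limit;      /* most bytes the application will accept */
};

/* Has the shape libcurl expects for CURLOPT_WRITEFUNCTION. */
size_t capped_write(char *ptr, size_t size, size_t nmemb, void *userdata)
{
  struct download_budget *budget = userdata;
  size_t len = size * nmemb;

  (void)ptr;   /* a real callback would store or process the data */
  if(len > budget->limit - budget->received)
    return 0;  /* over budget: a short count stops the transfer */
  budget->received += len;
  return len;
}

/* Hooking it up (needs libcurl, shown for orientation only):
 *   struct download_budget budget = {0, 10 * 1024 * 1024};
 *   curl_easy_setopt(easyhandle, CURLOPT_WRITEFUNCTION, capped_write);
 *   curl_easy_setopt(easyhandle, CURLOPT_WRITEDATA, &budget);
 */
```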
1303
1304 A malicious HTTP server could cause an infinite redirection
1305 loop, causing a denial-of-service. This can be mitigated by
1306 using the CURLOPT_MAXREDIRS option.
1307
1308
1309 Arbitrary Headers
1310 User-supplied data must be sanitized when used in options like
1311 CURLOPT_USERAGENT, CURLOPT_HTTPHEADER, CURLOPT_POSTFIELDS and
1312 others that are used to generate structured data. Characters
1313 like embedded carriage returns or ampersands could allow the
1314 user to create additional headers or fields that could cause
1315 malicious transactions.
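
For header values, a minimal form of such sanitizing is to refuse
anything containing a line break or other control character before
the string gets near curl_slist_append(3) or CURLOPT_USERAGENT. The
function name header_value_is_clean is hypothetical, and what counts
as acceptable beyond this is up to the application.

```c
/* Return 1 if value can be placed in a single header line: no
 * carriage return, line feed or other control character (tab is
 * allowed). (Illustrative helper, not part of libcurl.) */
int header_value_is_clean(const char *value)
{
  const unsigned char *p = (const unsigned char *)value;

  for(; *p; p++) {
    if((*p < 0x20 && *p != '\t') || *p == 0x7f)
      return 0;
  }
  return 1;
}
```

Data destined for CURLOPT_POSTFIELDS needs a different treatment,
typically URL-encoding each field value so that an embedded ampersand
cannot start a new field.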
1316
1317
1318 Server Certificates
1319 A secure application should never use the CURLOPT_SSL_VERIFYPEER
1320 option to disable certificate validation. There are numerous
1321 attacks that are enabled by apps that fail to properly validate
1322 server TLS/SSL certificates, thus enabling a malicious server to
1323 spoof a legitimate one. HTTPS without validated certificates is
1324 potentially as insecure as a plain HTTP connection.
1325
1326
1327 Showing What You Do
On a related issue, be aware that even in situations like when
you have problems with libcurl and ask someone for help,
everything you reveal in order to get the best possible help
might also impose certain security related risks. Host names,
user names, paths, operating system specifics, etc (not to
mention passwords of course) may in fact be used by intruders to
gain additional information about a potential target.
1335
1336 To avoid this problem, you must of course use your common sense.
1337 Often, you can just edit out the sensitive data or just
1338 search/replace your true information with faked data.
1339
1340
1342 The easy interface as described in detail in this document is a syn‐
1343 chronous interface that transfers one file at a time and doesn't return
1344 until it is done.
1345
1346 The multi interface, on the other hand, allows your program to transfer
1347 multiple files in both directions at the same time, without forcing you
1348 to use multiple threads. The name might make it seem that the multi
1349 interface is for multi-threaded programs, but the truth is almost the
1350 reverse. The multi interface can allow a single-threaded application
1351 to perform the same kinds of multiple, simultaneous transfers that
1352 multi-threaded programs can perform. It allows many of the benefits of
1353 multi-threaded transfers without the complexity of managing and syn‐
1354 chronizing many threads.
1355
1356 To use this interface, you are better off if you first understand the
1357 basics of how to use the easy interface. The multi interface is simply
1358 a way to make multiple transfers at the same time by adding up multiple
1359 easy handles into a "multi stack".
1360
1361 You create the easy handles you want and you set all the options just
1362 like you have been told above, and then you create a multi handle with
1363 curl_multi_init(3) and add all those easy handles to that multi handle
1364 with curl_multi_add_handle(3).
1365
1366 When you've added the handles you have for the moment (you can still
1367 add new ones at any time), you start the transfers by calling
1368 curl_multi_perform(3).
1369
curl_multi_perform(3) is asynchronous. It will only execute as little
as possible and then return control to your program. It is designed to
never block. If it returns CURLM_CALL_MULTI_PERFORM, you had better
call it again soon, as that is a signal that it still has local data to
send or remote data to receive.
1375
The best usage of this interface is when you do a select() on all
possible file descriptors or sockets to know when to call libcurl
again. This also makes it easy for you to wait and respond to actions
on your own application's sockets/handles. You figure out what to
select() for by using curl_multi_fdset(3), which fills in a set of
fd_set variables for you with the particular file descriptors libcurl
uses for the moment.
1383
When you then call select(), it'll return when one of the file handles
signals action and you then call curl_multi_perform(3) to allow libcurl
to do what it wants to do. Take note that libcurl does also feature
some time-out code, so we advise you to never use very long timeouts on
select() before you call curl_multi_perform(3), which thus should be
called unconditionally every now and then even if none of its file
descriptors have signaled ready. Another precaution you should use:
always call curl_multi_fdset(3) immediately before the select() call
since the current set of file descriptors may change when calling a
curl function.
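
The advice about short select() timeouts can be wrapped in a small
helper, sketched here for a UNIX-like system. The name wait_capped
and the 100 millisecond pause used when there is nothing to wait on
are choices made for this example only. The closing comment outlines
where the helper would sit in a transfer loop.

```c
#include <sys/select.h>

/* Wait for activity on the given sets, but never longer than cap_ms
 * milliseconds. max_fd is the value reported by curl_multi_fdset();
 * if it is negative there is nothing to wait on yet, so this only
 * pauses briefly. Returns whatever select() returns.
 * (Illustrative helper, not part of libcurl.) */
int wait_capped(fd_set *rd, fd_set *wr, fd_set *ex, int max_fd,
                long want_ms, long cap_ms)
{
  struct timeval tv;
  long ms = want_ms;

  if(ms < 0 || ms > cap_ms)
    ms = cap_ms;
  if(max_fd < 0 && ms > 100)
    ms = 100;                /* assumption: short pause, then retry */
  tv.tv_sec = ms / 1000;
  tv.tv_usec = (ms % 1000) * 1000;
  return select(max_fd + 1, rd, wr, ex, &tv);
}

/* Loop outline (needs libcurl, shown for orientation only):
 *   do {
 *     FD_ZERO(&rd); FD_ZERO(&wr); FD_ZERO(&ex);
 *     curl_multi_fdset(multi, &rd, &wr, &ex, &max_fd);
 *     wait_capped(&rd, &wr, &ex, max_fd, 1000, 1000);
 *     curl_multi_perform(multi, &still_running);
 *   } while(still_running);
 */
```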
1394
1395 If you want to stop the transfer of one of the easy handles in the
1396 stack, you can use curl_multi_remove_handle(3) to remove individual
1397 easy handles. Remember that easy handles should be
1398 curl_easy_cleanup(3)ed.
1399
1400 When a transfer within the multi stack has finished, the counter of
1401 running transfers (as filled in by curl_multi_perform(3)) will
1402 decrease. When the number reaches zero, all transfers are done.
1403
1404 curl_multi_info_read(3) can be used to get information about completed
1405 transfers. It then returns the CURLcode for each easy transfer, to
1406 allow you to figure out success on each individual transfer.
1407
1408
1410 [ seeding, passwords, keys, certificates, ENGINE, ca certs ]
1411
1412
1414 [ fill in ]
1415
1416
1418 [1] libcurl 7.10.3 and later have the ability to switch over to
1419 chunked Transfer-Encoding in cases where HTTP uploads are done
1420 with data of an unknown size.
1421
1422 [2] This happens on Windows machines when libcurl is built and used
1423 as a DLL. However, you can still do this on Windows if you link
1424 with a static library.
1425
1426 [3] The curl-config tool is generated at build-time (on UNIX-like
1427 systems) and should be installed with the 'make install' or sim‐
1428 ilar instruction that installs the library, header files, man
1429 pages etc.
1430
1431 [4] This behavior was different in versions before 7.17.0, where
1432 strings had to remain valid past the end of the
1433 curl_easy_setopt(3) call.
1434
1435
1436
1437libcurl 4 Mar 2009 libcurl-tutorial(3)