libcurl-tutorial(3)          libcurl programming          libcurl-tutorial(3)

libcurl-tutorial - libcurl programming tutorial

This document attempts to describe the general principles and some
basic approaches to consider when programming with libcurl. The text
focuses mainly on the C interface, but it should apply fairly well to
other interfaces too, as they usually follow the C one pretty closely.
14
15 This document will refer to 'the user' as the person writing the source
16 code that uses libcurl. That would probably be you or someone in your
17 position. What will be generally referred to as 'the program' will be
18 the collected source code that you write that is using libcurl for
19 transfers. The program is outside libcurl and libcurl is outside of the
20 program.
21
22 To get more details on all options and functions described herein,
23 please refer to their respective man pages.
24
25
27 There are many different ways to build C programs. This chapter will
28 assume a UNIX-style build process. If you use a different build system,
29 you can still read this to get general information that may apply to
30 your environment as well.
31
Compiling the Program
Your compiler needs to know where the libcurl headers are located.
Therefore you must set your compiler's include path to point to the
directory where you installed them. The 'curl-config'[3] tool can be
used to get this information:

    $ curl-config --cflags


Linking the Program with libcurl
Once the program is compiled, you need to link your object files to
create a single executable. For that to succeed, you need to link
with libcurl and possibly also with other libraries that libcurl
itself depends on, such as the OpenSSL libraries; even some standard
OS libraries may be needed on the command line. To figure out which
flags to use, once again the 'curl-config' tool comes to the rescue:

    $ curl-config --libs
51
52
SSL or Not
libcurl can be built and customized in many ways. One of the things
that varies between different libraries and builds is the support for
SSL-based transfers, like HTTPS and FTPS. If a supported SSL library
was detected properly at build-time, libcurl will be built with SSL
support. To figure out if an installed libcurl has been built with
SSL support enabled, use 'curl-config' like this:

    $ curl-config --feature

If SSL is supported, the keyword 'SSL' will be written to stdout,
possibly together with a few other features that could be either on
or off for different libcurls.

See also the "Features libcurl Provides" further down.

autoconf macro
When you write your configure script to detect libcurl and set up
variables accordingly, we offer a prewritten macro that probably does
everything you need in this area. See the docs/libcurl/libcurl.m4
file - it includes docs on how to use it.
76
77
The people behind libcurl have put considerable effort into making
libcurl work on a large number of different operating systems and
environments.

You program libcurl the same way on all platforms that libcurl runs on.
There are only a few minor considerations that differ. If you just
make sure to write your code portably enough, you may very well create
yourself a very portable program. libcurl shouldn't stop you from that.
87
88
The program must initialize some of the libcurl functionality globally.
That means it should be done exactly once, no matter how many times you
intend to use the library: once for your program's entire lifetime.
This is done using

    curl_global_init()

and it takes one parameter which is a bit pattern that tells libcurl
what to initialize. Using CURL_GLOBAL_ALL will make it initialize all
known internal sub modules, and might be a good default option. The
two bits currently specified are:

CURL_GLOBAL_WIN32
    which only does anything on Windows machines. When used on a
    Windows machine, it'll make libcurl initialize the win32 socket
    layer. Without having that initialized properly, your program
    cannot use sockets properly. You should only do this once for
    each application, so if your program already does this or if
    another library in use does it, you should not tell libcurl to
    do this as well.

CURL_GLOBAL_SSL
    which only does anything on libcurls compiled and built
    SSL-enabled. On these systems, this will make libcurl initialize
    the SSL library properly for this application. This only needs
    to be done once for each application so if your program or
    another library already does this, this bit should not be
    needed.
118
libcurl has a default protection mechanism that detects if
curl_global_init(3) hasn't been called by the time curl_easy_perform(3)
is called and if that is the case, libcurl runs the function itself
with a guessed bit pattern. Please note that depending solely on this
is not considered good practice.

When the program no longer uses libcurl, it should call
curl_global_cleanup(3), which is the opposite of the init call. It will
then do the reverse operations to clean up the resources the
curl_global_init(3) call initialized.

Repeated calls to curl_global_init(3) and curl_global_cleanup(3) should
be avoided. They should only be called once each.
132
133
It is considered best practice to determine libcurl features at
run-time rather than at build-time (if possible of course). By calling
curl_version_info(3) and checking out the details of the returned
struct, your program can figure out exactly what the currently running
libcurl supports.
140
141
143 libcurl first introduced the so called easy interface. All operations
144 in the easy interface are prefixed with 'curl_easy'.
145
146 Recent libcurl versions also offer the multi interface. More about that
147 interface, what it is targeted for and how to use it is detailed in a
148 separate chapter further down. You still need to understand the easy
149 interface first, so please continue reading for better understanding.
150
To use the easy interface, you must first create yourself an easy
handle. You need one handle for each easy session you want to perform.
Basically, you should use one handle for every thread you plan to use
for transferring. You must never share the same handle in multiple
threads.

Get an easy handle with

    easyhandle = curl_easy_init();

It returns an easy handle. Using that you proceed to the next step:
setting up your preferred actions. A handle is just a logical entity
for the upcoming transfer or series of transfers.

You set properties and options for this handle using
curl_easy_setopt(3). They control how the subsequent transfer or
transfers will be made. Options remain set in the handle until set
again to something different. Thus, multiple requests using the same
handle will use the same options.
170
171 Many of the options you set in libcurl are "strings", pointers to data
172 terminated with a zero byte. When you set strings with
173 curl_easy_setopt(3), libcurl makes its own copy so that they don't need
174 to be kept around in your application after being set[4].
175
One of the most basic properties to set in the handle is the URL. You
set your preferred URL to transfer with CURLOPT_URL in a manner similar
to:

    curl_easy_setopt(easyhandle, CURLOPT_URL, "http://domain.com/");

Let's assume for a while that you want to receive data, as the URL
identifies a remote resource you want to get here. Since you write a
sort of application that needs this transfer, I assume that you would
like to get the data passed to you directly instead of simply getting
it passed to stdout. So, you write your own function that matches this
prototype:

    size_t write_data(void *buffer, size_t size, size_t nmemb,
                      void *userp);

You tell libcurl to pass all data to this function by issuing a call
similar to this:

    curl_easy_setopt(easyhandle, CURLOPT_WRITEFUNCTION, write_data);

You can control what data your callback function gets in the fourth
argument by setting another property:

    curl_easy_setopt(easyhandle, CURLOPT_WRITEDATA, &internal_struct);

Using that property, you can easily pass local data between your
application and the function that gets invoked by libcurl. libcurl
itself won't touch the data you pass with CURLOPT_WRITEDATA.
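
To make the callback mechanism concrete, here is one possible
write_data() that appends everything it receives to a growing memory
buffer. The struct membuf type is an assumption made for this sketch
and plays the role of the internal_struct mentioned above; it is not
something libcurl defines.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct owned by the program; a pointer to it is what
   would be passed with CURLOPT_WRITEDATA. */
struct membuf {
    char *data;    /* heap buffer, zero terminated once non-NULL */
    size_t len;    /* number of payload bytes stored */
};

/* Matches the write callback prototype shown above. */
size_t write_data(void *buffer, size_t size, size_t nmemb, void *userp)
{
    struct membuf *mem = userp;
    size_t chunk = size * nmemb;
    char *grown = realloc(mem->data, mem->len + chunk + 1);

    if (grown == NULL)
        return 0;   /* out of memory: report that nothing was handled */

    memcpy(grown + mem->len, buffer, chunk);
    mem->data = grown;
    mem->len += chunk;
    mem->data[mem->len] = '\0';
    return chunk;   /* number of bytes taken care of */
}
```

A program would start with a zeroed struct membuf, register the
function and the struct pointer with the two options shown above, and
free() the data member once it is done with the result.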
205
206 libcurl offers its own default internal callback that will take care of
207 the data if you don't set the callback with CURLOPT_WRITEFUNCTION. It
208 will then simply output the received data to stdout. You can have the
209 default callback write the data to a different file handle by passing a
210 'FILE *' to a file opened for writing with the CURLOPT_WRITEDATA
211 option.
212
Now, we need to take a step back and take a deep breath. Here's one of
those rare platform-dependent nitpicks. Did you spot it? On some
platforms[2], libcurl won't be able to operate on files opened by the
program. Thus, if you use the default callback and pass in an open file
with CURLOPT_WRITEDATA, it will crash. You should therefore avoid this
to make your program run fine virtually everywhere.

(CURLOPT_WRITEDATA was formerly known as CURLOPT_FILE. Both names still
work and do the same thing).

If you're using libcurl as a win32 DLL, you MUST use the
CURLOPT_WRITEFUNCTION if you set CURLOPT_WRITEDATA - or you will
experience crashes.
225
226 There are of course many more options you can set, and we'll get back
227 to a few of them later. Let's instead continue to the actual transfer:
228
229 success = curl_easy_perform(easyhandle);
230
curl_easy_perform(3) will connect to the remote site, do the necessary
commands and receive the transfer. Whenever it receives data, it calls
the callback function we previously set. The function may get one byte
at a time, or it may get many kilobytes at once. libcurl delivers as
much as possible as often as possible. Your callback function should
return the number of bytes it "took care of". If that is not exactly
the number of bytes that was passed to it, libcurl will abort the
operation and return with an error code.

When the transfer is complete, the function returns a return code that
informs you if it succeeded in its mission or not. If a return code
isn't enough for you, you can use the CURLOPT_ERRORBUFFER to point
libcurl to a buffer of yours where it'll store a human readable error
message as well.
245
246 If you then want to transfer another file, the handle is ready to be
247 used again. Mind you, it is even preferred that you re-use an existing
248 handle if you intend to make another transfer. libcurl will then
249 attempt to re-use the previous connection.
250
251 For some protocols, downloading a file can involve a complicated
252 process of logging in, setting the transfer mode, changing the current
253 directory and finally transferring the file data. libcurl takes care of
254 all that complication for you. Given simply the URL to a file, libcurl
255 will take care of all the details needed to get the file moved from one
256 machine to another.
257
258
260 The first basic rule is that you must never simultaneously share a
261 libcurl handle (be it easy or multi or whatever) between multiple
262 threads. Only use one handle in one thread at any time. You can pass
263 the handles around among threads, but you must never use a single han‐
264 dle from more than one thread at any given time.
265
266 libcurl is completely thread safe, except for two issues: signals and
267 SSL/TLS handlers. Signals are used for timing out name resolves (during
268 DNS lookup) - when built without c-ares support and not on Windows.
269
270 If you are accessing HTTPS or FTPS URLs in a multi-threaded manner, you
271 are then of course using the underlying SSL library multi-threaded and
272 those libs might have their own requirements on this issue. Basically,
273 you need to provide one or two functions to allow it to function prop‐
274 erly. For all details, see this:
275
276 OpenSSL
277
278 http://www.openssl.org/docs/crypto/threads.html#DESCRIPTION
279
280 GnuTLS
281
http://www.gnu.org/software/gnutls/manual/html_node/Multi_002dthreaded-applications.html
284
285 NSS
286
287 is claimed to be thread-safe already without anything required.
288
289 PolarSSL
290
291 Required actions unknown.
292
293 yassl
294
295 Required actions unknown.
296
297 axTLS
298
299 Required actions unknown.
300
301 When using multiple threads you should set the CURLOPT_NOSIGNAL option
302 to 1 for all handles. Everything will or might work fine except that
303 timeouts are not honored during the DNS lookup - which you can work
304 around by building libcurl with c-ares support. c-ares is a library
305 that provides asynchronous name resolves. On some platforms, libcurl
306 simply will not function properly multi-threaded unless this option is
307 set.
308
309 Also, note that CURLOPT_DNS_USE_GLOBAL_CACHE is not thread-safe.
310
311
313 There will always be times when the transfer fails for some reason. You
314 might have set the wrong libcurl option or misunderstood what the
315 libcurl option actually does, or the remote server might return non-
316 standard replies that confuse the library which then confuses your pro‐
317 gram.
318
There's one golden rule when these things occur: set the
CURLOPT_VERBOSE option to 1. It'll cause the library to spew out the
entire protocol details it sends, some internal info and some received
protocol data as well (especially when using FTP). If you're using
HTTP, adding the headers in the received output to study is also a
clever way to get a better understanding of why the server behaves the
way it does. Include headers in the normal body output with
CURLOPT_HEADER set to 1.

Of course, there are bugs left. We need to know about them to be able
to fix them, so we're quite dependent on your bug reports! When you do
report suspected bugs in libcurl, please include as many details as you
possibly can: a protocol dump that CURLOPT_VERBOSE produces, library
version, as much as possible of your code that uses libcurl, operating
system name and version, compiler name and version etc.

If CURLOPT_VERBOSE is not enough, you can increase the level of debug
data your application receives by using the CURLOPT_DEBUGFUNCTION.
336
337 Getting some in-depth knowledge about the protocols involved is never
338 wrong, and if you're trying to do funny things, you might very well
339 understand libcurl and how to use it better if you study the appropri‐
340 ate RFC documents at least briefly.
341
342
libcurl tries to keep a protocol independent approach to most
transfers, thus uploading to a remote FTP site is very similar to
uploading data to an HTTP server with a PUT request.

Of course, first you either create an easy handle or you re-use an
existing one. Then you set the URL to operate on just like before. This
is the remote URL that we will now upload to.

Since we write an application, we most likely want libcurl to get the
upload data by asking us for it. To make it do that, we set the read
callback and the custom pointer libcurl will pass to our read callback.
The read callback should have a prototype similar to:

    size_t function(char *bufptr, size_t size, size_t nitems,
                    void *userp);
359
360 Where bufptr is the pointer to a buffer we fill in with data to upload
361 and size*nitems is the size of the buffer and therefore also the maxi‐
362 mum amount of data we can return to libcurl in this call. The 'userp'
363 pointer is the custom pointer we set to point to a struct of ours to
364 pass private data between the application and the callback.
365
366 curl_easy_setopt(easyhandle, CURLOPT_READFUNCTION, read_function);
367
368 curl_easy_setopt(easyhandle, CURLOPT_READDATA, &filedata);
369
370 Tell libcurl that we want to upload:
371
372 curl_easy_setopt(easyhandle, CURLOPT_UPLOAD, 1L);
373
A few protocols won't behave properly when uploads are done without any
prior knowledge of the expected file size. So, set the upload file size
using the CURLOPT_INFILESIZE_LARGE for all known file sizes like
this[1]:

    /* in this example, file_size must be a curl_off_t variable */
    curl_easy_setopt(easyhandle, CURLOPT_INFILESIZE_LARGE, file_size);

When you call curl_easy_perform(3) this time, it'll perform all the
necessary operations and when it has started the upload it'll call your
supplied callback to get the data to upload. The program should return
as much data as possible in every invocation, as that is likely to make
the upload perform as fast as possible. The callback should return the
number of bytes it wrote in the buffer. Returning 0 will signal the end
of the upload.
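
As an illustration of the read callback contract, the sketch below
uploads from a block of memory. struct upload_src is a hypothetical
type standing in for the filedata variable used above; a real program
might read from a file instead.

```c
#include <string.h>

/* Hypothetical upload source: a memory block and how much is left. */
struct upload_src {
    const char *data;   /* next byte to hand to libcurl */
    size_t left;        /* bytes not yet handed over */
};

/* Matches the read callback prototype shown above. */
size_t read_function(char *bufptr, size_t size, size_t nitems, void *userp)
{
    struct upload_src *src = userp;
    size_t room = size * nitems;                    /* buffer capacity */
    size_t n = src->left < room ? src->left : room;

    memcpy(bufptr, src->data, n);   /* fill in as much as fits */
    src->data += n;
    src->left -= n;
    return n;                       /* 0 signals the end of the upload */
}
```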
389
390
Many protocols use or even require a user name and password to be
provided before you can download or upload the data of your choice.
libcurl offers several ways to specify them.

Most protocols let you specify the name and password in the URL
itself. libcurl will detect this and use them accordingly. This is
written like this:

    protocol://user:password@example.com/path/

If you need any odd letters in your user name or password, you should
enter them URL encoded, as %XX where XX is a two-digit hexadecimal
number.
405
406 libcurl also provides options to set various passwords. The user name
407 and password as shown embedded in the URL can instead get set with the
408 CURLOPT_USERPWD option. The argument passed to libcurl should be a char
409 * to a string in the format "user:password". In a manner like this:
410
411 curl_easy_setopt(easyhandle, CURLOPT_USERPWD, "myname:thesecret");
412
Another case where name and password might be needed at times is for
those users who need to authenticate themselves to a proxy they use.
libcurl offers another option for this, the CURLOPT_PROXYUSERPWD. It is
used quite similarly to the CURLOPT_USERPWD option, like this:

    curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD, "myname:thesecret");

There's a long-standing UNIX "standard" way of storing FTP user names
and passwords, namely in the $HOME/.netrc file. The file should be made
private so that only the user may read it (see also the "Security
Considerations" chapter), as it might contain the password in plain
text. libcurl has the ability to use this file to figure out what set
of user name and password to use for a particular host. As an extension
to the normal functionality, libcurl also supports this file for
non-FTP protocols such as HTTP. To make curl use this file, use the
CURLOPT_NETRC option:

    curl_easy_setopt(easyhandle, CURLOPT_NETRC, 1L);

And a very basic example of what such a .netrc file may look like:

    machine myhost.mydomain.com
    login userlogin
    password secretword
438
439 All these examples have been cases where the password has been
440 optional, or at least you could leave it out and have libcurl attempt
441 to do its job without it. There are times when the password isn't
442 optional, like when you're using an SSL private key for secure trans‐
443 fers.
444
445 To pass the known private key password to libcurl:
446
447 curl_easy_setopt(easyhandle, CURLOPT_KEYPASSWD, "keypassword");
448
449
The previous chapter showed how to set user name and password for
getting URLs that require authentication. When using the HTTP protocol,
there are many different ways a client can provide those credentials to
the server and you can control which way libcurl will (attempt to) use
them. The default HTTP authentication method is called 'Basic', which
sends the name and password in the HTTP request merely base64-encoded,
which is effectively clear text. This is insecure.
458
459 At the time of this writing, libcurl can be built to use: Basic,
460 Digest, NTLM, Negotiate, GSS-Negotiate and SPNEGO. You can tell libcurl
461 which one to use with CURLOPT_HTTPAUTH as in:
462
463 curl_easy_setopt(easyhandle, CURLOPT_HTTPAUTH, CURLAUTH_DIGEST);
464
465 And when you send authentication to a proxy, you can also set authenti‐
466 cation type the same way but instead with CURLOPT_PROXYAUTH:
467
468 curl_easy_setopt(easyhandle, CURLOPT_PROXYAUTH, CURLAUTH_NTLM);
469
470 Both these options allow you to set multiple types (by ORing them
471 together), to make libcurl pick the most secure one out of the types
472 the server/proxy claims to support. This method does however add a
473 round-trip since libcurl must first ask the server what it supports:
474
    curl_easy_setopt(easyhandle, CURLOPT_HTTPAUTH,
                     CURLAUTH_DIGEST|CURLAUTH_BASIC);
477
478 For convenience, you can use the 'CURLAUTH_ANY' define (instead of a
479 list with specific types) which allows libcurl to use whatever method
480 it wants.
481
482 When asking for multiple types, libcurl will pick the available one it
483 considers "best" in its own internal order of preference.
484
485
We get many questions regarding how to issue HTTP POSTs with libcurl
the proper way. This chapter will thus include examples using both of
the different versions of HTTP POST that libcurl supports.

The first version is the simple POST, the most common version, that
most HTML pages using the <form> tag use. We provide a pointer to the
data and tell libcurl to post it all to the remote site:

    char *data="name=daniel&project=curl";
    curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDS, data);
    curl_easy_setopt(easyhandle, CURLOPT_URL, "http://posthere.com/");

    curl_easy_perform(easyhandle); /* post away! */

Simple enough, huh? Since you set the POST options with the
CURLOPT_POSTFIELDS, this automatically switches the handle to use POST
in the upcoming request.

Ok, so what if you want to post binary data that also requires you to
set the Content-Type: header of the post? Well, binary posts prevent
libcurl from being able to do strlen() on the data to figure out the
size, so therefore we must tell libcurl the size of the post data.
Setting headers in libcurl requests is done in a generic way, by
building a list of our own headers and then passing that list to
libcurl.
511
    struct curl_slist *headers=NULL;
    headers = curl_slist_append(headers, "Content-Type: text/xml");

    /* post binary data */
    curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDS, binaryptr);

    /* set the size of the postfields data */
    curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDSIZE, 23L);

    /* pass our list of custom made headers */
    curl_easy_setopt(easyhandle, CURLOPT_HTTPHEADER, headers);

    curl_easy_perform(easyhandle); /* post away! */

    curl_slist_free_all(headers); /* free the header list */
527
While the simple examples above cover the majority of all cases where
HTTP POST operations are required, they don't do multi-part formposts.
Multi-part formposts were introduced as a better way to post (possibly
large) binary data and were first documented in RFC1867 (updated in
RFC2388). They're called multi-part because they're built by a chain of
parts, each part being a single unit of data. Each part has its own
name and contents. You can in fact create and post a multi-part
formpost with the regular libcurl POST support described above, but
that would require that you build the formpost yourself and provide it
to libcurl. To make that easier, libcurl provides curl_formadd(3).
Using this function, you add parts to the form. When you're done adding
parts, you post the whole form.
540
541 The following example sets two simple text parts with plain textual
542 contents, and then a file with binary contents and uploads the whole
543 thing.
544
    struct curl_httppost *post=NULL;
    struct curl_httppost *last=NULL;
    curl_formadd(&post, &last,
                 CURLFORM_COPYNAME, "name",
                 CURLFORM_COPYCONTENTS, "daniel", CURLFORM_END);
    curl_formadd(&post, &last,
                 CURLFORM_COPYNAME, "project",
                 CURLFORM_COPYCONTENTS, "curl", CURLFORM_END);
    curl_formadd(&post, &last,
                 CURLFORM_COPYNAME, "logotype-image",
                 CURLFORM_FILECONTENT, "curl.png", CURLFORM_END);

    /* Set the form info */
    curl_easy_setopt(easyhandle, CURLOPT_HTTPPOST, post);

    curl_easy_perform(easyhandle); /* post away! */

    /* free the post data again */
    curl_formfree(post);
564
Multipart formposts are chains of parts using MIME-style separators and
headers. It means that each one of these separate parts gets a few
headers set that describe the individual content-type, size etc. To
enable your application to handcraft this formpost even more, libcurl
allows you to supply your own set of custom headers to such an
individual form part. You can of course supply headers to as many parts
as you like, but this little example will show how you set headers to
one specific part when you add that to the post handle:
573
    struct curl_slist *headers=NULL;
    headers = curl_slist_append(headers, "Content-Type: text/xml");

    curl_formadd(&post, &last,
                 CURLFORM_COPYNAME, "logotype-image",
                 CURLFORM_FILECONTENT, "curl.xml",
                 CURLFORM_CONTENTHEADER, headers,
                 CURLFORM_END);

    curl_easy_perform(easyhandle); /* post away! */

    curl_formfree(post); /* free post */
    curl_slist_free_all(headers); /* free custom header list */
587
All options on an easyhandle are "sticky": they remain the same until
changed, even if you do call curl_easy_perform(3). You may therefore
need to tell curl to go back to a plain GET request if you intend to do
one as your next request. You force an easyhandle to go back to GET by
using the CURLOPT_HTTPGET option:

    curl_easy_setopt(easyhandle, CURLOPT_HTTPGET, 1L);

Just setting CURLOPT_POSTFIELDS to "" or NULL will *not* stop libcurl
from doing a POST. It will just make it POST without any data to send!
598
599
601 For historical and traditional reasons, libcurl has a built-in progress
602 meter that can be switched on and then makes it present a progress
603 meter in your terminal.
604
Switch on the progress meter by, oddly enough, setting
CURLOPT_NOPROGRESS to zero. This option is set to 1 by default.

For most applications however, the built-in progress meter is useless
and what instead is interesting is the ability to specify a progress
callback. The function pointer you pass to libcurl will then be called
at irregular intervals with information about the current transfer.

Set the progress callback by using CURLOPT_PROGRESSFUNCTION, passing a
pointer to a function that matches this prototype:

    int progress_callback(void *clientp,
                          double dltotal,
                          double dlnow,
                          double ultotal,
                          double ulnow);

If any of the input arguments is unknown, a 0 will be passed. The first
argument, the 'clientp', is the pointer you pass to libcurl with
CURLOPT_PROGRESSDATA. libcurl won't touch it.
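
A progress callback matching that prototype might look like the sketch
below. struct progress_state is a hypothetical type for the data passed
with CURLOPT_PROGRESSDATA; the cancel flag relies on the documented
behavior that a non-zero return value makes libcurl abort the transfer.

```c
#include <stdio.h>

/* Hypothetical per-transfer state passed with CURLOPT_PROGRESSDATA. */
struct progress_state {
    double last_percent;   /* last download percentage seen, or -1 */
    int cancel;            /* set non-zero by the program to abort */
};

/* Matches the progress callback prototype shown above. */
int progress_callback(void *clientp, double dltotal, double dlnow,
                      double ultotal, double ulnow)
{
    struct progress_state *st = clientp;
    (void)ultotal;   /* uploads are ignored in this sketch */
    (void)ulnow;

    if (dltotal > 0.0) {   /* 0 means the total is not known */
        st->last_percent = 100.0 * dlnow / dltotal;
        fprintf(stderr, "\rdownloaded %.0f%%", st->last_percent);
    }
    return st->cancel;     /* non-zero makes libcurl abort the transfer */
}
```

Note that the callback is only invoked when CURLOPT_NOPROGRESS is set
to zero.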
625
626
There's basically only one thing to keep in mind when using C++ instead
of C when interfacing libcurl:

The callbacks CANNOT be non-static class member functions

Example C++ code:

    class AClass {
    public:
        static size_t write_data(void *ptr, size_t size, size_t nmemb,
                                 void *ourpointer)
        {
            /* do what you want with the data */
            return size * nmemb; /* report all bytes as handled */
        }
    };
642
643
645 What "proxy" means according to Merriam-Webster: "a person authorized
646 to act for another" but also "the agency, function, or office of a
647 deputy who acts as a substitute for another".
648
649 Proxies are exceedingly common these days. Companies often only offer
650 Internet access to employees through their proxies. Network clients or
651 user-agents ask the proxy for documents, the proxy does the actual
652 request and then it returns them.
653
654 libcurl supports SOCKS and HTTP proxies. When a given URL is wanted,
655 libcurl will ask the proxy for it instead of trying to connect to the
656 actual host identified in the URL.
657
658 If you're using a SOCKS proxy, you may find that libcurl doesn't quite
659 support all operations through it.
660
For HTTP proxies: the fact that the proxy is an HTTP proxy puts certain
restrictions on what can actually happen. A requested URL that might
not be an HTTP URL will still be passed to the HTTP proxy to deliver
back to libcurl. This happens transparently, and an application may not
need to know. I say "may", because at times it is very important to
understand that all operations over an HTTP proxy use the HTTP
protocol. For example, you can't invoke your own custom FTP commands or
even proper FTP directory listings.
669
670
671 Proxy Options
672
To tell libcurl to use a proxy at a given port number:

    curl_easy_setopt(easyhandle, CURLOPT_PROXY, "proxy-host.com:8080");

Some proxies require user authentication before allowing a request,
and you pass that information similar to this:

    curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD, "user:password");

If you want to, you can specify the host name only in the
CURLOPT_PROXY option, and set the port number separately with
CURLOPT_PROXYPORT.

Tell libcurl what kind of proxy it is with CURLOPT_PROXYTYPE (if not,
it will default to assume an HTTP proxy):

    curl_easy_setopt(easyhandle, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS4);
693
694
695 Environment Variables
696
libcurl automatically checks and uses a set of environment variables
to know what proxies to use for certain protocols. The names of the
variables follow an ancient de facto standard and are built up as
"[protocol]_proxy" (note the lower casing). This means the variable
'http_proxy' is checked for the name of a proxy to use when the input
URL is HTTP. Following the same rule, the variable named 'ftp_proxy'
is checked for FTP URLs. Again, the proxies are always HTTP proxies;
the different names of the variables simply allow different HTTP
proxies to be used.

The proxy environment variable contents should be in the format
"[protocol://][user:password@]machine[:port]". The protocol:// part is
simply ignored if present (so http://proxy and bluerk://proxy will do
the same) and the optional port number specifies on which port the
proxy operates on the host. If not specified, the internal default
port number will be used and that is most likely *not* the one you
would like it to be.
715
716 There are two special environment variables. 'all_proxy' is what
717 sets proxy for any URL in case the protocol specific variable
718 wasn't set, and 'no_proxy' defines a list of hosts that should
719 not use a proxy even though a variable may say so. If 'no_proxy'
720 is a plain asterisk ("*") it matches all hosts.
721
722 To explicitly disable libcurl's checking for and using the proxy
723 environment variables, set the proxy name to "" - an empty
724 string - with CURLOPT_PROXY.
725
726 SSL and Proxies
727
SSL is for secure point-to-point connections. This involves strong
encryption and similar things, which effectively makes it impossible
for a proxy to operate as the "man in between" that, as previously
discussed, is the proxy's task. Instead, the only way to have SSL work
over an HTTP proxy is to ask the proxy to tunnel everything through
without being able to check or fiddle with the traffic.
735
Opening an SSL connection over an HTTP proxy is therefore a matter of
asking the proxy for a straight connection to the target host on a
specified port. This is done with the HTTP request CONNECT ("please mr
proxy, connect me to that remote host").
740
Because of the nature of this operation, where the proxy has no idea
what kind of data is passed in and out through this tunnel, this breaks
some of the very few advantages that come from using a proxy, such as
caching. Many organizations prevent this kind of tunneling to
destination port numbers other than 443 (which is the default HTTPS
port number).
747
748
749 Tunneling Through Proxy
As explained above, tunneling is required for SSL to work and is often
even restricted to the operation intended for SSL: HTTPS.
752
753 This is however not the only time proxy-tunneling might offer
754 benefits to you or your application.
755
756 As tunneling opens a direct connection from your application to
757 the remote machine, it suddenly also re-introduces the ability
758 to do non-HTTP operations over a HTTP proxy. You can in fact use
759 things such as FTP upload or FTP custom commands this way.
760
761 Again, this is often prevented by the administrators of proxies
762 and is rarely allowed.
763
764 Tell libcurl to use proxy tunneling like this:
765
766 curl_easy_setopt(easyhandle, CURLOPT_HTTPPROXYTUNNEL, 1L);
767
In fact, there might even be times when you want to do plain HTTP
operations using a tunnel like this, as it then enables you to operate
on the remote server instead of asking the proxy to do so. libcurl will
not stand in the way of such innovative actions either!
773
774
775 Proxy Auto-Config
776
Netscape first came up with this. It is basically a web page (usually
using a .pac extension) containing Javascript that, when executed by the
browser with the requested URL as input, returns information to the
browser on how to connect to the URL. The returned information might be
"DIRECT" (which means no proxy should be used), "PROXY host:port" (to
tell the browser where the proxy for this particular URL is) or "SOCKS
host:port" (to direct the browser to a SOCKS proxy).
785
libcurl has no means to interpret or evaluate Javascript and thus it
doesn't support this. If you find yourself facing this nasty invention,
the following advice has been offered and used in the past:
790
791 - Depending on the Javascript complexity, write up a script that
792 translates it to another language and execute that.
793
794 - Read the Javascript code and rewrite the same logic in another
795 language.
796
797 - Implement a Javascript interpreter; people have successfully
798 used the Mozilla Javascript engine in the past.
799
- Ask your admins to stop this and switch to a static proxy setup or
similar.
802
803
805 Re-cycling the same easy handle several times when doing multiple
806 requests is the way to go.
807
808 After each single curl_easy_perform(3) operation, libcurl will keep the
809 connection alive and open. A subsequent request using the same easy
810 handle to the same host might just be able to use the already open con‐
811 nection! This reduces network impact a lot.
812
Even if the connection is dropped, every new SSL connection to the same
host will benefit from libcurl's session ID cache, which drastically
reduces re-connection time.
816
FTP connections that are kept alive save a lot of time, as the
command-response round-trips are skipped. You also don't risk being
refused when you try to log in again, as can happen on the many FTP
servers that only allow N persons to be logged in at the same time.
821
822 libcurl caches DNS name resolving results, to make lookups of a previ‐
823 ously looked up name a lot faster.
824
825 Other interesting details that improve performance for subsequent
826 requests may also be added in the future.
827
828 Each easy handle will attempt to keep the last few connections alive
829 for a while in case they are to be used again. You can set the size of
830 this "cache" with the CURLOPT_MAXCONNECTS option. Default is 5. There
831 is very seldom any point in changing this value, and if you think of
832 changing this it is often just a matter of thinking again.
833
To force your upcoming request to not use an already existing
connection (it will even close one first if there happens to be one
alive to the same host you're about to operate on), set
CURLOPT_FRESH_CONNECT to 1. In a similar spirit, you can also forbid the
connection used by the upcoming request from "lying" around and possibly
getting re-used afterwards by setting CURLOPT_FORBID_REUSE to 1.
840
841
843 When you use libcurl to do HTTP requests, it'll pass along a series of
844 headers automatically. It might be good for you to know and understand
845 these. You can replace or remove them by using the CURLOPT_HTTPHEADER
846 option.
847
848
Host
This header is required by HTTP 1.1 and even many 1.0 servers and
should be the name of the server we want to talk to. This includes the
port number if anything but default.


Accept
"*/*".


Expect
When doing POST requests, libcurl sets this header to "100-continue" to
ask the server for an "OK" message before it proceeds with sending the
data part of the post. If the POSTed data amount is deemed "small",
libcurl will not use this header.
861
862
864 There is an ongoing development today where more and more protocols are
865 built upon HTTP for transport. This has obvious benefits as HTTP is a
866 tested and reliable protocol that is widely deployed and has excellent
867 proxy-support.
868
869 When you use one of these protocols, and even when doing other kinds of
870 programming you may need to change the traditional HTTP (or FTP or...)
871 manners. You may need to change words, headers or various data.
872
873 libcurl is your friend here too.
874
875
876 CUSTOMREQUEST
877 If just changing the actual HTTP request keyword is what you
878 want, like when GET, HEAD or POST is not good enough for you,
879 CURLOPT_CUSTOMREQUEST is there for you. It is very simple to
880 use:
881
    curl_easy_setopt(easyhandle, CURLOPT_CUSTOMREQUEST, "MYOWNREQUEST");
884
885 When using the custom request, you change the request keyword of
886 the actual request you are performing. Thus, by default you make
887 a GET request but you can also make a POST operation (as
888 described before) and then replace the POST keyword if you want
889 to. You're the boss.
890
891
892 Modify Headers
893 HTTP-like protocols pass a series of headers to the server when
894 doing the request, and you're free to pass any amount of extra
895 headers that you think fit. Adding headers is this easy:
896
    struct curl_slist *headers = NULL; /* init to NULL is important */

    headers = curl_slist_append(headers, "Hey-server-hey: how are you?");
    headers = curl_slist_append(headers, "X-silly-content: yes");

    /* pass our list of custom made headers */
    curl_easy_setopt(easyhandle, CURLOPT_HTTPHEADER, headers);

    curl_easy_perform(easyhandle); /* transfer http */

    curl_slist_free_all(headers); /* free the header list */
908
909 ... and if you think some of the internally generated headers,
910 such as Accept: or Host: don't contain the data you want them to
911 contain, you can replace them by simply setting them too:
912
    headers = curl_slist_append(headers, "Accept: Agent-007");
    headers = curl_slist_append(headers, "Host: munged.host.line");
915
916
917 Delete Headers
918 If you replace an existing header with one with no contents, you
919 will prevent the header from being sent. For instance, if you
920 want to completely prevent the "Accept:" header from being sent,
921 you can disable it with code similar to this:
922
    headers = curl_slist_append(headers, "Accept:");
924
925 Both replacing and canceling internal headers should be done
926 with careful consideration and you should be aware that you may
927 violate the HTTP protocol when doing so.
928
929
930 Enforcing chunked transfer-encoding
931
If you make sure a request uses the custom header "Transfer-Encoding:
chunked" when doing a non-GET HTTP operation, libcurl will switch over
to "chunked" upload, even though the size of the data to upload might
be known. By default, libcurl usually switches over to chunked upload
automatically if the upload data size is unknown.
938
939
940 HTTP Version
941
All HTTP requests include the version number to tell the server which
version we support. libcurl speaks HTTP 1.1 by default. Some very old
servers don't like getting 1.1 requests, and when dealing with stubborn
old things like that, you can tell libcurl to use 1.0 instead by doing
something like this:
947
    curl_easy_setopt(easyhandle, CURLOPT_HTTP_VERSION,
                     CURL_HTTP_VERSION_1_0);
950
951
952 FTP Custom Commands
953
Not all protocols are HTTP-like, and thus the above may not help you
when you want to make, for example, your FTP transfers behave
differently.
957
Sending custom commands to an FTP server means that you need to send the
commands exactly as the FTP server expects them (RFC 959 is a good guide
here), and you can only use commands that work on the control-connection
alone. All kinds of commands that require data interchange and thus need
a data-connection must be left to libcurl's own judgement. Also be aware
that libcurl will do its very best to change directory to the target
directory before doing any transfer, so if you change directory (with
CWD or similar) you might confuse libcurl, and it might then not attempt
to transfer the file in the correct remote directory.
968
969 A little example that deletes a given file before an operation:
970
    headers = curl_slist_append(headers, "DELE file-to-remove");

    /* pass the list of custom commands to the handle */
    curl_easy_setopt(easyhandle, CURLOPT_QUOTE, headers);

    curl_easy_perform(easyhandle); /* transfer ftp data! */

    curl_slist_free_all(headers); /* free the header list */
979
If you instead want this operation (or chain of operations) to happen
_after_ the data transfer has taken place, pass the list to
curl_easy_setopt(3) with the option CURLOPT_POSTQUOTE, used the exact
same way.
984
The custom FTP commands will be issued to the server in the same order
they are added to the list, and if a command gets an error code
returned from the server, no more commands will be issued and libcurl
will bail out with an error code (CURLE_QUOTE_ERROR). Note that if you
use CURLOPT_QUOTE to send commands before a transfer, no transfer will
actually take place when a quote command has failed.
992
993 If you set the CURLOPT_HEADER to 1, you will tell libcurl to get
994 information about the target file and output "headers" about it.
995 The headers will be in "HTTP-style", looking like they do in
996 HTTP.
997
998 The option to enable headers or to run custom FTP commands may
999 be useful to combine with CURLOPT_NOBODY. If this option is set,
1000 no actual file content transfer will be performed.
1001
1002
1003 FTP Custom CUSTOMREQUEST
If you do want to list the contents of an FTP directory using your own
defined FTP command, CURLOPT_CUSTOMREQUEST will do just that. "NLST" is
the default one for listing directories but you're free to pass in your
idea of a good alternative.
1008
1009
1011 In the HTTP sense, a cookie is a name with an associated value. A
1012 server sends the name and value to the client, and expects it to get
1013 sent back on every subsequent request to the server that matches the
1014 particular conditions set. The conditions include that the domain name
1015 and path match and that the cookie hasn't become too old.
1016
In real-world cases, servers send new cookies to replace existing ones
to update them. Servers use cookies to "track" users and to keep
"sessions".
1020
1021 Cookies are sent from server to clients with the header Set-Cookie: and
1022 they're sent from clients to servers with the Cookie: header.
1023
To just send whatever cookie you want to a server, you can use
CURLOPT_COOKIE to set a cookie string like this:

    curl_easy_setopt(easyhandle, CURLOPT_COOKIE, "name1=var1; name2=var2;");
1029
1030 In many cases, that is not enough. You might want to dynamically save
1031 whatever cookies the remote server passes to you, and make sure those
1032 cookies are then used accordingly on later requests.
1033
One way to do this is to save all headers you receive in a plain file
and, when you make a request, tell libcurl to read the previous headers
to figure out which cookies to use. Set the header file to read cookies
from with CURLOPT_COOKIEFILE.
1038
1039 The CURLOPT_COOKIEFILE option also automatically enables the cookie
1040 parser in libcurl. Until the cookie parser is enabled, libcurl will not
1041 parse or understand incoming cookies and they will just be ignored.
1042 However, when the parser is enabled the cookies will be understood and
1043 the cookies will be kept in memory and used properly in subsequent
1044 requests when the same handle is used. Many times this is enough, and
1045 you may not have to save the cookies to disk at all. Note that the file
1046 you specify to CURLOPT_COOKIEFILE doesn't have to exist to enable the
1047 parser, so a common way to just enable the parser and not read any
1048 cookies is to use the name of a file you know doesn't exist.
1049
1050 If you would rather use existing cookies that you've previously
1051 received with your Netscape or Mozilla browsers, you can make libcurl
1052 use that cookie file as input. The CURLOPT_COOKIEFILE is used for that
1053 too, as libcurl will automatically find out what kind of file it is and
1054 act accordingly.
1055
Perhaps the most advanced cookie operation libcurl offers is saving the
entire internal cookie state back into a Netscape/Mozilla formatted
cookie file. We call that the cookie-jar. When you set a file name with
CURLOPT_COOKIEJAR, that file will be created and all received cookies
will be stored in it when curl_easy_cleanup(3) is called. This enables
cookies to get passed on properly between multiple handles without any
information getting lost.
1063
1064
1066 FTP transfers use a second TCP/IP connection for the data transfer.
1067 This is usually a fact you can forget and ignore but at times this fact
1068 will come back to haunt you. libcurl offers several different ways to
1069 customize how the second connection is being made.
1070
1071 libcurl can either connect to the server a second time or tell the
1072 server to connect back to it. The first option is the default and it is
1073 also what works best for all the people behind firewalls, NATs or IP-
1074 masquerading setups. libcurl then tells the server to open up a new
1075 port and wait for a second connection. This is by default attempted
1076 with EPSV first, and if that doesn't work it tries PASV instead. (EPSV
1077 is an extension to the original FTP spec and does not exist nor work on
1078 all FTP servers.)
1079
1080 You can prevent libcurl from first trying the EPSV command by setting
1081 CURLOPT_FTP_USE_EPSV to zero.
1082
In some cases, you will prefer to have the server connect back to you
for the second connection. This might be when the server is behind a
firewall and only allows connections on a single port. libcurl then
informs the remote server which IP address and port number to connect
to. This is done with the CURLOPT_FTPPORT option. If you set it to "-",
libcurl will use your system's "default IP address". If you want to use
a particular IP, you can set the full IP address, a host name to resolve
to an IP address or even a local network interface name that libcurl
will get the IP address from.
1092
1093 When doing the "PORT" approach, libcurl will attempt to use the EPRT
1094 and the LPRT before trying PORT, as they work with more protocols. You
1095 can disable this behavior by setting CURLOPT_FTP_USE_EPRT to zero.
1096
1097
Some protocols provide "headers", meta-data separated from the normal
data. These headers are by default not included in the normal data
stream, but you can make them appear in the data stream by setting
CURLOPT_HEADER to 1.

What might be even more useful is libcurl's ability to separate the
headers from the data and thus make the callbacks differ. You can for
example set a different pointer to pass to the ordinary write callback
by setting CURLOPT_WRITEHEADER.
1108
1109 Or, you can set an entirely separate function to receive the headers,
1110 by using CURLOPT_HEADERFUNCTION.
1111
1112 The headers are passed to the callback function one by one, and you can
1113 depend on that fact. It makes it easier for you to add custom header
1114 parsers etc.
1115
1116 "Headers" for FTP transfers equal all the FTP server responses. They
1117 aren't actually true headers, but in this case we pretend they are! ;-)
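
Because headers arrive one per call, a small parser is easy to write.
The following is a sketch of such a callback with the signature libcurl
expects for CURLOPT_HEADERFUNCTION; the struct header_stats and the
names on_header and has_prefix_nocase are made up for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Per-transfer state handed to the callback through its last argument. */
struct header_stats {
  size_t count;           /* lines delivered, status line included */
  char content_type[128]; /* value of the Content-Type header, if any */
};

/* Case-insensitive test whether 'buf' (len bytes) starts with 'prefix'. */
static int has_prefix_nocase(const char *buf, size_t len,
                             const char *prefix)
{
  size_t i;

  for(i = 0; prefix[i] != '\0'; i++) {
    if(i >= len ||
       tolower((unsigned char)buf[i]) != tolower((unsigned char)prefix[i]))
      return 0;
  }
  return 1;
}

/* Called once per header line with 'size * nmemb' bytes that are NOT
   NUL-terminated. Returning that byte count tells libcurl to go on. */
static size_t on_header(char *buffer, size_t size, size_t nmemb,
                        void *userdata)
{
  static const char name[] = "Content-Type:";
  struct header_stats *stats = userdata;
  size_t len = size * nmemb;

  stats->count++;
  if(has_prefix_nocase(buffer, len, name)) {
    const char *value = buffer + (sizeof(name) - 1);
    size_t value_len = len - (sizeof(name) - 1);

    /* trim leading blanks and the trailing CRLF */
    while(value_len > 0 && (*value == ' ' || *value == '\t')) {
      value++;
      value_len--;
    }
    while(value_len > 0 && (value[value_len - 1] == '\r' ||
                            value[value_len - 1] == '\n'))
      value_len--;
    if(value_len >= sizeof(stats->content_type))
      value_len = sizeof(stats->content_type) - 1;
    memcpy(stats->content_type, value, value_len);
    stats->content_type[value_len] = '\0';
  }
  return len;
}
```

Such a callback would be installed with CURLOPT_HEADERFUNCTION, and a
pointer to a struct header_stats passed with CURLOPT_WRITEHEADER (named
CURLOPT_HEADERDATA in newer libcurl versions).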
1118
1119
1121 [ curl_easy_getinfo ]
1122
1123
The libcurl project takes security seriously. The library is written
with caution and precautions are taken to mitigate many kinds of risks
encountered while operating with potentially malicious servers on the
Internet. It is a powerful library, however, which allows application
writers to make trade-offs between ease of writing and exposure to
potentially risky operations. If used the right way, you can use libcurl
to transfer data pretty safely.

Many applications are used in closed networks where users and servers
can be trusted, but many others are used on arbitrary servers and are
fed input from potentially untrusted users. What follows is a discussion
about some risks in the ways in which applications commonly use libcurl
and potential mitigations of those risks. It is by no means
comprehensive, but shows classes of attacks that robust applications
should consider. The Common Weakness Enumeration project at
http://cwe.mitre.org/ is a good reference for many of these and similar
types of weaknesses of which application writers should be aware.
1142
1143
1144 Command Lines
If you use a command line tool (such as curl) that uses libcurl, and
you give options to the tool on the command line, those options can
very likely get read by other users of your system when they use 'ps'
or other tools to list currently running processes.

To avoid this problem, never feed sensitive things to programs using
command line options. Write them to a protected file and use the -K
option instead.
1154
1155
.netrc
.netrc is a pretty handy file/feature that allows you to log in quickly
and automatically to frequently visited sites. The file contains
passwords in clear text and is a real security risk. In some cases, your
.netrc is also stored in a home directory that is NFS mounted or used on
another network based file system, so the clear text password will fly
through your network every time anyone reads that file!
1163
1164 To avoid this problem, don't use .netrc files and never store
1165 passwords in plain text anywhere.
1166
1167
1168 Clear Text Passwords
Many of the protocols libcurl supports send name and password
unencrypted as clear text (HTTP Basic authentication, FTP, TELNET etc).
It is very easy for anyone on your network or a network near yours to
just fire up a network analyzer tool and eavesdrop on your passwords.
Don't let the fact that HTTP Basic uses base64 encoded passwords fool
you. They may not look readable at first glance, but they are very
easily "deciphered" by anyone within seconds.
1177
1178 To avoid this problem, use HTTP authentication methods or other
1179 protocols that don't let snoopers see your password: HTTP with
1180 Digest, NTLM or GSS authentication, HTTPS, FTPS, SCP, SFTP and
1181 FTP-Kerberos are a few examples.
1182
1183
1184 Redirects
The CURLOPT_FOLLOWLOCATION option automatically follows HTTP redirects
sent by a remote server. These redirects can refer to any kind of URL,
not just HTTP. A redirect to a file: URL would cause libcurl to read
(or write) arbitrary files from the local filesystem. If the
application returns the data back to the user (as would happen in some
kinds of CGI scripts), an attacker could leverage this to read
otherwise forbidden data (e.g. file://localhost/etc/passwd).
1193
1194 If authentication credentials are stored in the ~/.netrc file,
1195 or Kerberos is in use, any other URL type (not just file:) that
1196 requires authentication is also at risk. A redirect such as
1197 ftp://some-internal-server/private-file would then return data
1198 even when the server is password protected.
1199
In the same way, if an unencrypted SSH private key has been configured
for the user running the libcurl application, SCP: or SFTP: URLs could
access password or private-key protected resources (e.g.
sftp://user@some-internal-server/etc/passwd).
1204
1205 The CURLOPT_REDIR_PROTOCOLS and CURLOPT_NETRC options can be
1206 used to mitigate against this kind of attack.
1207
A redirect can also specify a location available only on the machine
running libcurl, including servers hidden behind a firewall from the
attacker, e.g. http://127.0.0.1/ or
http://intranet/delete-stuff.cgi?delete=all or
tftp://bootp-server/pc-config-data

Apps can mitigate against this by disabling CURLOPT_FOLLOWLOCATION and
handling redirects themselves, sanitizing URLs as necessary.
Alternately, an app could leave CURLOPT_FOLLOWLOCATION enabled but set
CURLOPT_REDIR_PROTOCOLS and install a CURLOPT_OPENSOCKETFUNCTION
callback function in which addresses are sanitized before use.
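
For an app that handles redirects itself, one part of the sanitizing
could be a scheme allowlist like the sketch below. The function name
redirect_scheme_allowed and the choice of allowing only HTTP and HTTPS
are assumptions made for this example; the check says nothing about the
host the URL points to.

```c
#include <ctype.h>
#include <stddef.h>

/* Return 1 if 'url' starts with "http://" or "https://" (compared
   case-insensitively), 0 otherwise. Relative URLs are rejected, so
   resolve them against the current URL before calling this. */
static int redirect_scheme_allowed(const char *url)
{
  static const char *const allowed[] = { "http://", "https://" };
  size_t i, j;

  for(i = 0; i < sizeof(allowed) / sizeof(allowed[0]); i++) {
    const char *prefix = allowed[i];

    /* a NUL in 'url' never matches, so we never read past its end */
    for(j = 0; prefix[j] != '\0'; j++) {
      if(tolower((unsigned char)url[j]) != prefix[j])
        break;
    }
    if(prefix[j] == '\0')
      return 1;
  }
  return 0;
}
```

Targets such as file://localhost/etc/passwd or
ftp://some-internal-server/private-file are refused by this check,
while http://127.0.0.1/ still passes and needs an address check as
well.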
1220
1221
1222 Private Resources
A user who can control the DNS server of a domain being passed in
within a URL can change the address of the host to a local, private
address which the libcurl application will then use. For example, the
innocuous URL http://fuzzybunnies.example.com/ could actually resolve to
the IP address of a server behind a firewall, such as 127.0.0.1 or
10.1.2.3. Apps can mitigate against this by setting a
CURLOPT_OPENSOCKETFUNCTION and checking the address before a
connection.
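
The address check itself might look like the following sketch for IPv4.
The name ipv4_is_internal and the IPV4 macro are made up for this
example, the list of ranges is not exhaustive, and IPv6 is not covered.

```c
#include <stdint.h>

/* Build an IPv4 address in host byte order from its four octets. */
#define IPV4(a, b, c, d) \
  (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
   ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Return 1 if the IPv4 address (host byte order) lies in a loopback,
   private or link-local range, 0 otherwise. */
static int ipv4_is_internal(uint32_t addr)
{
  if((addr >> 24) == 127)                 /* 127.0.0.0/8 loopback */
    return 1;
  if((addr >> 24) == 10)                  /* 10.0.0.0/8 */
    return 1;
  if((addr >> 20) == ((172u << 4) | 1u))  /* 172.16.0.0/12 */
    return 1;
  if((addr >> 16) == ((192u << 8) | 168u)) /* 192.168.0.0/16 */
    return 1;
  if((addr >> 16) == ((169u << 8) | 254u)) /* 169.254.0.0/16 link-local */
    return 1;
  return 0;
}
```

Inside an open socket callback, the address to test would come from the
callback's address argument, converted to host byte order with ntohl()
first. The callback can refuse the connection by returning
CURL_SOCKET_BAD.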
1231
1232 All the malicious scenarios regarding redirected URLs apply just
1233 as well to non-redirected URLs, if the user is allowed to spec‐
1234 ify an arbitrary URL that could point to a private resource. For
1235 example, a web app providing a translation service might happily
1236 translate file://localhost/etc/passwd and display the result.
1237 Apps can mitigate against this with the CURLOPT_PROTOCOLS option
1238 as well as by similar mitigation techniques for redirections.
1239
1240 A malicious FTP server could in response to the PASV command
1241 return an IP address and port number for a server local to the
1242 app running libcurl but behind a firewall. Apps can mitigate
1243 against this by using the CURLOPT_FTP_SKIP_PASV_IP option or
1244 CURLOPT_FTPPORT.
1245
1246
1247 Uploads
1248 When uploading, a redirect can cause a local (or remote) file to
1249 be overwritten. Apps must not allow any unsanitized URL to be
1250 passed in for uploads. Also, CURLOPT_FOLLOWLOCATION should not
1251 be used on uploads. Instead, the app should handle redirects
1252 itself, sanitizing each URL first.
1253
1254
1255 Authentication
Use of CURLOPT_UNRESTRICTED_AUTH could cause authentication information
to be sent to an unknown second server. Apps can mitigate against this
by disabling CURLOPT_FOLLOWLOCATION and handling redirects themselves,
sanitizing where necessary.
1260
1261 Use of the CURLAUTH_ANY option to CURLOPT_HTTPAUTH could result
1262 in user name and password being sent in clear text to an HTTP
1263 server. Instead, use CURLAUTH_ANYSAFE which ensures that the
1264 password is encrypted over the network, or else fail the
1265 request.
1266
1267 Use of the CURLUSESSL_TRY option to CURLOPT_USE_SSL could result
1268 in user name and password being sent in clear text to an FTP
1269 server. Instead, use CURLUSESSL_CONTROL to ensure that an
1270 encrypted connection is used or else fail the request.
1271
1272
1273 Cookies
If cookies are enabled and cached, then a user could craft a URL which
performs some malicious action to a site whose authentication is
already stored in a cookie (e.g.
http://mail.example.com/delete-stuff.cgi?delete=all). Apps can mitigate
against this by disabling cookies or clearing them between requests.
1279
1280
1281 Dangerous URLs
SCP URLs can contain raw commands within the scp: URL, which is a side
effect of how the SCP protocol is designed (e.g.
scp://user:pass@host/a;date >/tmp/test;). Apps must not allow
unsanitized SCP: URLs to be passed in for downloads.
1286
1287
1288 Denial of Service
A malicious server could cause libcurl to effectively hang by sending a
trickle of data through, or even no data at all but just keeping the TCP
connection open. This could result in a denial-of-service attack. The
CURLOPT_TIMEOUT and/or CURLOPT_LOW_SPEED_LIMIT options can be used to
mitigate against this.
1295
1296 A malicious server could cause libcurl to effectively hang by
1297 starting to send data, then severing the connection without
1298 cleanly closing the TCP connection. The app could install a
1299 CURLOPT_SOCKOPTFUNCTION callback function and set the TCP
1300 SO_KEEPALIVE option to mitigate against this. Setting one of
1301 the timeout options would also work against this attack.
1302
1303 A malicious server could cause libcurl to download an infinite
1304 amount of data, potentially causing all of memory or disk to be
1305 filled. Setting the CURLOPT_MAXFILESIZE_LARGE option is not suf‐
1306 ficient to guard against this. Instead, the app should monitor
1307 the amount of data received within the write or progress call‐
1308 back and abort once the limit is reached.
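
The byte counting suggested above can be sketched as a write callback.
The struct capped_sink and the function name capped_write are made up
for this example, and the place where a real program would store the
data is only marked by a comment.

```c
#include <stddef.h>

/* State for one transfer, passed to the callback as 'userdata'. */
struct capped_sink {
  size_t received; /* bytes accepted so far, never above 'limit' */
  size_t limit;    /* maximum number of bytes to accept */
};

/* Write callback that refuses data beyond the limit. Returning a count
   different from 'size * nmemb' makes libcurl abort the transfer. */
static size_t capped_write(char *ptr, size_t size, size_t nmemb,
                           void *userdata)
{
  struct capped_sink *sink = userdata;
  size_t len = size * nmemb;

  (void)ptr; /* a real callback would store or process the data here */
  if(len > sink->limit - sink->received)
    return 0; /* over the limit: signal an error to libcurl */
  sink->received += len;
  return len;
}
```

The callback would be installed with CURLOPT_WRITEFUNCTION and the
struct passed with CURLOPT_WRITEDATA. An aborted transfer then makes
curl_easy_perform(3) return CURLE_WRITE_ERROR.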
1309
1310 A malicious HTTP server could cause an infinite redirection
1311 loop, causing a denial-of-service. This can be mitigated by
1312 using the CURLOPT_MAXREDIRS option.
1313
1314
1315 Arbitrary Headers
1316 User-supplied data must be sanitized when used in options like
1317 CURLOPT_USERAGENT, CURLOPT_HTTPHEADER, CURLOPT_POSTFIELDS and
1318 others that are used to generate structured data. Characters
1319 like embedded carriage returns or ampersands could allow the
1320 user to create additional headers or fields that could cause
1321 malicious transactions.
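
For header-like options, a minimal check might reject any value that
contains control characters, as in this sketch. The function name
header_value_is_safe is made up for this example, and the check does
not cover field separators such as ampersands in POST data.

```c
#include <stddef.h>

/* Return 1 if 'value' contains no control characters (CR and LF in
   particular, horizontal tab excepted) and so cannot start a new
   header line; 0 otherwise. */
static int header_value_is_safe(const char *value)
{
  const unsigned char *p = (const unsigned char *)value;

  for(; *p != '\0'; p++) {
    if((*p < 0x20 && *p != '\t') || *p == 0x7f)
      return 0;
  }
  return 1;
}
```

An app would run user input through such a check before building the
string it hands to CURLOPT_USERAGENT or curl_slist_append().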
1322
1323
1324 Server-supplied Names
A server can supply data which the application may, in some cases, use
as a file name. The curl command-line tool does this with
--remote-header-name, using the Content-Disposition: header to generate
a file name. An application could also use CURLINFO_EFFECTIVE_URL to
generate a file name from a server-supplied redirect URL. Special care
must be taken to sanitize such names to avoid the possibility of a
malicious server supplying one like "/etc/passwd", "\autoexec.bat" or
even ".bashrc".
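
One possible sanitizing step is to keep only the last path component
and refuse dot-files, as sketched below. The function name
safe_basename is made up for this example, and the policy is only one
of several reasonable ones.

```c
#include <stddef.h>

/* Reduce a server-supplied name to a bare file name. Returns a pointer
   into 'name' just past the last '/' or '\', or NULL if what remains
   is empty or starts with a dot (hidden files, "." and ".."). */
static const char *safe_basename(const char *name)
{
  const char *base = name;
  const char *p;

  for(p = name; *p != '\0'; p++) {
    if(*p == '/' || *p == '\\')
      base = p + 1;
  }
  if(*base == '\0' || *base == '.')
    return NULL;
  return base;
}
```

The result should still only be used inside a directory the application
controls, and without overwriting files that already exist there.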
1333
1334
1335 Server Certificates
1336 A secure application should never use the CURLOPT_SSL_VERIFYPEER
1337 option to disable certificate validation. There are numerous
1338 attacks that are enabled by apps that fail to properly validate
1339 server TLS/SSL certificates, thus enabling a malicious server to
1340 spoof a legitimate one. HTTPS without validated certificates is
1341 potentially as insecure as a plain HTTP connection.
1342
1343
1344 Showing What You Do
On a related issue, be aware that even in situations like when you have
problems with libcurl and ask someone for help, everything you reveal in
order to get the best possible help might also pose certain
security-related risks. Host names, user names, paths, operating system
specifics, etc (not to mention passwords of course) may in fact be used
by intruders to gain additional information about a potential target.

To avoid this problem, you must of course use your common sense. Often,
you can just edit out the sensitive data or just search/replace your
true information with faked data.
1356
1357
1359 The easy interface as described in detail in this document is a syn‐
1360 chronous interface that transfers one file at a time and doesn't return
1361 until it is done.
1362
1363 The multi interface, on the other hand, allows your program to transfer
1364 multiple files in both directions at the same time, without forcing you
1365 to use multiple threads. The name might make it seem that the multi
1366 interface is for multi-threaded programs, but the truth is almost the
1367 reverse. The multi interface can allow a single-threaded application
1368 to perform the same kinds of multiple, simultaneous transfers that
1369 multi-threaded programs can perform. It allows many of the benefits of
1370 multi-threaded transfers without the complexity of managing and syn‐
1371 chronizing many threads.
1372
1373 To use this interface, you are better off if you first understand the
1374 basics of how to use the easy interface. The multi interface is simply
1375 a way to make multiple transfers at the same time by adding up multiple
1376 easy handles into a "multi stack".
1377
1378 You create the easy handles you want and you set all the options just
1379 like you have been told above, and then you create a multi handle with
1380 curl_multi_init(3) and add all those easy handles to that multi handle
1381 with curl_multi_add_handle(3).
1382
1383 When you've added the handles you have for the moment (you can still
1384 add new ones at any time), you start the transfers by calling
1385 curl_multi_perform(3).
1386
curl_multi_perform(3) is asynchronous. It will only execute as little
as possible and then return control to your program. It is designed to
never block.
1390
1391 The best usage of this interface is when you do a select() on all pos‐
1392 sible file descriptors or sockets to know when to call libcurl again.
1393 This also makes it easy for you to wait and respond to actions on your
1394 own application's sockets/handles. You figure out what to select() for
1395 by using curl_multi_fdset(3), that fills in a set of fd_set variables
1396 for you with the particular file descriptors libcurl uses for the
1397 moment.
1398
When you then call select(), it'll return when one of the file handles
signals action and you then call curl_multi_perform(3) to allow libcurl
to do what it wants to do. Take note that libcurl also features some
time-out code, so we advise you to never use very long timeouts on
select() before you call curl_multi_perform(3), which thus should be
called unconditionally every now and then even if none of its file
descriptors have signaled ready. Another precaution you should use:
always call curl_multi_fdset(3) immediately before the select() call
since the current set of file descriptors may change when calling a curl
function.
1409
1410 If you want to stop the transfer of one of the easy handles in the
1411 stack, you can use curl_multi_remove_handle(3) to remove individual
1412 easy handles. Remember that removed easy handles should still be
1413 cleaned up with curl_easy_cleanup(3).
1414
1415 When a transfer within the multi stack has finished, the counter of
1416 running transfers (as filled in by curl_multi_perform(3)) will
1417 decrease. When the number reaches zero, all transfers are done.
1418
1419 curl_multi_info_read(3) can be used to get information about completed
1420 transfers. It then returns the CURLcode for each easy transfer, to
1421 allow you to figure out success on each individual transfer.
1422
1423
1425 [ seeding, passwords, keys, certificates, ENGINE, ca certs ]
1426
1427
1429 You can share some data between easy handles when the easy interface is
1430 used, and some data is shared automatically when you use the multi
1431 interface.
1432
1433 When you add easy handles to a multi handle, these easy handles will
1434 automatically share a lot of the data that otherwise would be kept on a
1435 per-easy handle basis when the easy interface is used.
1436
1437 The DNS cache is shared between handles within a multi handle, which
1438 makes subsequent name resolves faster. The connection pool, which is
1439 kept to better allow persistent connections and connection re-use, is
1440 shared as well. If you're using the easy interface, you can still share
1441 these between specific easy handles by using the share interface, see
1442 libcurl-share(3).
1443
1444 Some things are never shared automatically, not even within multi
1445 handles. Cookies are one example, so the only way to share them is with
1446 the share interface.
1447
1449 [1] libcurl 7.10.3 and later have the ability to switch over to
1450 chunked Transfer-Encoding in cases where HTTP uploads are done
1451 with data of an unknown size.
1452
1453 [2] This happens on Windows machines when libcurl is built and used
1454 as a DLL. However, you can still do this on Windows if you link
1455 with a static library.
1456
1457 [3] The curl-config tool is generated at build-time (on UNIX-like
1458 systems) and should be installed with the 'make install' or similar
1459 instruction that installs the library, header files, man
1460 pages etc.
1461
1462 [4] This behavior was different in versions before 7.17.0, where
1463 strings had to remain valid past the end of the
1464 curl_easy_setopt(3) call.
1465
1466
1467
1468libcurl 4 Mar 2009 libcurl-tutorial(3)