AnyEvent::Intro(3pm)

AnyEvent::Intro(3)    User Contributed Perl Documentation   AnyEvent::Intro(3)

NAME

       AnyEvent::Intro - an introductory tutorial to AnyEvent

Introduction to AnyEvent

       This is a tutorial that will introduce you to the features of AnyEvent.

       The first part introduces the core AnyEvent module (after swamping you
       a bit in evangelism), which might already provide all you ever need: If
       you are only interested in AnyEvent's event handling capabilities, read
       no further.

       The second part focuses on network programming using sockets, for which
       AnyEvent offers a lot of support you can use, and a lot of workarounds
       for portability quirks.

What is AnyEvent?

       If you don't care for the whys and want to see code, skip this section!

       AnyEvent is first of all just a framework to do event-based
       programming. Typically such frameworks are an all-or-nothing thing: If
       you use one such framework, you can't (easily, or even at all) use
       another in the same program.

       AnyEvent is different - it is a thin abstraction layer on top of other
       event loops, just like DBI is an abstraction of many different
       database APIs. Its main purpose is to move the choice of the underlying
       framework (the event loop) from the module author to the program author
       using the module.

       That means you can write code that uses events to control what it does,
       without forcing other code in the same program to use the same
       underlying framework as you do - i.e. you can create a Perl module that
       is event-based using AnyEvent, and users of that module can still
       choose between using Gtk2, Tk, Event (or run inside Irssi or
       rxvt-unicode) or any other supported event loop. AnyEvent even comes
       with its own pure-perl event loop implementation, so your code works
       regardless of other modules that might or might not be installed. The
       latter is important, as AnyEvent does not have any hard dependencies on
       other modules, which makes it easy to install, for example, when you
       lack a C compiler. No matter what environment, AnyEvent will just cope
       with it.

       A typical limitation of existing Perl modules such as Net::IRC is that
       they come with their own event loop: In Net::IRC, the program that uses
       it needs to start the event loop of Net::IRC. That means that one
       cannot integrate this module into a Gtk2 GUI for instance, as that
       module, too, enforces the use of its own event loop (namely Glib).

       Another example is LWP: it provides no event interface at all. It's a
       pure blocking HTTP (and FTP etc.) client library, which usually means
       that you either have to start another process or have to fork for an
       HTTP request, or use threads (e.g. Coro::LWP), if you want to do
       something else while waiting for the request to finish.

       The motivation behind these designs is often that a module doesn't want
       to depend on some complicated XS-module (Net::IRC), or that it doesn't
       want to force the user to use some specific event loop at all (LWP),
       out of fear of severely limiting the usefulness of the module: If your
       module requires Glib, it will not run in a Tk program.

       AnyEvent solves this dilemma, by not forcing module authors to either:

       - write their own event loop (because it guarantees the availability of
         an event loop everywhere - even on Windows with no extra modules
         installed).
       - choose one specific event loop (because AnyEvent works with most
         event loops available for Perl).

       If the module author uses AnyEvent for all his (or her) event needs (IO
       events, timers, signals, ...) then all other modules can just use his
       module and don't have to choose an event loop or adapt to his event
       loop. The choice of the event loop is ultimately made by the program
       author who uses all the modules and writes the main program. And even
       there he doesn't have to choose, he can just let AnyEvent choose the
       most efficient event loop available on the system.

       Read more about this in the main documentation of the AnyEvent module.
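
       As a small illustration of letting AnyEvent choose, the following
       sketch asks AnyEvent which backend it picked. It assumes nothing beyond
       the core module; the name that gets printed depends on which event loop
       modules happen to be installed, so no particular output is shown.

```perl
use strict;
use warnings;
use AnyEvent;

# force AnyEvent to pick an event loop now instead of on first use
my $model = AnyEvent::detect;

print "AnyEvent is using $model\n";
```

       A module author would normally never need to call this; it is shown
       here only to make the automatic choice visible.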

Introduction to Event-Based Programming

       So what exactly is programming using events? It quite simply means that
       instead of your code actively waiting for something, such as the user
       entering something on STDIN:

          $| = 1; print "enter your name> ";

          my $name = <STDIN>;

       You instead tell your event framework to notify you in the event of
       some data being available on STDIN, by using a callback mechanism:

          use AnyEvent;

          $| = 1; print "enter your name> ";

          my $name;

          my $wait_for_input = AnyEvent->io (
             fh   => \*STDIN, # which file handle to check
             poll => "r",     # which event to wait for ("r"ead data)
             cb   => sub {    # what callback to execute
                $name = <STDIN>; # read it
             }
          );

          # do something else here

       Looks more complicated, and surely is, but the advantage of using
       events is that your program can do something else instead of waiting
       for input (side note: combining AnyEvent with a thread package such as
       Coro can recoup much of the simplicity, effectively getting the best of
       both worlds).

       Waiting as done in the first example is also called "blocking" the
       process because you "block"/keep your process from executing anything
       else while you do so.

       The second example avoids blocking by only registering interest in a
       read event, which is fast and doesn't block your process. Only when
       read data is available will the callback be called, which can then
       proceed to read the data.

       The "interest" is represented by an object returned by "AnyEvent->io"
       called a "watcher" object - so named because it "watches" your file
       handle (or other event sources) for the event you are interested in.

       In the example above, we create an I/O watcher by calling the
       "AnyEvent->io" method. Disinterest in some event is simply expressed by
       forgetting about the watcher, for example, by "undef"'ing the only
       variable it is stored in. AnyEvent will automatically clean up the
       watcher if it is no longer used, much like Perl closes your file
       handles if you no longer use them anywhere.

       A short note on callbacks

       A common issue that hits people is the problem of passing parameters to
       callbacks. Programmers used to languages such as C or C++ are often
       used to a style where one passes the address of a function (a function
       reference) and some data value, e.g.:

          sub callback {
             my ($arg) = @_;

             $arg->method;
          }

          my $arg = ...;

          call_me_back_later \&callback, $arg;

       This is clumsy, as the place where behaviour is specified (when the
       callback is registered) is often far away from the place where
       behaviour is implemented. It also doesn't use Perl syntax to invoke the
       code. There is also an abstraction penalty to pay as one has to name
       the callback, which often is unnecessary and leads to nonsensical or
       duplicated names.

       In Perl, one can specify behaviour much more directly by using
       closures. Closures are code blocks that take a reference to the
       enclosing scope(s) when they are created. This means lexical variables
       in scope at the time of creating the closure can simply be used inside
       the closure:

          my $arg = ...;

          call_me_back_later sub { $arg->method };

       Under most circumstances, closures are faster, use fewer resources and
       result in much clearer code than the traditional approach. Faster,
       because parameter passing and storing them in local variables in Perl
       is relatively slow. Fewer resources, because closures take references
       to existing variables without having to create new ones, and clearer
       code because it is immediately obvious that the second example calls
       the "method" method when the callback is invoked.
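
       To see the closure style in action without any event loop, here is a
       minimal, self-contained sketch. The "call_me_back_later" function is
       only a hypothetical stand-in (it simply queues the callbacks and the
       program runs them by hand); it is not part of AnyEvent.

```perl
use strict;
use warnings;

# hypothetical stand-in for an event framework: just remember the callback
my @pending;
sub call_me_back_later {
   my ($cb) = @_;
   push @pending, $cb;
}

my @log;

for my $name (qw(alice bob)) {
   # each closure captures its own copy of $name, no parameter needed
   call_me_back_later (sub { push @log, "hello, $name" });
}

# later: run the callbacks, none of them receives any argument
$_->() for @pending;

print "$_\n" for @log;
# prints "hello, alice" and then "hello, bob"
```

       Each callback still knows which $name it was created for, even though
       the loop that created it has long finished.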

       Apart from these, the strongest argument for using closures with
       AnyEvent is that AnyEvent does not allow passing parameters to the
       callback, so closures are the only way to achieve that in most cases
       :->

       A hint on debugging

       AnyEvent does, by default, not do any argument checking. This can lead
       to strange and unexpected results, especially if you are still learning
       your way around AnyEvent.

       AnyEvent supports a special "strict" mode - off by default - which does
       very strict argument checking, at the expense of being somewhat slower.
       During development, however, this mode is very useful.

       You can enable this strict mode either by setting the environment
       variable "PERL_ANYEVENT_STRICT" to a true value:

          PERL_ANYEVENT_STRICT=1 perl test.pl

       Or you can write "use AnyEvent::Strict" in your program, which has the
       same effect (do not do this in production, however).
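
       The following sketch shows what strict mode is good for. It creates a
       timer watcher (one of AnyEvent's other watcher types) but deliberately
       leaves out its mandatory "after" parameter. The exact wording of the
       error message is not reproduced here, as it may differ between AnyEvent
       versions; the sketch only assumes that strict mode rejects the call.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Strict; # development aid only

# deliberately broken: the mandatory "after" parameter is missing
my $watcher = eval {
   AnyEvent->timer (cb => sub { print "never reached\n" });
};

my $error = $@;

print "strict mode complained: $error" if $error;
```

       Without strict mode, a mistake like this would typically go unnoticed
       and merely lead to surprising behaviour later.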

   Condition Variables
       Back to the I/O watcher example: The code is not yet a fully working
       program, and will not work as-is. The reason is that your callback will
       not be invoked out of the blue, you have to run the event loop. Also,
       event-based programs sometimes have to block, too: when there simply
       is nothing else to do and everything waits for some events, the program
       needs to block the process until new events arrive.

       In AnyEvent, this is done using condition variables. Condition
       variables are named "condition variables" because they represent a
       condition that is initially false and needs to be fulfilled.

       You can also call them "merge points", "sync points", "rendezvous
       ports" or even callbacks and many other things (and they are often
       called like this in other frameworks). The important point is that you
       can create them freely and later wait for them to become true.

       Condition variables have two sides - one side is the "producer" of the
       condition (whatever code detects and flags the condition), the other
       side is the "consumer" (the code that waits for that condition).

       In our example in the previous section, the producer is the event
       callback and there is no consumer yet - let's change that right now:

          use AnyEvent;

          $| = 1; print "enter your name> ";

          my $name;

          my $name_ready = AnyEvent->condvar;

          my $wait_for_input = AnyEvent->io (
             fh   => \*STDIN,
             poll => "r",
             cb   => sub {
                $name = <STDIN>;
                $name_ready->send;
             }
          );

          # do something else here

          # now wait until the name is available:
          $name_ready->recv;

          undef $wait_for_input; # watcher no longer needed

          print "your name is $name\n";

       This program creates an AnyEvent condvar by calling the
       "AnyEvent->condvar" method. It then creates a watcher as usual, but
       inside the callback it "send"'s the $name_ready condition variable,
       which causes whoever is waiting on it to continue.

       The "whoever" in this case is the code that follows, which calls
       "$name_ready->recv": The producer calls "send", the consumer calls
       "recv".

       If there is no $name available yet, then the call to
       "$name_ready->recv" will halt your program until the condition becomes
       true.

       As the names "send" and "recv" imply, you can actually send and receive
       data using this, for example, the above code could also be written like
       this, without an extra variable to store the name in:

          use AnyEvent;

          $| = 1; print "enter your name> ";

          my $name_ready = AnyEvent->condvar;

          my $wait_for_input = AnyEvent->io (
             fh => \*STDIN, poll => "r",
             cb => sub { $name_ready->send (scalar <STDIN>) }
          );

          # do something else here

          # now wait and fetch the name
          my $name = $name_ready->recv;

          undef $wait_for_input; # watcher no longer needed

          print "your name is $name\n";

       You can pass any number of arguments to "send", and every call to
       "recv" will return them.
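
       A minimal sketch of sending more than one value follows. To keep it
       free of any I/O, the producer calls "send" before the consumer calls
       "recv"; in that case "recv" has nothing to wait for and returns
       immediately. The variable names are made up for this illustration.

```perl
use strict;
use warnings;
use AnyEvent;

my $user_ready = AnyEvent->condvar;

# producer side: flag the condition and attach two values
$user_ready->send ("alice", 42);

# consumer side: returns at once, because send already happened
my ($name, $uid) = $user_ready->recv;

# a second recv on the same condvar yields the same values again
my @again = $user_ready->recv;

print "$name has uid $uid\n";             # alice has uid 42
print "second recv returned: @again\n";   # second recv returned: alice 42
```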

   The "main loop"
       Most event-based frameworks have something called a "main loop" or
       "event loop run function" or something similar.

       Just like "recv" in AnyEvent, these functions need to be called
       eventually so that your event loop has a chance of actually looking for
       those events you are interested in.

       For example, in a Gtk2 program, the above example could also be written
       like this:

          use Gtk2 -init;
          use AnyEvent;

          ############################################
          # create a window and some label

          my $window = Gtk2::Window->new ("toplevel");
          $window->add (my $label = Gtk2::Label->new ("soon replaced by name"));

          $window->show_all;

          ############################################
          # do our AnyEvent stuff

          $| = 1; print "enter your name> ";

          my $wait_for_input = AnyEvent->io (
             fh => \*STDIN, poll => "r",
             cb => sub {
                # set the label
                $label->set_text (scalar <STDIN>);
                print "enter another name> ";
             }
          );

          ############################################
          # Now enter Gtk2's event loop

          Gtk2->main;

       No condition variable anywhere in sight - instead, we just read a line
       from STDIN and replace the text in the label. In fact, since nobody
       "undef"'s $wait_for_input you can enter multiple lines.

       Instead of waiting for a condition variable, the program enters the
       Gtk2 main loop by calling "Gtk2->main", which will block the program
       and wait for events to arrive.

       This also shows that AnyEvent is quite flexible - you didn't have to do
       anything to make the AnyEvent watcher use Gtk2 (actually Glib) - it
       just worked.

       Admittedly, the example is a bit silly - who would want to read names
       from standard input in a Gtk+ application? But imagine that instead of
       doing that, you would make an HTTP request in the background and display
       its results. In fact, with event-based programming you can make many
       HTTP requests in parallel in your program and still provide feedback to
       the user and stay interactive.

       And in the next part you will see how to do just that - by implementing
       an HTTP request, on our own, with the utility modules AnyEvent comes
       with.

       Before that, however, let's briefly look at how you would write your
       program using only AnyEvent, without ever calling some other event
       loop's run function.

       In the example using condition variables, we used those to start
       waiting for events, and in fact, condition variables are the solution:

          my $quit_program = AnyEvent->condvar;

          # create AnyEvent watchers (or not) here

          $quit_program->recv;

       If any of your watcher callbacks decide to quit (this is often called
       an "unloop" in other frameworks), they can simply call
       "$quit_program->send". Of course, they could also decide not to and
       simply call "exit" instead, or they could decide not to quit, ever
       (e.g. in a long-running daemon program).

       If you don't need some clean quit functionality and just want to run
       the event loop, you can simply do this:

          AnyEvent->condvar->recv;

       And this is, in fact, closest to the idea of a main loop run function
       that AnyEvent offers.

   Timers and other event sources
       So far, we have only used I/O watchers. These are useful mainly to find
       out whether a socket has data to read, or space to write more data. On
       sane operating systems this also works for console windows/terminals
       (typically on standard input), serial lines, all sorts of other
       devices, basically almost everything that has a file descriptor but
       isn't a file itself. (As usual, "sane" excludes Windows - on that
       platform you would need different functions for all of these,
       complicating code immensely - think "socket only" on Windows).

       However, I/O is not everything - the second most important event source
       is the clock. For example when doing an HTTP request you might want to
       time out when the server doesn't answer within some predefined amount
       of time.

       In AnyEvent, timer event watchers are created by calling the
       "AnyEvent->timer" method:

          use AnyEvent;

          my $cv = AnyEvent->condvar;

          my $wait_one_and_a_half_seconds = AnyEvent->timer (
             after => 1.5,  # after how many seconds to invoke the cb?
             cb    => sub { # the callback to invoke
                $cv->send;
             },
          );

          # can do something else here

          # now wait till our time has come
          $cv->recv;

       Unlike I/O watchers, timers are only interested in the number of
       seconds they have to wait. When (at least) that amount of time has
       passed, AnyEvent will invoke your callback.

       Unlike I/O watchers, which will call your callback as many times as
       there is data available, timers are normally one-shot: after they have
       "fired" once and invoked your callback, they are dead and no longer do
       anything.

       To get a repeating timer, such as a timer firing roughly once per
       second, you can specify an "interval" parameter:

          my $once_per_second = AnyEvent->timer (
             after => 0,    # first invoke ASAP
             interval => 1, # then invoke every second
             cb    => sub { # the callback to invoke
                $cv->send;
             },
          );
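
       As a complete, runnable sketch of a repeating timer, the following
       program counts three ticks and then stops. The short interval of 0.1
       seconds is chosen only to keep the demonstration quick. The watcher
       variable is declared in a statement of its own so that the callback
       can refer to it and cancel the timer by "undef"'ing it.

```perl
use strict;
use warnings;
use AnyEvent;

my $finished = AnyEvent->condvar;
my $ticks    = 0;

my $ticker; $ticker = AnyEvent->timer (
   after    => 0,   # first tick as soon as possible
   interval => 0.1, # then roughly every tenth of a second
   cb       => sub {
      $ticks++;

      if ($ticks == 3) {
         undef $ticker;   # forget the watcher, which stops the timer
         $finished->send; # tell the consumer we are done
      }
   },
);

$finished->recv;

print "timer fired $ticks times\n"; # timer fired 3 times
```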

       More esoteric sources

       AnyEvent also has some other, more esoteric event sources you can tap
       into: signal, child and idle watchers.

       Signal watchers can be used to wait for "signal events", which simply
       means your process was sent a signal (such as "SIGTERM" or "SIGUSR1").

       Child-process watchers wait for a child process to exit. They are
       useful when you fork a separate process and need to know when it exits,
       but you do not want to wait for that by blocking.

       Idle watchers invoke their callback when the event loop has handled all
       outstanding events, polled for new events and didn't find any, i.e.,
       when your process is otherwise idle. They are useful if you want to do
       some non-trivial data processing that can be done when your program
       doesn't have anything better to do.

       All these watcher types are described in detail in the main AnyEvent
       manual page.
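
       A minimal sketch of a child watcher might look as follows. It assumes
       a platform with a working "fork" (i.e. not native Windows), and it
       calls "AnyEvent::detect" before forking so that the event loop is
       already initialised when the child exits. The exit status 3 is an
       arbitrary value picked for this illustration.

```perl
use strict;
use warnings;
use AnyEvent;

my $child_done = AnyEvent->condvar;

# make sure the event loop is initialised before we fork
AnyEvent::detect;

my $pid = fork;
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
   # child process: a real program would do some work here
   exit 3;
}

# parent: watch for the child instead of blocking in waitpid
my $child_watcher = AnyEvent->child (
   pid => $pid,
   cb  => sub {
      my ($exited_pid, $status) = @_;

      $child_done->send ($status >> 8); # the exit code is in the high byte
   },
);

my $exit_code = $child_done->recv;

print "child exited with code $exit_code\n"; # child exited with code 3
```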

       Sometimes you also need to know what the current time is:
       "AnyEvent->now" returns the time the event toolkit uses to schedule
       relative timers, and is usually what you want. It is often cached
       (which means it can be a bit outdated). If that matters, you can use
       the more costly "AnyEvent->time" method, which will ask your operating
       system for the current time, which is slower, but also more up to date.
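
       The two methods can be compared with a few lines of code. This is only
       a sketch: how far the two values drift apart depends on the event loop
       in use and on how long ago it last updated its cached time, so no
       particular output is shown.

```perl
use strict;
use warnings;
use AnyEvent;

my $loop_time  = AnyEvent->now;  # possibly cached by the event loop
my $clock_time = AnyEvent->time; # freshly queried from the system

printf "event loop time: %.3f\n", $loop_time;
printf "system time:     %.3f\n", $clock_time;
printf "difference:      %.6f seconds\n", $clock_time - $loop_time;
```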

Network programming and AnyEvent

       So far you have seen how to register event watchers and handle events.

       This is a great foundation to write network clients and servers, and
       might be all that your module (or program) ever requires, but writing
       your own I/O buffering again and again becomes tedious, not to mention
       that it attracts errors.

       While the core AnyEvent module is still small and self-contained, the
       distribution comes with some very useful utility modules such as
       AnyEvent::Handle, AnyEvent::DNS and AnyEvent::Socket. These can make
       your life as a non-blocking network programmer a lot easier.

       Here is a quick overview of these three modules:

   AnyEvent::DNS
       This module allows fully asynchronous DNS resolution. It is used mainly
       by AnyEvent::Socket to resolve hostnames and service ports for you, but
       is a great way to do other DNS resolution tasks, such as reverse
       lookups of IP addresses for log files.

   AnyEvent::Handle
       This module handles non-blocking IO on (socket-, pipe- etc.) file
       handles in an event based manner. It provides a wrapper object around
       your file handle that provides queueing and buffering of incoming and
       outgoing data for you.

       It also implements the most common data formats, such as text lines, or
       fixed and variable-width data blocks.
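
       To give a first impression of this buffering, here is a small sketch
       that sends two text lines through a local socket pair and reads them
       back line by line. The socket pair (created with "portable_socketpair"
       from AnyEvent::Util) merely stands in for a real network connection,
       and the error handling is reduced to the bare minimum.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Handle;
use AnyEvent::Util qw(portable_socketpair);

# a connected pair of sockets stands in for a real network connection
my ($left, $right) = portable_socketpair ()
   or die "socketpair failed: $!";

my $done = AnyEvent->condvar;
my @lines;

my $writer = AnyEvent::Handle->new (
   fh       => $left,
   on_error => sub { $done->send }, # give up on any error
);

my $reader = AnyEvent::Handle->new (
   fh       => $right,
   on_error => sub { $done->send },
);

# queue two "read one line" requests, each callback gets one line
# with the line ending already removed
for (1 .. 2) {
   $reader->push_read (line => sub {
      my ($handle, $line) = @_;

      push @lines, $line;
      $done->send if @lines == 2;
   });
}

# buffered, non-blocking write of both lines in one go
$writer->push_write ("first line\012second line\012");

$done->recv;

print "got: $_\n" for @lines;
# prints "got: first line" and then "got: second line"
```

       Note that the program never calls "sysread" or "syswrite" itself and
       never has to split the incoming data into lines by hand.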

   AnyEvent::Socket
       This module provides you with functions that handle socket creation and
       IP address magic. The two main functions are "tcp_connect" and
       "tcp_server". The former will connect a (streaming) socket to an
       internet host for you and the latter will make a server socket for you,
       to accept connections.

       This module also comes with transparent IPv6 support, this means: If
       you write your programs with this module, you will be IPv6 ready
       without doing anything special.

       It also works around a lot of portability quirks (especially on the
       Windows platform), which makes it even easier to write your programs in
       a portable way (did you know that Windows uses different error codes
       for all socket functions and that Perl does not know about these? That
       "Unknown error 10022" (which is "WSAEINVAL") can mean that our
       "connect" call was successful? That unsuccessful TCP connects might
       never be reported back to your program? That "WSAEINPROGRESS" means
       your "connect" call was ignored instead of being in progress?
       AnyEvent::Socket works around all of these Windows/Perl bugs for you).
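
       The two main functions can be tried out without any remote host. The
       sketch below starts a tiny server on the loopback address, lets the
       operating system pick a free port, and then connects to it from the
       same program. The greeting text and all variable names are made up for
       this illustration, and the sketch assumes that IPv4 loopback
       connections are possible on the machine.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Socket;

my $done = AnyEvent->condvar;
my $port;

# server side: bind to the loopback address, let the OS choose a free port
my $server_guard = tcp_server "127.0.0.1", undef, sub {
   my ($client_fh, $peer_host, $peer_port) = @_;

   syswrite $client_fh, "hello from the server\015\012";
   # $client_fh goes out of scope here, which closes the connection
}, sub {
   my ($listen_fh, $bound_host, $bound_port) = @_;

   $port = $bound_port; # remember which port we actually got
   0                    # 0 means: use the default listen backlog
};

# client side: connect, then collect data until the server closes
tcp_connect "127.0.0.1", $port, sub {
   my ($fh) = @_
      or return $done->send;

   my $reply = "";

   my $read_watcher; $read_watcher = AnyEvent->io (
      fh   => $fh,
      poll => "r",
      cb   => sub {
         my $len = sysread $fh, $reply, 1024, length $reply;
         return if $len; # got some data, wait for more

         undef $read_watcher; # end-of-file or error: stop watching
         $done->send ($reply);
      },
   );
};

my $reply = $done->recv;

print "server said: $reply";
```

       The server keeps listening for as long as $server_guard is alive;
       dropping that variable would shut the listening socket down.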

   Implementing a parallel finger client with non-blocking connects and
       AnyEvent::Socket
       The finger protocol is one of the simplest protocols in use on the
       internet. Or in use in the past, as almost nobody uses it anymore.

       It works by connecting to the finger port on another host, writing a
       single line with a user name and then reading the finger response, as
       specified by that user. OK, RFC 1288 specifies a vastly more complex
       protocol, but it basically boils down to this:

          # telnet kernel.org finger
          Trying 204.152.191.37...
          Connected to kernel.org (204.152.191.37).
          Escape character is '^]'.

          The latest stable version of the Linux kernel is: [...]
          Connection closed by foreign host.

       So let's write a little AnyEvent function that makes a finger request:

          use AnyEvent;
          use AnyEvent::Socket;

          sub finger($$) {
             my ($user, $host) = @_;

             # use a condvar to return results
             my $cv = AnyEvent->condvar;

             # first, connect to the host
             tcp_connect $host, "finger", sub {
                # the callback receives the socket handle - or nothing
                my ($fh) = @_
                   or return $cv->send;

                # now write the username
                syswrite $fh, "$user\015\012";

                my $response;

                # register a read watcher
                my $read_watcher; $read_watcher = AnyEvent->io (
                   fh   => $fh,
                   poll => "r",
                   cb   => sub {
                      my $len = sysread $fh, $response, 1024, length $response;

                      if ($len <= 0) {
                         # we are done, or an error occurred, let's ignore the latter
                         undef $read_watcher; # no longer interested
                         $cv->send ($response); # send results
                      }
                   },
                );
             };

             # pass $cv to the caller
             $cv
          }

       That's a mouthful! Let's dissect this function a bit, first the overall
       function and execution flow:

          sub finger($$) {
             my ($user, $host) = @_;

             # use a condvar to return results
             my $cv = AnyEvent->condvar;

             # first, connect to the host
             tcp_connect $host, "finger", sub {
                ...
             };

             $cv
          }

       This isn't too complicated, just a function with two parameters, that
       creates a condition variable, returns it, and while it does that,
       initiates a TCP connect to $host. The condition variable will be used
       by the caller to receive the finger response, but one could equally
       well pass a third argument, a callback, to the function.
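
       A caller that prefers callbacks does not even need a changed function:
       condition variables have a "cb" method that registers a callback to be
       invoked once "send" has been called. The sketch below demonstrates
       this with a hypothetical "delayed_answer" function, which merely stands
       in for "finger" and delivers a fixed string after a short delay.

```perl
use strict;
use warnings;
use AnyEvent;

# hypothetical stand-in for finger(): returns a condvar that
# receives its result a little later
sub delayed_answer {
   my ($answer) = @_;

   my $cv = AnyEvent->condvar;

   my $timer; $timer = AnyEvent->timer (
      after => 0.1,
      cb    => sub {
         undef $timer;        # one-shot, no longer needed
         $cv->send ($answer); # deliver the result
      },
   );

   $cv
}

my $all_done = AnyEvent->condvar;
my $result;

# consumer side, callback style: no blocking recv on the result condvar
delayed_answer ("finger response")->cb (sub {
   my ($cv) = @_;

   $result = $cv->recv; # does not block, the value is already there
   $all_done->send;
});

$all_done->recv; # the only place where the program waits

print "received: $result\n"; # received: finger response
```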

       Since we are programming event'ish, we do not wait for the connect to
       finish - it could block the program for a minute or longer!

       Instead, we pass the callback it should invoke when the connect is done
       to "tcp_connect". If it is successful, that callback gets called with
       the socket handle as first argument, otherwise, nothing will be passed
       to our callback. The important point is that it will always be called
       as soon as the outcome of the TCP connect is known.

       This style of programming is also called "continuation style": the
       "continuation" is simply the way the program continues - normally at
       the next line after some statement (the exception is loops or things
       like "return"). When we are interested in events, however, we instead
       specify the "continuation" of our program by passing a closure, which
       makes that closure the "continuation" of the program.

       The "tcp_connect" call is like saying "return now, and when the
       connection is established or it failed, continue there".

       Now let's look at the callback/closure in more detail:

                # the callback receives the socket handle - or nothing
                my ($fh) = @_
                   or return $cv->send;

       The first thing the callback does is indeed save the socket handle in
       $fh. When there was an error (no arguments), then our instinct as
       expert Perl programmers would tell us to "die":

                my ($fh) = @_
                   or die "$host: $!";

       While this would give good feedback to the user (if he happens to watch
       standard error), our program would probably stop working here, as we
       never report the results to anybody, certainly not the caller of our
       "finger" function, and most event loops continue even after a "die"!

       This is why we instead "return", but also call "$cv->send" without any
       arguments to signal to the condvar consumer that something bad has
       happened. The return value of "$cv->send" is irrelevant, as is the
       return value of our callback. The "return" statement is simply used for
       the side effect of, well, returning immediately from the callback.
       Checking for errors and handling them this way is very common, which is
       why this compact idiom is so handy.
646
647       As the next step in the finger protocol, we send the username to the
648       finger daemon on the other side of our connection (the kernel.org
649       finger service doesn't actually wait for a username, but the net is
650       running out of finger servers fast):
651
652                syswrite $fh, "$user\015\012";
653
654       Note that this isn't 100% clean socket programming - the socket could,
655       for whatever reasons, not accept our data. When writing a small amount
656       of data like in this example it doesn't matter, as a socket buffer is
657       almost always big enough for a mere "username", but for real-world
658       cases you might need to implement some kind of write buffering - or use
659       AnyEvent::Handle, which handles these matters for you, as shown in the
660       next section.
661
662       What we do have to do is to implement our own read buffer - the
663       response data could arrive late or in multiple chunks, and we cannot
664       just wait for it (event-based programming, you know?).
665
666       To do that, we register a read watcher on the socket which waits for
667       data:
668
669                my $read_watcher; $read_watcher = AnyEvent->io (
670                   fh   => $fh,
671                   poll => "r",
672
673       There is a trick here, however: the read watcher isn't stored in a
674       global variable, but in a local one - if the callback returns, it would
675       normally destroy the variable and its contents, which would in turn
676       unregister our watcher.
677
678       To avoid that, we "undef"ine the variable in the watcher callback. This
679       means that, when the "tcp_connect" callback returns, perl thinks (quite
680       correctly) that the read watcher is still in use - namely in the
681       callback, and thus keeps it alive even if nothing else in the program
682       refers to it anymore (it is much like Baron MA~Xnchhausen keeping
683       himself from dying by pulling himself out of a swamp).
684
685       The trick, however, is that instead of:
686
687          my $read_watcher = AnyEvent->io (...
688
689       The program does:
690
691          my $read_watcher; $read_watcher = AnyEvent->io (...
692
       The reason for this is a quirk in the way Perl works: variable names
       declared with "my" only become visible in the statement following the
       declaration. If the whole "AnyEvent->io" call, including the callback,
       were done in a single statement, the callback could not refer to the
       $read_watcher variable to undefine it, so it is done in two statements.
698
699       Whether you'd want to format it like this is of course a matter of
700       style, this way emphasizes that the declaration and assignment really
701       are one logical statement.
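
       As a small illustration of this quirk, here is a sketch that uses
       only core Perl. The $countdown and $single closures are made up for
       this example and merely stand in for the watcher callback. The
       two-statement form can refer to its own variable, while the
       one-statement form does not even compile under "use strict":

```perl
use strict;
use warnings;

# Two statements: the closure can see $countdown, because the variable
# was declared in the statement before the assignment.
my $countdown; $countdown = sub {
   my ($n) = @_;
   return $n <= 0 ? "done" : $countdown->($n - 1);
};

my $two_statement_result = $countdown->(3);

# One statement: inside the sub, $single is not declared yet, so strict
# rejects it at compile time. The code is compiled from a string so that
# the failure can be caught instead of aborting this script.
my $one_statement_ok = eval q{
   use strict;
   my $single = sub { $single->() };
   1
};
my $compile_error = $@;

# break the self-reference again, just like the watcher callback does
undef $countdown;
```

       The final "undef" matters for the same reason as in the finger
       client: a closure that refers to the variable holding it keeps
       itself alive until that reference is removed.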
702
703       The callback itself calls "sysread" for as many times as necessary,
704       until "sysread" returns either an error or end-of-file:
705
706                   cb   => sub {
707                      my $len = sysread $fh, $response, 1024, length $response;
708
709                      if ($len <= 0) {
710
       Note that "sysread" has the ability to append data it reads to a
       scalar, by specifying an offset, a feature we make good use of in
       this example.
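
       The appending behaviour can be tried in isolation. The following
       sketch uses a pipe instead of a socket and a plain blocking loop
       instead of a watcher callback (both are simplifications for the
       sake of a self-contained example, as is the tiny read size of 8
       bytes, chosen to force several reads):

```perl
use strict;
use warnings;

pipe (my $reader, my $writer) or die "pipe: $!";

# write the "response" in two chunks, then close to signal end-of-file
syswrite $writer, "Login: root\015\012";
syswrite $writer, "Name: Charlie Root\015\012";
close $writer;

my $response = "";
my $reads    = 0;

while (1) {
   # the 4th argument is the offset: new data is stored after the
   # data that is already in $response, i.e. it is appended
   my $len = sysread $reader, $response, 8, length $response;

   die "sysread: $!" unless defined $len; # undef signals an error
   last if $len == 0;                     # 0 signals end-of-file

   $reads++;
}
```

       In the finger client, each iteration of this loop corresponds to
       one invocation of the read watcher callback.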
714
       When "sysread" indicates we are done, the callback "undef"ines the
       watcher and then "send"s the response data to the condition variable.
       All this has the following effects:
718
719       Undefining the watcher destroys it, as our callback was the only one
720       still having a reference to it. When the watcher gets destroyed, it
721       destroys the callback, which in turn means the $fh handle is no longer
722       used, so that gets destroyed as well. The result is that all resources
723       will be nicely cleaned up by perl for us.
724
725       Using the finger client
726
727       Now, we could probably write the same finger client in a simpler way if
728       we used "IO::Socket::INET", ignored the problem of multiple hosts and
729       ignored IPv6 and a few other things that "tcp_connect" handles for us.
730
       But the main advantage is that we can not only run this finger function
       in the background, we can even run multiple sessions in parallel, like
       this:
734
735          my $f1 = finger "trouble", "noc.dfn.de"; # check for trouble tickets
736          my $f2 = finger "1736"   , "noc.dfn.de"; # fetch ticket 1736
737          my $f3 = finger "hpa"    , "kernel.org"; # finger hpa
738
739          print "trouble tickets:\n"     , $f1->recv, "\n";
740          print "trouble ticket #1736:\n", $f2->recv, "\n";
741          print "kernel release info: "  , $f3->recv, "\n";
742
       It doesn't look like it, but in fact all three requests run in
       parallel. The code waits for the first finger request to finish first,
       but that doesn't keep it from executing them in parallel: when the
       first "recv" call sees that the data isn't ready yet, it serves events
       for all three requests automatically, until the first request has
       finished.
748
749       The second "recv" call might either find the data is already there, or
750       it will continue handling events until that is the case, and so on.
751
       By taking advantage of network latencies, which allow us to serve
       other requests and events while we wait for an event on one socket, the
       overall time to do these three requests will be greatly reduced:
       typically, all three are done in the time the slowest of them needs
       to finish.
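
       To see this effect without depending on real finger servers, here
       is a sketch in which a timer stands in for the network latency. The
       "fake_request" function and its delays are invented for this
       example; only "AnyEvent->timer", "AnyEvent->condvar" and
       "AnyEvent->time" are part of AnyEvent:

```perl
use strict;
use warnings;
use AnyEvent;

# hypothetical stand-in for "finger": answers after $delay seconds
sub fake_request {
   my ($name, $delay) = @_;

   my $cv = AnyEvent->condvar;

   my $timer; $timer = AnyEvent->timer (after => $delay, cb => sub {
      undef $timer; # referring to $timer keeps it alive until it fires
      $cv->send ("$name done");
   });

   $cv
}

my $start = AnyEvent->time;

# start all three "requests" before waiting for any of them
my @cvs;
push @cvs, fake_request (@$_)
   for ["a", 0.3], ["b", 0.1], ["c", 0.2];

# the first recv waits about 0.3s and serves the other timers meanwhile
my @results = map { $_->recv } @cvs;

my $elapsed = AnyEvent->time - $start;
```

       The elapsed time should be close to 0.3 seconds (the slowest
       request) rather than the 0.6 seconds a sequential version would
       need.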
757
758       By the way, you do not actually have to wait in the "recv" method on an
759       AnyEvent condition variable - after all, waiting is evil - you can also
760       register a callback:
761
762          $cv->cb (sub {
763             my $response = shift->recv;
764             # ...
765          });
766
767       The callback will only be invoked when "send" was called. In fact,
768       instead of returning a condition variable you could also pass a third
769       parameter to your finger function, the callback to invoke with the
770       response:
771
772          sub finger($$$) {
773             my ($user, $host, $cb) = @_;
774
775       How you implement it is a matter of taste - if you expect your function
776       to be used mainly in an event-based program you would normally prefer
777       to pass a callback directly. If you write a module and expect your
778       users to use it "synchronously" often (for example, a simple http-get
779       script would not really care much for events), then you would use a
780       condition variable and tell them "simply "->recv" the data".
781
782       Problems with the implementation and how to fix them
783
784       To make this example more real-world-ready, we would not only implement
785       some write buffering (for the paranoid, or maybe denial-of-service
786       aware security expert), but we would also have to handle timeouts and
787       maybe protocol errors.
788
       Doing this quickly gets unwieldy, which is why we introduce
       AnyEvent::Handle in the next section, which takes care of all these
       details for you and lets you concentrate on the actual protocol.
792
793   Implementing simple HTTP and HTTPS GET requests with AnyEvent::Handle
794       The AnyEvent::Handle module has been hyped quite a bit in this document
795       so far, so let's see what it really offers.
796
797       As finger is such a simple protocol, let's try something slightly more
798       complicated: HTTP/1.0.
799
       An HTTP GET request works by sending a single request line that
       indicates what you want the server to do and the URI you want to act
       on, followed by as many "header" lines ("Header: data", same as e-mail
       headers) as required for the request, ended by an empty line.
804
805       The response is formatted very similarly, first a line with the
806       response status, then again as many header lines as required, then an
807       empty line, followed by any data that the server might send.
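
       The structure of such a response can be made concrete with a few
       lines of core Perl. This sketch takes a complete response as a
       single string and splits it into its three parts; the sample data
       is a shortened version of the kind of response shown in this
       section, and the parsing is deliberately simplistic (for example,
       it ignores folded header lines):

```perl
use strict;
use warnings;

my $raw = join "\015\012",
   "HTTP/1.0 404 Not Found",
   "Date: Mon, 02 Jun 2008 07:05:54 GMT",
   "Content-Type: text/html; charset=UTF-8",
   "",
   "<html><head>";

# status line, header paragraph and body, in that order
my ($status, $header, $body) =
   $raw =~ /\A(.*?)\015\012(.*?)\015\012\015\012(.*)\z/s
      or die "malformed response";

# the status line consists of protocol version, status code and reason
my ($version, $code, $reason) =
   $status =~ m{\AHTTP/(\d+\.\d+) (\d{3}) (.*)\z}
      or die "malformed status line";

# header names are case-insensitive, so store them in lowercase
my %header;
for my $line (split /\015\012/, $header) {
   my ($name, $value) = $line =~ /\A([^:]+):\s*(.*)\z/
      or next;
   $header{lc $name} = $value;
}
```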
808
809       Again, let's try it out with "telnet" (I condensed the output a bit -
810       if you want to see the full response, do it yourself).
811
812          # telnet www.google.com 80
813          Trying 209.85.135.99...
814          Connected to www.google.com (209.85.135.99).
815          Escape character is '^]'.
816          GET /test HTTP/1.0
817
818          HTTP/1.0 404 Not Found
819          Date: Mon, 02 Jun 2008 07:05:54 GMT
820          Content-Type: text/html; charset=UTF-8
821
822          <html><head>
823          [...]
824          Connection closed by foreign host.
825
       The "GET ..." and the empty line were entered manually, the rest of the
       telnet output is google's response, in this case a "404 not found"
       one.
829
830       So, here is how you would do it with "AnyEvent::Handle":
831
832          sub http_get {
833             my ($host, $uri, $cb) = @_;
834
835             # store results here
836             my ($response, $header, $body);
837
838             my $handle; $handle = new AnyEvent::Handle
839                connect  => [$host => 'http'],
840                on_error => sub {
841                   $cb->("HTTP/1.0 500 $!");
842                   $handle->destroy; # explicitly destroy handle
843                },
844                on_eof   => sub {
845                   $cb->($response, $header, $body);
846                   $handle->destroy; # explicitly destroy handle
847                };
848
849             $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");
850
851             # now fetch response status line
852             $handle->push_read (line => sub {
853                my ($handle, $line) = @_;
854                $response = $line;
855             });
856
857             # then the headers
858             $handle->push_read (line => "\015\012\015\012", sub {
859                my ($handle, $line) = @_;
860                $header = $line;
861             });
862
863             # and finally handle any remaining data as body
864             $handle->on_read (sub {
865                $body .= $_[0]->rbuf;
866                $_[0]->rbuf = "";
867             });
868          }
869
870       And now let's go through it step by step. First, as usual, the overall
871       "http_get" function structure:
872
873          sub http_get {
874             my ($host, $uri, $cb) = @_;
875
876             # store results here
877             my ($response, $header, $body);
878
879             my $handle; $handle = new AnyEvent::Handle
880                ... create handle object
881
882             ... push data to write
883
884             ... push what to expect to read queue
885          }
886
887       Unlike in the finger example, this time the caller has to pass a
888       callback to "http_get". Also, instead of passing a URL as one would
889       expect, the caller has to provide the hostname and URI - normally you
890       would use the "URI" module to parse a URL and separate it into those
891       parts, but that is left to the inspired reader :)
892
       Since everything else is left to the caller, all "http_get" does is to
       initiate the connection by creating the AnyEvent::Handle object (which
       calls "tcp_connect" for us) and leave everything else to its callback.
896
897       The handle object is created, unsurprisingly, by calling the "new"
898       method of AnyEvent::Handle:
899
900             my $handle; $handle = new AnyEvent::Handle
901                connect  => [$host => 'http'],
902                on_error => sub {
903                   $cb->("HTTP/1.0 500 $!");
904                   $handle->destroy; # explicitly destroy handle
905                },
906                on_eof   => sub {
907                   $cb->($response, $header, $body);
908                   $handle->destroy; # explicitly destroy handle
909                };
910
911       The "connect" argument tells AnyEvent::Handle to call "tcp_connect" for
912       the specified host and service/port.
913
       The "on_error" callback will be called on any unexpected error, such as
       a refused connection, or an unexpected end of the connection while
       reading the header.
917
918       Instead of having an extra mechanism to signal errors, connection
919       errors are signalled by crafting a special "response status line", like
920       this:
921
922          HTTP/1.0 500 Connection refused
923
924       This means the caller cannot distinguish (easily) between locally-
925       generated errors and server errors, but it simplifies error handling
926       for the caller a lot.
927
       The error callback also destroys the handle explicitly, because we are
       not interested in continuing after any errors. In AnyEvent::Handle
       callbacks you have to call "destroy" explicitly to destroy a handle.
       Outside of those callbacks you can just forget the object reference and
       it will be automatically cleaned up.
933
       Last but not least, we set an "on_eof" callback that is called when the
       other side indicates it has stopped writing data, which we will use to
       gracefully shut down the handle and report the results. This callback
       is only called when the read queue is empty - if the read queue expects
       some data and the handle gets an EOF from the other side this will be
       an error - after all, you did expect more to come.
940
       If you wanted to write a server using AnyEvent::Handle, you would use
       "tcp_server" and then create the AnyEvent::Handle with the "fh"
       argument.
944
945       The write queue
946
947       The next line sends the actual request:
948
949          $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");
950
951       No headers will be sent (this is fine for simple requests), so the
952       whole request is just a single line followed by an empty line to signal
953       the end of the headers to the server.
954
955       The more interesting question is why the method is called "push_write"
956       and not just write. The reason is that you can always add some write
957       data without blocking, and to do this, AnyEvent::Handle needs some
958       write queue internally - and "push_write" simply pushes some data onto
959       the end of that queue, just like Perl's "push" pushes data onto the end
960       of an array.
961
962       The deeper reason is that at some point in the future, there might be
963       "unshift_write" as well, and in any case, we will shortly meet
964       "push_read" and "unshift_read", and it's usually easiest to remember if
965       all those functions have some symmetry in their name. So "push" is used
966       as the opposite of "unshift" in AnyEvent::Handle, not as the opposite
967       of "pull" - just like in Perl.
968
969       Note that we call "push_write" right after creating the
970       AnyEvent::Handle object, before it has had time to actually connect to
971       the server. This is fine, pushing the read and write requests will
972       simply queue them in the handle object until the connection has been
973       established. Alternatively, we could do this "on demand" in the
974       "on_connect" callback.
975
976       If "push_write" is called with more than one argument, then you can
977       even do formatted I/O, which simply means your data will be transformed
978       in some ways. For example, this would JSON-encode your data before
979       pushing it to the write queue:
980
981          $handle->push_write (json => [1, 2, 3]);
982
983       Apart from that, this pretty much summarises the write queue, there is
984       little else to it.
985
986       Reading the response is far more interesting, because it involves the
987       more powerful and complex read queue:
988
989       The read queue
990
       The response consists of three parts: a single line with the response
       status, a single paragraph of headers ended by an empty line, and the
       response body, which is simply the remaining data on that connection.
994
995       For the first two, we push two read requests onto the read queue:
996
997          # now fetch response status line
998          $handle->push_read (line => sub {
999             my ($handle, $line) = @_;
1000             $response = $line;
1001          });
1002
1003          # then the headers
1004          $handle->push_read (line => "\015\012\015\012", sub {
1005             my ($handle, $line) = @_;
1006             $header = $line;
1007          });
1008
       While one can simply push a single callback onto the queue to parse
       the data, formatted I/O really comes to our advantage here, as there
       is a ready-made "read line" read type. The first read expects a single
       line, ended by "\015\012" (the standard end-of-line marker in internet
       protocols).
1014
1015       The second "line" is actually a single paragraph - instead of reading
1016       it line by line we tell "push_read" that the end-of-line marker is
1017       really "\015\012\015\012", which is an empty line. The result is that
1018       the whole header paragraph will be treated as a single line and read.
1019       The word "line" is interpreted very freely, much like Perl itself does
1020       it.
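
       What "reading a line with a custom end-of-line marker" means for
       the read buffer can be sketched in core Perl. The "read_line"
       function below is a toy written for this explanation, not the way
       AnyEvent::Handle implements its "line" read type: it either
       removes one complete "line" from the buffer, or leaves the buffer
       alone when the marker has not arrived yet.

```perl
use strict;
use warnings;

# toy "line" reader: removes and returns everything up to $eol from the
# buffer referenced by $rbuf, or returns nothing if $eol is not there yet
sub read_line {
   my ($rbuf, $eol) = @_;
   $eol = "\015\012" unless defined $eol;

   my $pos = index $$rbuf, $eol;
   return if $pos < 0;

   my $line = substr $$rbuf, 0, $pos;
   substr $$rbuf, 0, $pos + length ($eol), ""; # also drop the marker
   $line
}

# the first chunk ends in the middle of the header paragraph
my $rbuf = "HTTP/1.0 404 Not Found\015\012Date: Mon, 02 Jun 2008";

my $status  = read_line (\$rbuf);                     # complete line
my $partial = read_line (\$rbuf, "\015\012\015\012"); # not complete yet

# more data arrives and is appended to the buffer
$rbuf .= " 07:05:54 GMT\015\012Content-Type: text/html\015\012\015\012<html>";

my $header  = read_line (\$rbuf, "\015\012\015\012"); # now complete
```

       After the last call, $rbuf contains only the beginning of the
       body, which is exactly the situation the "on_read" callback of
       "http_get" has to deal with.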
1021
       Note that the read requests are pushed immediately after creating the
       handle object - since AnyEvent::Handle provides a queue we can push as
       many requests as we want, and AnyEvent::Handle will handle them in
       order.
1026
1027       There is, however, no read type for "the remaining data". For that, we
1028       install our own "on_read" callback:
1029
1030          # and finally handle any remaining data as body
1031          $handle->on_read (sub {
1032             $body .= $_[0]->rbuf;
1033             $_[0]->rbuf = "";
1034          });
1035
1036       This callback is invoked every time data arrives and the read queue is
1037       empty - which in this example will only be the case when both response
1038       and header have been read. The "on_read" callback could actually have
1039       been specified when constructing the object, but doing it this way
1040       preserves logical ordering.
1041
       The read callback simply adds the current read buffer to the $body
       variable and, most importantly, empties the buffer by assigning the
       empty string to it.
1045
1046       After AnyEvent::Handle has been so instructed, it will handle incoming
1047       data according to these instructions - if all goes well, the callback
1048       will be invoked with the response data, if not, it will get an error.
1049
       In general, you can implement pipelining (a semi-advanced feature of
       many protocols) very easily with AnyEvent::Handle: If you have a
       protocol with a request/response structure, your request
       methods/functions will all look like this (simplified):
1054
1055          sub request {
1056
1057             # send the request to the server
1058             $handle->push_write (...);
1059
1060             # push some response handlers
1061             $handle->push_read (...);
1062          }
1063
1064       This means you can queue as many requests as you want, and while
1065       AnyEvent::Handle goes through its read queue to handle the response
1066       data, the other side can work on the next request - queueing the
1067       request just appends some data to the write queue and installs a
1068       handler to be called later.
1069
1070       You might ask yourself how to handle decisions you can only make after
1071       you have received some data (such as handling a short error response or
1072       a long and differently-formatted response). The answer to this problem
1073       is "unshift_read", which we will introduce together with an example in
1074       the coming sections.
1075
1076       Using "http_get"
1077
1078       Finally, here is how you would use "http_get":
1079
1080          http_get "www.google.com", "/", sub {
1081             my ($response, $header, $body) = @_;
1082
1083             print
1084                $response, "\n",
1085                $body;
1086          };
1087
1088       And of course, you can run as many of these requests in parallel as you
1089       want (and your memory supports).
1090
1091       HTTPS
1092
1093       Now, as promised, let's implement the same thing for HTTPS, or more
1094       correctly, let's change our "http_get" function into a function that
1095       speaks HTTPS instead.
1096
1097       HTTPS is, quite simply, a standard TLS connection (Transport Layer
1098       Security is the official name for what most people refer to as "SSL")
1099       that contains standard HTTP protocol exchanges. The only other
1100       difference to HTTP is that by default it uses port 443 instead of port
1101       80.
1102
1103       To implement these two differences we need two tiny changes, first, in
1104       the "connect" parameter, we replace "http" by "https" to connect to the
1105       https port:
1106
1107                connect  => [$host => 'https'],
1108
1109       The other change deals with TLS, which is something AnyEvent::Handle
1110       does for us, as long as you made sure that the Net::SSLeay module is
1111       around. To enable TLS with AnyEvent::Handle, we simply pass an
1112       additional "tls" parameter to the call to "AnyEvent::Handle::new":
1113
1114                tls      => "connect",
1115
1116       Specifying "tls" enables TLS, and the argument specifies whether
1117       AnyEvent::Handle is the server side ("accept") or the client side
1118       ("connect") for the TLS connection, as unlike TCP, there is a clear
1119       server/client relationship in TLS.
1120
1121       That's all.
1122
1123       Of course, all this should be handled transparently by "http_get" after
1124       parsing the URL. If you need this, see the part about exercising your
1125       inspiration earlier in this document. You could also use the
1126       AnyEvent::HTTP module from CPAN, which implements all this and works
1127       around a lot of quirks for you, too.
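
       For the curious, here is one way the URL handling could look. It is
       only a sketch: "handle_args_for_url" is a made-up helper, and the
       regex covers just plain "http" and "https" URLs with an optional
       port (no user names, no IPv6 literals). For anything serious the
       "URI" module is the better tool.

```perl
use strict;
use warnings;

# hypothetical helper: split a URL into the URI to request and the
# AnyEvent::Handle constructor arguments that differ between HTTP and HTTPS
sub handle_args_for_url {
   my ($url) = @_;

   my ($scheme, $host, $port, $uri) =
      $url =~ m{\A(https?)://([^/:]+)(?::(\d+))?(/.*)?\z}i
         or die "unsupported URL: $url";

   $scheme = lc $scheme;
   $uri    = "/" unless defined $uri;

   # use the explicit port if there is one, the service name otherwise
   my @args = (connect => [$host => defined $port ? $port : $scheme]);

   # only HTTPS needs the extra "tls" parameter
   push @args, tls => "connect" if $scheme eq "https";

   ($uri, @args)
}

my ($uri, @args) = handle_args_for_url ("https://www.google.com/test");
```

       The returned arguments would then simply be passed on to the
       constructor together with the "on_error" and "on_eof" callbacks,
       and $uri would be used in the request line.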
1128
1129       The read queue - revisited
1130
1131       HTTP always uses the same structure in its responses, but many
1132       protocols require parsing responses differently depending on the
1133       response itself.
1134
1135       For example, in SMTP, you normally get a single response line:
1136
1137          220 mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
1138
1139       But SMTP also supports multi-line responses:
1140
1141          220-mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
1142          220-hey guys
1143          220 my response is longer than yours
1144
1145       To handle this, we need "unshift_read". As the name (hopefully)
1146       implies, "unshift_read" will not append your read request to the end of
1147       the read queue, but instead it will prepend it to the queue.
1148
1149       This is useful in the situation above: Just push your response-line
1150       read request when sending the SMTP command, and when handling it, you
1151       look at the line to see if more is to come, and "unshift_read" another
1152       reader callback if required, like this:
1153
          my $response; # response lines end up in here

          my $read_response; $read_response = sub {
             my ($handle, $line) = @_;

             $response .= "$line\n";

             # check for continuation lines ("-" as 4th character)
             if ($line =~ /^...-/) {
                # if yes, then unshift another line read
                $handle->unshift_read (line => $read_response);

             } else {
                # otherwise we are done

                # free callback
                undef $read_response;

                print "we are done reading: $response\n";
             }
          };

          $handle->push_read (line => $read_response);
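
       Why the continuation reads have to go to the front of the queue
       becomes obvious in a toy model of the read queue. The sketch below
       uses only core Perl: @queue and "feed_line" are invented stand-ins
       for the handle's read queue and for incoming lines, and the final
       "250 ok" line plays the role of the response to the next command.

```perl
use strict;
use warnings;

my @queue; # toy read queue: each callback consumes exactly one line

sub feed_line {
   my ($line) = @_;
   my $cb = shift @queue or die "no reader queued for: $line";
   $cb->($line);
}

my ($greeting, $reply) = ("", "");

my $read_greeting; $read_greeting = sub {
   my ($line) = @_;

   $greeting .= "$line\n";

   if ($line =~ /^...-/) {
      # continuation line: re-queue ourselves at the FRONT, so the next
      # line comes to us and not to the reader queued after us
      unshift @queue, $read_greeting;
   } else {
      undef $read_greeting; # done, break the self-reference
   }
};

push @queue, $read_greeting;         # reader for the greeting
push @queue, sub { $reply = $_[0] }; # reader for the next response

feed_line ($_) for
   '220-mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>',
   '220-hey guys',
   '220 my response is longer than yours',
   '250 ok';
```

       Had the callback used "push" instead of "unshift", the second
       greeting line would have ended up in $reply.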
1177
1178       This recipe can be used for all similar parsing problems, for example
1179       in NNTP, the response code to some commands indicates that more data
1180       will be sent:
1181
          # $finish is assumed to be a callback provided by the surrounding code
          $handle->push_write ("article 42\015\012");

          # read response line
          $handle->push_read (line => sub {
             my ($handle, $status) = @_;

             # article data following?
             if ($status =~ /^2/) {
                # yes, read article body

                $handle->unshift_read (line => "\012.\015\012", sub {
                   my ($handle, $body) = @_;

                   $finish->($status, $body);
                });

             } else {
                # some error occurred, no article data

                $finish->($status);
             }
          });
1204
1205       Your own read queue handler
1206
1207       Sometimes, your protocol doesn't play nice and uses lines or chunks of
1208       data not formatted in a way handled by AnyEvent::Handle out of the box.
1209       In this case you have to implement your own read parser.
1210
       To make up a contorted example, imagine you are looking for an even
       number of characters followed by a colon (":"). Also imagine that
       AnyEvent::Handle had no "regex" read type which could be used, so you
       would have to do it manually.
1215
1216       To implement a read handler for this, you would "push_read" (or
1217       "unshift_read") just a single code reference.
1218
1219       This code reference will then be called each time there is (new) data
1220       available in the read buffer, and is expected to either successfully
1221       eat/consume some of that data (and return true) or to return false to
1222       indicate that it wants to be called again.
1223
1224       If the code reference returns true, then it will be removed from the
1225       read queue (because it has parsed/consumed whatever it was supposed to
1226       consume), otherwise it stays in the front of it.
1227
1228       The example above could be coded like this:
1229
1230          $handle->push_read (sub {
1231             my ($handle) = @_;
1232
1233             # check for even number of characters + ":"
1234             # and remove the data if a match is found.
1235             # if not, return false (actually nothing)
1236
1237             $handle->{rbuf} =~ s/^( (?:..)* ) ://x
1238                or return;
1239
1240             # we got some data in $1, pass it to whoever wants it
1241             $finish->($1);
1242
1243             # and return true to indicate we are done
1244             1
1245          });
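
       The return-value protocol of such a handler can be tried without a
       handle at all. In this sketch a plain scalar $rbuf stands in for
       the handle's read buffer and the data arrives in three made-up
       chunks; the parser is called after each chunk, just as
       AnyEvent::Handle would do when new data arrives:

```perl
use strict;
use warnings;

my $rbuf = ""; # stand-in for the handle's read buffer
my @found;     # data the parser has extracted

my $parser = sub {
   # even number of characters + ":", removed from the buffer on success
   $rbuf =~ s/^( (?:..)* ) ://x
      or return; # false: not enough data yet, call me again

   push @found, $1;

   1 # true: done, remove me from the queue
};

my @results;

for my $chunk ("ab", "c", "d:rest") {
   $rbuf .= $chunk;                    # new data arrives
   push @results, $parser->() ? 1 : 0; # and the parser gets a try
}
```

       The parser fails twice and succeeds on the third call, leaving
       "rest" in the buffer for whatever read request comes next.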
1246
1247       This concludes our little tutorial.
1248

Where to go from here?

1250       This introduction should have explained the key concepts of AnyEvent -
1251       event watchers and condition variables, AnyEvent::Socket - basic
1252       networking utilities, and AnyEvent::Handle - a nice wrapper around
1253       handles.
1254
1255       You could either start coding stuff right away, look at those manual
1256       pages for the gory details, or roam CPAN for other AnyEvent modules
1257       (such as AnyEvent::IRC or AnyEvent::HTTP) to see more code examples (or
1258       simply to use them).
1259
1260       If you need a protocol that doesn't have an implementation using
1261       AnyEvent, remember that you can mix AnyEvent with one other event
1262       framework, such as POE, so you can always use AnyEvent for your own
1263       tasks plus modules of one other event framework to fill any gaps.
1264
       And last but not least, you could also look at Coro, especially
       Coro::AnyEvent, to see how you can turn event-based programming from
       callback style back to the usual imperative style (also called
       "inversion of control" - AnyEvent calls you, but Coro lets you call
       AnyEvent).
1270

Authors

1272       Robin Redeker "<elmex at ta-sa.org>", Marc Lehmann
1273       <schmorp@schmorp.de>.
1274
1275
1276
1277perl v5.12.1                      2009-12-24                AnyEvent::Intro(3)