AnyEvent::Intro(3pm)

AnyEvent::Intro(3)    User Contributed Perl Documentation   AnyEvent::Intro(3)

NAME

       AnyEvent::Intro - an introductory tutorial to AnyEvent

Introduction to AnyEvent

       This is a tutorial that will introduce you to the features of AnyEvent.

       The first part introduces the core AnyEvent module (after swamping you
       a bit in evangelism), which might already provide all you ever need: If
       you are only interested in AnyEvent's event handling capabilities, read
       no further.

       The second part focuses on network programming using sockets, for which
       AnyEvent offers a lot of support you can use, and a lot of workarounds
       for portability quirks.

What is AnyEvent?

       If you don't care for the whys and want to see code, skip this section!

       AnyEvent is first of all just a framework to do event-based
       programming. Typically such frameworks are an all-or-nothing thing: If
       you use one such framework, you can't (easily, or even at all) use
       another in the same program.

       AnyEvent is different - it is a thin abstraction layer on top of other
       event loops, just like DBI is an abstraction of many different database
       APIs. Its main purpose is to move the choice of the underlying
       framework (the event loop) from the module author to the program author
       using the module.

       That means you can write code that uses events to control what it does,
       without forcing other code in the same program to use the same
       underlying framework as you do - i.e. you can create a Perl module that
       is event-based using AnyEvent, and users of that module can still
       choose between using Gtk2, Tk, Event (or run inside Irssi or
       rxvt-unicode) or any other supported event loop. AnyEvent even comes
       with its own pure-perl event loop implementation, so your code works
       regardless of other modules that might or might not be installed. The
       latter is important, as AnyEvent does not have any hard dependencies on
       other modules, which makes it easy to install, for example, when you
       lack a C compiler. No matter what environment, AnyEvent will just cope
       with it.

       A typical limitation of existing Perl modules such as Net::IRC is that
       they come with their own event loop: In Net::IRC, a program which uses
       it needs to start the event loop of Net::IRC. That means that one
       cannot integrate this module into a Gtk2 GUI for instance, as that
       module, too, enforces the use of its own event loop (namely Glib).

       Another example is LWP: it provides no event interface at all. It's a
       pure blocking HTTP (and FTP etc.) client library, which usually means
       that you either have to start another process or have to fork for an
       HTTP request, or use threads (e.g. Coro::LWP), if you want to do
       something else while waiting for the request to finish.

       The motivation behind these designs is often that a module doesn't want
       to depend on some complicated XS-module (Net::IRC), or that it doesn't
       want to force the user to use some specific event loop at all (LWP),
       out of fear of severely limiting the usefulness of the module: If your
       module requires Glib, it will not run in a Tk program.

       AnyEvent solves this dilemma by not forcing module authors to either:

       - write their own event loop (because it guarantees the availability of
       an event loop everywhere - even on Windows with no extra modules
       installed).
       - choose one specific event loop (because AnyEvent works with most
       event loops available for Perl).

       If the module author uses AnyEvent for all his (or her) event needs (IO
       events, timers, signals, ...) then all other modules can just use his
       module and don't have to choose an event loop or adapt to his event
       loop. The choice of the event loop is ultimately made by the program
       author who uses all the modules and writes the main program. And even
       there he doesn't have to choose, he can just let AnyEvent choose the
       most efficient event loop available on the system.

       Read more about this in the main documentation of the AnyEvent module.

Introduction to Event-Based Programming

       So what exactly is programming using events? It quite simply means that
       instead of your code actively waiting for something, such as the user
       entering something on STDIN:

          $| = 1; print "enter your name> ";

          my $name = <STDIN>;

       You instead tell your event framework to notify you in the event of
       some data being available on STDIN, by using a callback mechanism:

          use AnyEvent;

          $| = 1; print "enter your name> ";

          my $name;

          my $wait_for_input = AnyEvent->io (
             fh   => \*STDIN, # which file handle to check
             poll => "r",     # which event to wait for ("r"ead data)
             cb   => sub {    # what callback to execute
                $name = <STDIN>; # read it
             }
          );

          # do something else here

       Looks more complicated, and surely is, but the advantage of using
       events is that your program can do something else instead of waiting
       for input (side note: combining AnyEvent with a thread package such as
       Coro can recoup much of the simplicity, effectively getting the best of
       two worlds).

       Waiting as done in the first example is also called "blocking" the
       process because you "block"/keep your process from executing anything
       else while you do so.

       The second example avoids blocking by only registering interest in a
       read event, which is fast and doesn't block your process. The callback
       will be called only when data is available and can be read without
       blocking.

       The "interest" is represented by an object returned by "AnyEvent->io"
       called a "watcher" object - thus named because it "watches" your file
       handle (or other event sources) for the event you are interested in.

       In the example above, we create an I/O watcher by calling the
       "AnyEvent->io" method. A lack of further interest in some event is
       expressed by simply forgetting about its watcher, for example by
       "undef"-ing the only variable it is stored in. AnyEvent will
       automatically clean up the watcher if it is no longer used, much like
       Perl closes your file handles if you no longer use them anywhere.
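
       The following is a minimal sketch of this lifetime rule. It uses
       timer watchers ("AnyEvent->timer") instead of I/O watchers so that it
       runs without any input, and a condition variable to wait; the
       variable names and delays are illustrative assumptions. The first
       watcher is forgotten before it can fire, so only the second callback
       runs:

```perl
use strict;
use warnings;
use AnyEvent;

my @fired;
my $done = AnyEvent->condvar;

# this watcher would fire after 0.05 seconds ...
my $cancelled = AnyEvent->timer (
   after => 0.05,
   cb    => sub { push @fired, "cancelled" },
);

# ... and this one ends the program a little later
my $kept = AnyEvent->timer (
   after => 0.2,
   cb    => sub { push @fired, "kept"; $done->send },
);

# forgetting the only reference unregisters the first watcher
undef $cancelled;

$done->recv;

print "fired: @fired\n"; # prints "fired: kept"
```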

       A short note on callbacks

       A common issue that hits people is the problem of passing parameters to
       callbacks. Programmers used to languages such as C or C++ are often
       used to a style where one passes the address of a function (a function
       reference) and some data value, e.g.:

          sub callback {
             my ($arg) = @_;

             $arg->method;
          }

          my $arg = ...;

          call_me_back_later \&callback, $arg;

       This is clumsy, as the place where behaviour is specified (when the
       callback is registered) is often far away from the place where
       behaviour is implemented. It also doesn't use Perl syntax to invoke the
       code. There is also an abstraction penalty to pay as one has to name
       the callback, which often is unnecessary and leads to nonsensical or
       duplicated names.

       In Perl, one can specify behaviour much more directly by using
       closures. Closures are code blocks that take a reference to the
       enclosing scope(s) when they are created. This means lexical variables
       in scope when a closure is created can be used inside the closure:

          my $arg = ...;

          call_me_back_later sub { $arg->method };

       Under most circumstances, closures are faster, use fewer resources and
       result in much clearer code than the traditional approach. Faster,
       because parameter passing and storing them in local variables in Perl
       is relatively slow. Fewer resources, because closures take references
       to existing variables without having to create new ones, and clearer
       code because it is immediately obvious that the second example calls
       the "method" method when the callback is invoked.

       Apart from these, the strongest argument for using closures with
       AnyEvent is that AnyEvent does not allow passing parameters to the
       callback, so closures are the only way to achieve that in most cases
       :->
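
       To make the closure style concrete, here is a small self-contained
       sketch in plain Perl. The "call_me_back_later" function below is a
       hypothetical stand-in for an event framework (it merely stores the
       callbacks and is not part of AnyEvent); the point is only that each
       closure carries its own data without any parameter passing:

```perl
use strict;
use warnings;

# hypothetical stand-in for an event framework: it only stores callbacks
my @pending;
sub call_me_back_later { push @pending, $_[0] }

my @log;

for my $name ("alice", "bob") {
   # each closure captures its own $name, no argument passing needed
   call_me_back_later sub { push @log, "hello, $name" };
}

# later, the "framework" invokes the callbacks without any arguments
$_->() for @pending;

print "$_\n" for @log;
```

       Each callback still knows which name it was created for, although
       it is invoked without arguments.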

       A little hint to catch mistakes

       AnyEvent does not check the parameters you pass in, at least not by
       default. To enable checking, simply start your program with
       "AE_STRICT=1" in the environment, or put "use AnyEvent::Strict" near
       the top of your program:

          AE_STRICT=1 perl myprogram

       You can find more info on this and additional debugging aids later in
       this introduction.

   Condition Variables
       Back to the I/O watcher example: The code is not yet a fully working
       program, and will not work as-is. The reason is that your callback will
       not be invoked out of the blue; you have to run the event loop first.
       Also, event-based programs need to block sometimes too, such as when
       there is nothing to do, and everything is waiting for new events to
       arrive.

       In AnyEvent, this is done using condition variables. Condition
       variables are named "condition variables" because they represent a
       condition that is initially false and needs to be fulfilled.

       You can also call them "merge points", "sync points", "rendezvous
       ports" or even callbacks and many other things (and they are often
       called these names in other frameworks). The important point is that
       you can create them freely and later wait for them to become true.

       Condition variables have two sides - one side is the "producer" of the
       condition (whatever code detects and flags the condition), the other
       side is the "consumer" (the code that waits for that condition).

       In our example in the previous section, the producer is the event
       callback and there is no consumer yet - let's change that right now:

          use AnyEvent;

          $| = 1; print "enter your name> ";

          my $name;

          my $name_ready = AnyEvent->condvar;

          my $wait_for_input = AnyEvent->io (
             fh   => \*STDIN,
             poll => "r",
             cb   => sub {
                $name = <STDIN>;
                $name_ready->send;
             }
          );

          # do something else here

          # now wait until the name is available:
          $name_ready->recv;

          undef $wait_for_input; # watcher no longer needed

          print "your name is $name\n";

       This program creates an AnyEvent condvar by calling the
       "AnyEvent->condvar" method. It then creates a watcher as usual, but
       inside the callback it "send"s the $name_ready condition variable,
       which causes whoever is waiting on it to continue.

       The "whoever" in this case is the code that follows, which calls
       "$name_ready->recv": The producer calls "send", the consumer calls
       "recv".

       If there is no $name available yet, then the call to
       "$name_ready->recv" will halt your program until the condition becomes
       true.

       As the names "send" and "recv" imply, you can actually send and receive
       data using this, for example, the above code could also be written like
       this, without an extra variable to store the name in:

          use AnyEvent;

          $| = 1; print "enter your name> ";

          my $name_ready = AnyEvent->condvar;

          my $wait_for_input = AnyEvent->io (
             fh => \*STDIN, poll => "r",
             cb => sub { $name_ready->send (scalar <STDIN>) }
          );

          # do something else here

          # now wait and fetch the name
          my $name = $name_ready->recv;

          undef $wait_for_input; # watcher no longer needed

          print "your name is $name\n";

       You can pass any number of arguments to "send", and every subsequent
       call to "recv" will return them.
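
       A short sketch of sending several values at once (the values are
       made up for illustration). It relies on "recv" returning all sent
       values in list context and only the first one in scalar context:

```perl
use strict;
use warnings;
use AnyEvent;

my $cv = AnyEvent->condvar;

# the producer may send before anybody waits, the values are stored
$cv->send ("Larry", 42);

# list context returns everything that was sent
my ($name, $answer) = $cv->recv;

# a later call returns the same data again, scalar context yields
# only the first value
my $first = $cv->recv;

print "$name $answer $first\n"; # prints "Larry 42 Larry"
```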

   The "main loop"
       Most event-based frameworks have something called a "main loop" or
       "event loop run function" or something similar.

       Just like "recv" in AnyEvent, these functions need to be called
       eventually so that your event loop has a chance of actually looking for
       the events you are interested in.

       For example, in a Gtk2 program, the above example could also be written
       like this:

          use Gtk2 -init;
          use AnyEvent;

          ############################################
          # create a window and some label

          my $window = Gtk2::Window->new ("toplevel");
          $window->add (my $label = Gtk2::Label->new ("soon replaced by name"));

          $window->show_all;

          ############################################
          # do our AnyEvent stuff

          $| = 1; print "enter your name> ";

          my $wait_for_input = AnyEvent->io (
             fh => \*STDIN, poll => "r",
             cb => sub {
                # set the label
                $label->set_text (scalar <STDIN>);
                print "enter another name> ";
             }
          );

          ############################################
          # Now enter Gtk2's event loop

          Gtk2->main;

       No condition variable anywhere in sight - instead, we just read a line
       from STDIN and replace the text in the label. In fact, since nobody
       "undef"s $wait_for_input you can enter multiple lines.

       Instead of waiting for a condition variable, the program enters the
       Gtk2 main loop by calling "Gtk2->main", which will block the program
       and wait for events to arrive.

       This also shows that AnyEvent is quite flexible - you didn't have to do
       anything to make the AnyEvent watcher use Gtk2 (actually Glib) - it
       just worked.

       Admittedly, the example is a bit silly - who would want to read names
       from standard input in a Gtk+ application? But imagine that instead of
       doing that, you make an HTTP request in the background and display its
       results. In fact, with event-based programming you can make many HTTP
       requests in parallel in your program and still provide feedback to the
       user and stay interactive.

       And in the next part you will see how to do just that - by implementing
       an HTTP request, on our own, with the utility modules AnyEvent comes
       with.

       Before that, however, let's briefly look at how you would write your
       program using only AnyEvent, without ever calling some other event
       loop's run function.

       In the example using condition variables, we used those to start
       waiting for events, and in fact, condition variables are the solution:

          my $quit_program = AnyEvent->condvar;

          # create AnyEvent watchers (or not) here

          $quit_program->recv;

       If any of your watcher callbacks decide to quit (this is often called
       an "unloop" in other frameworks), they can just call
       "$quit_program->send". Of course, they could also decide not to and
       call "exit" instead, or they could decide never to quit (e.g. in a
       long-running daemon program).

       If you don't need some clean quit functionality and just want to run
       the event loop, you can do this:

          AnyEvent->condvar->recv;

       And this is, in fact, the closest to the idea of a main loop run
       function that AnyEvent offers.

   Timers and other event sources
       So far, we have used only I/O watchers. These are useful mainly to find
       out whether a socket has data to read, or space to write more data. On
       sane operating systems this also works for console windows/terminals
       (typically on standard input), serial lines, all sorts of other
       devices, basically almost everything that has a file descriptor but
       isn't a file itself. (As usual, "sane" excludes Windows - on that
       platform you would need different functions for all of these,
       complicating code immensely - think "socket only" on Windows).

       However, I/O is not everything - the second most important event source
       is the clock. For example when doing an HTTP request you might want to
       time out when the server doesn't answer within some predefined amount
       of time.

       In AnyEvent, timer event watchers are created by calling the
       "AnyEvent->timer" method:

          use AnyEvent;

          my $cv = AnyEvent->condvar;

          my $wait_one_and_a_half_seconds = AnyEvent->timer (
             after => 1.5,  # after how many seconds to invoke the cb?
             cb    => sub { # the callback to invoke
                $cv->send;
             },
          );

          # can do something else here

          # now wait till our time has come
          $cv->recv;

       Unlike I/O watchers, timers are only interested in the number of
       seconds they have to wait. When (at least) that amount of time has
       passed, AnyEvent will invoke your callback.

       Unlike I/O watchers, which will call your callback as many times as
       there is data available, timers are normally one-shot: after they have
       "fired" once and invoked your callback, they are dead and no longer do
       anything.

       To get a repeating timer, such as a timer firing roughly once per
       second, you can specify an "interval" parameter:

          my $once_per_second = AnyEvent->timer (
             after    => 0, # first invoke ASAP
             interval => 1, # then invoke every second
             cb       => sub { # the callback to invoke
                $cv->send;
             },
          );
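
       As a complete, runnable variation of the fragment above, the
       following sketch counts three ticks and then stops the timer by
       forgetting its watcher. The short interval and the tick limit are
       arbitrary choices made so that the sketch finishes quickly:

```perl
use strict;
use warnings;
use AnyEvent;

my $cv    = AnyEvent->condvar;
my $ticks = 0;

my $ticker; $ticker = AnyEvent->timer (
   after    => 0,    # first invoke ASAP
   interval => 0.05, # then invoke every 50 milliseconds
   cb       => sub {
      return if ++$ticks < 3; # keep ticking

      undef $ticker;      # stop the repeating timer
      $cv->send ($ticks); # report how often we were called
   },
);

my $count = $cv->recv;

print "timer fired $count times\n"; # prints "timer fired 3 times"
```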

       More esoteric sources

       AnyEvent also has some other, more esoteric event sources you can tap
       into: signal, child and idle watchers.

       Signal watchers can be used to wait for "signal events", which means
       your process was sent a signal (such as "SIGTERM" or "SIGUSR1").

       Child-process watchers wait for a child process to exit. They are
       useful when you fork a separate process and need to know when it exits,
       but you do not want to wait for that by blocking.

       Idle watchers invoke their callback when the event loop has handled all
       outstanding events, polled for new events and didn't find any, i.e.,
       when your process is otherwise idle. They are useful if you want to do
       some non-trivial data processing that can be done when your program
       doesn't have anything better to do.

       All these watcher types are described in detail in the main AnyEvent
       manual page.
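
       As an illustration of one of these, here is a minimal sketch of a
       signal watcher. It assumes a Unix-like system (where "SIGUSR1"
       exists) and sends the signal to itself; the timer is merely a
       safety net added for this sketch so that it cannot hang:

```perl
use strict;
use warnings;
use AnyEvent;

my $got_signal = AnyEvent->condvar;

my $on_usr1 = AnyEvent->signal (
   signal => "USR1", # signal name without the "SIG" prefix
   cb     => sub { $got_signal->send ("USR1") },
);

# safety net, so the sketch ends even if the signal gets lost
my $timeout = AnyEvent->timer (
   after => 15,
   cb    => sub { $got_signal->send ("timeout") },
);

kill USR1 => $$; # send the signal to ourselves

my $what = $got_signal->recv;

print "woke up because of: $what\n";
```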

       Sometimes you also need to know what the current time is:
       "AnyEvent->now" returns the time the event toolkit uses to schedule
       relative timers, and is usually what you want. It is often cached
       (which means it can be a bit outdated). In that case, you can use the
       more costly "AnyEvent->time" method which will ask your operating
       system for the current time, which is slower, but also more up to date.
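
       A tiny sketch comparing the two methods. How far they differ
       depends on the event loop in use and on how long ago it last
       updated its clock, so no particular output is implied:

```perl
use strict;
use warnings;
use AnyEvent;

my $cached = AnyEvent->now;  # the event loop's (possibly cached) time
my $fresh  = AnyEvent->time; # asks the operating system

printf "now:   %.3f\ntime:  %.3f\ndiff:  %.3f seconds\n",
   $cached, $fresh, $fresh - $cached;
```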

Network programming and AnyEvent

       So far you have seen how to register event watchers and handle events.

       This is a great foundation to write network clients and servers, and
       might be all that your module (or program) ever requires, but writing
       your own I/O buffering again and again becomes tedious, not to mention
       that it attracts errors.

       While the core AnyEvent module is still small and self-contained, the
       distribution comes with some very useful utility modules such as
       AnyEvent::Handle, AnyEvent::DNS and AnyEvent::Socket. These can make
       your life as a non-blocking network programmer a lot easier.

       Here is a quick overview of these three modules:

   AnyEvent::DNS
       This module allows fully asynchronous DNS resolution. It is used mainly
       by AnyEvent::Socket to resolve hostnames and service ports for you, but
       is a great way to do other DNS resolution tasks, such as reverse
       lookups of IP addresses for log files.

   AnyEvent::Handle
       This module handles non-blocking IO on (socket-, pipe- etc.) file
       handles in an event based manner. It provides a wrapper object around
       your file handle that provides queueing and buffering of incoming and
       outgoing data for you.

       It also implements the most common data formats, such as text lines, or
       fixed and variable-width data blocks.
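
       To give a first impression, here is a hedged sketch of line-based
       reading with AnyEvent::Handle. A local socket pair (created with
       "AnyEvent::Util::portable_socketpair") stands in for a real
       network connection, and the text sent is made up; only the
       "push_write" and "push_read" calls matter here:

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Handle;
use AnyEvent::Util ();

# a connected socket pair stands in for a real network connection
my ($left, $right) = AnyEvent::Util::portable_socketpair ()
   or die "unable to create socket pair: $!";

my $done = AnyEvent->condvar;
my @lines;

my $writer = AnyEvent::Handle->new (fh => $left);
my $reader = AnyEvent::Handle->new (
   fh       => $right,
   on_error => sub { $done->send }, # give up on any error
);

# queue two lines, the handle buffers and writes them for us
$writer->push_write ("first line\015\012second line\015\012");

# queue two line reads, each callback gets the handle and the line
# (without the line ending)
$reader->push_read (line => sub { push @lines, $_[1] });
$reader->push_read (line => sub { push @lines, $_[1]; $done->send });

$done->recv;

print "$_\n" for @lines;
```

       Note that neither side does any buffering of its own: the data
       might arrive in one chunk or many, and the handle object splits
       it into lines either way.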

   AnyEvent::Socket
       This module provides you with functions that handle socket creation and
       IP address magic. The two main functions are "tcp_connect" and
       "tcp_server". The former will connect a (streaming) socket to an
       internet host for you and the latter will make a server socket for you,
       to accept connections.

       This module also comes with transparent IPv6 support, this means: If
       you write your programs with this module, you will be IPv6 ready
       without doing anything special.

       It also works around a lot of portability quirks (especially on the
       Windows platform), which makes it even easier to write your programs in
       a portable way (did you know that Windows uses different error codes
       for all socket functions and that Perl does not know about these? That
       "Unknown error 10022" (which is "WSAEINVAL") can mean that your
       "connect" call was successful? That unsuccessful TCP connects might
       never be reported back to your program? That "WSAEINPROGRESS" means
       your "connect" call was ignored instead of being in progress?
       AnyEvent::Socket works around all of these Windows/Perl bugs for you).
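
       As a small taste of the "IP address magic", the sketch below uses
       two helper functions exported by AnyEvent::Socket, "parse_address"
       and "format_address", to convert between textual and packed binary
       addresses. The sample strings are arbitrary, and no network access
       is needed:

```perl
use strict;
use warnings;
use AnyEvent::Socket;

for my $text ("127.0.0.1", "::1", "not an address") {
   # returns the packed binary form, or undef if it cannot be parsed
   my $packed = parse_address $text;

   if (defined $packed) {
      printf "%-15s -> %2d bytes -> %s\n",
         $text, length $packed, format_address $packed;
   } else {
      printf "%-15s -> not a numeric IP address\n", $text;
   }
}
```

       The same functions handle IPv4 and IPv6 addresses, which is part
       of what makes code written on top of this module IPv6 ready.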

   Implementing a parallel finger client with non-blocking connects and
       AnyEvent::Socket
       The finger protocol is one of the simplest protocols in use on the
       internet. Or in use in the past, as almost nobody uses it anymore.

       It works by connecting to the finger port on another host, writing a
       single line with a user name and then reading the finger response, as
       specified by that user. OK, RFC 1288 specifies a vastly more complex
       protocol, but it basically boils down to this:

          # telnet freebsd.org finger
          Trying 8.8.178.135...
          Connected to freebsd.org (8.8.178.135).
          Escape character is '^]'.
          larry
          Login: lile                             Name: Larry Lile
          Directory: /home/lile                   Shell: /usr/local/bin/bash
          No Mail.
          Mail forwarded to: lile@stdio.com
          No Plan.

       So let's write a little AnyEvent function that makes a finger request:

          use AnyEvent;
          use AnyEvent::Socket;

          sub finger($$) {
             my ($user, $host) = @_;

             # use a condvar to return results
             my $cv = AnyEvent->condvar;

             # first, connect to the host
             tcp_connect $host, "finger", sub {
                # the callback receives the socket handle - or nothing
                my ($fh) = @_
                   or return $cv->send;

                # now write the username
                syswrite $fh, "$user\015\012";

                my $response;

                # register a read watcher
                my $read_watcher; $read_watcher = AnyEvent->io (
                   fh   => $fh,
                   poll => "r",
                   cb   => sub {
                      my $len = sysread $fh, $response, 1024, length $response;

                      if ($len <= 0) {
                         # we are done, or an error occurred, let's ignore the latter
                         undef $read_watcher; # no longer interested
                         $cv->send ($response); # send results
                      }
                   },
                );
             };

             # pass $cv to the caller
             $cv
          }

       That's a mouthful! Let's dissect this function a bit, first the overall
       function and execution flow:

          sub finger($$) {
             my ($user, $host) = @_;

             # use a condvar to return results
             my $cv = AnyEvent->condvar;

             # first, connect to the host
             tcp_connect $host, "finger", sub {
                ...
             };

             $cv
          }

       This isn't too complicated, just a function with two parameters that
       creates a condition variable $cv, initiates a TCP connect to $host, and
       returns $cv. The caller can use the returned $cv to receive the finger
       response, but one could equally well pass a third argument, a callback,
       to the function.
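
       The callback alternative can be sketched without touching the
       network. In the sketch below, "delayed_answer" is a hypothetical
       stand-in for "finger" (it delivers a made-up result via a condvar
       after a short delay), and "delayed_answer_cb" is an equally
       hypothetical wrapper that attaches the caller's callback using the
       condvar's "cb" method:

```perl
use strict;
use warnings;
use AnyEvent;

# hypothetical stand-in for finger: delivers its result via a condvar
sub delayed_answer($) {
   my ($answer) = @_;

   my $cv = AnyEvent->condvar;

   my $timer; $timer = AnyEvent->timer (after => 0.05, cb => sub {
      undef $timer; # one-shot, no longer needed
      $cv->send ($answer);
   });

   $cv
}

# callback-style variant: the caller's callback becomes the consumer
sub delayed_answer_cb($$) {
   my ($answer, $cb) = @_;

   # inside a condvar callback, recv returns at once with the sent data
   delayed_answer ($answer)->cb (sub { $cb->($_[0]->recv) });
}

my $finished = AnyEvent->condvar;
my $result;

delayed_answer_cb "42", sub { $result = shift; $finished->send };

$finished->recv;

print "result: $result\n"; # prints "result: 42"
```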

       Since we are programming event'ish, we do not wait for the connect to
       finish - it could block the program for a minute or longer!

       Instead, we pass "tcp_connect" a callback to invoke when the connect is
       done. The callback is called with the socket handle as its first
       argument if the connect succeeds, and no arguments otherwise. The
       important point is that it will always be called as soon as the outcome
       of the TCP connect is known.

       This style of programming is also called "continuation style": the
       "continuation" is simply the way the program continues - normally at
       the next line after some statement (the exception is loops or things
       like "return"). When we are interested in events, however, we instead
       specify the "continuation" of our program by passing a closure, which
       makes that closure the "continuation" of the program.

       The "tcp_connect" call is like saying "return now, and when the
       connection is established or the attempt failed, continue there".

       Now let's look at the callback/closure in more detail:

                # the callback receives the socket handle - or nothing
                my ($fh) = @_
                   or return $cv->send;

       The first thing the callback does is to save the socket handle in $fh.
       When there was an error (no arguments), then our instinct as expert
       Perl programmers would tell us to "die":

                my ($fh) = @_
                   or die "$host: $!";

       While this would give good feedback to the user (if he happens to watch
       standard error), our program would probably stop working here, as we
       never report the results to anybody, certainly not the caller of our
       "finger" function, and most event loops continue even after a "die"!

       This is why we instead "return", but also call "$cv->send" without any
       arguments to signal to the condvar consumer that something bad has
       happened. The return value of "$cv->send" is irrelevant, as is the
       return value of our callback. The "return" statement is used for the
       side effect of, well, returning immediately from the callback.
       Checking for errors and handling them this way is very common, which is
       why this compact idiom is so handy.
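
       On the consumer side, a bare "send" shows up as an undefined
       result. The sketch below illustrates this with a hypothetical
       "lookup" function that either succeeds or "fails" on request
       (a timer takes the place of the real connection attempt):

```perl
use strict;
use warnings;
use AnyEvent;

# hypothetical producer: "fails" when asked to, by sending nothing
sub lookup($) {
   my ($should_fail) = @_;

   my $cv = AnyEvent->condvar;

   my $timer; $timer = AnyEvent->timer (after => 0.01, cb => sub {
      undef $timer;

      return $cv->send if $should_fail; # failure: no arguments
      $cv->send ("some response");      # success: the result
   });

   $cv
}

my @report;

for my $should_fail (0, 1) {
   my $response = lookup ($should_fail)->recv;

   push @report, defined $response ? "ok: $response" : "request failed";
}

print "$_\n" for @report;
```

       The first request reports its response, the second one is
       recognised as failed without anybody having to "die".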

       As the next step in the finger protocol, we send the username to the
       finger daemon on the other side of our connection (the kernel.org
       finger service doesn't actually wait for a username, but the net is
       running out of finger servers fast):

                syswrite $fh, "$user\015\012";

       Note that this isn't 100% clean socket programming - the socket could,
       for whatever reasons, not accept our data. When writing a small amount
       of data like in this example it doesn't matter, as a socket buffer is
       almost always big enough for a mere "username", but for real-world
       cases you might need to implement some kind of write buffering - or use
       AnyEvent::Handle, which handles these matters for you, as shown in the
       next section.

       What we do have to do is implement our own read buffer - the response
       data could arrive late or in multiple chunks, and we cannot just wait
       for it (event-based programming, you know?).

       To do that, we register a read watcher on the socket which waits for
       data:

                my $read_watcher; $read_watcher = AnyEvent->io (
                   fh   => $fh,
                   poll => "r",

       There is a trick here, however: the read watcher isn't stored in a
       global variable, but in a local one - if the callback returns, it would
       normally destroy the variable and its contents, which would in turn
       unregister our watcher.

       To avoid that, we refer to the watcher variable in the watcher
       callback.  This means that, when the "tcp_connect" callback returns,
       perl thinks (quite correctly) that the read watcher is still in use -
       namely inside the inner callback - and thus keeps it alive even if
       nothing else in the program refers to it anymore (it is much like Baron
       Münchhausen keeping himself from dying by pulling himself out of a
       swamp).
680
681       The trick, however, is that instead of:
682
683          my $read_watcher = AnyEvent->io (...
684
685       The program does:
686
687          my $read_watcher; $read_watcher = AnyEvent->io (...
688
689       The reason for this is a quirk in the way Perl works: variable names
690       declared with "my" are only visible in the next statement. If the whole
691       "AnyEvent->io" call, including the callback, would be done in a single
692       statement, the callback could not refer to the $read_watcher variable
693       to "undef"ine it, so it is done in two statements.
694
695       Whether you'd want to format it like this is of course a matter of
696       style.  This way emphasizes that the declaration and assignment really
697       are one logical statement.
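
       The scoping rule can be checked without any event loop. The following
       is a minimal sketch in plain Perl; the hash stored in $watcher is only
       a hypothetical stand-in for a real watcher object, and @log exists
       purely for illustration:

```perl
use strict;
use warnings;

my @log;

# Declare first, assign second, so that the closure can see $watcher.
# Under "use strict", writing "my $watcher = { cb => sub { ... } }" as a
# single statement would not compile if the callback mentions $watcher,
# because the variable is not yet declared inside the callback.
my $watcher; $watcher = {
   cb => sub {
      push @log, defined $watcher ? "visible" : "not visible";

      # drop the reference held by the variable, just like the
      # read callback does once it is done
      undef $watcher;
   },
};

# keep the callback alive while we call it, as an event loop would
my $cb = $watcher->{cb};
$cb->();

push @log, defined $watcher ? "still alive" : "destroyed";

print join (", ", @log), "\n";
```

       The callback can see the variable, and once it undefines it, nothing
       else keeps the stand-in watcher alive.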
698
699       The callback itself calls "sysread" as many times as necessary,
700       until "sysread" returns either an error or end-of-file:
701
702                   cb   => sub {
703                      my $len = sysread $fh, $response, 1024, length $response;
704
705                      if ($len <= 0) {
706
707       Note that "sysread" has the ability to append data it reads to a scalar
708       if we specify an offset, a feature which we make use of in this
709       example.
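
       The append behaviour of "sysread" can be tried in isolation. This is
       a small sketch in plain Perl that assumes a pipe as a stand-in for the
       socket and an artificially small read size of 4 octets, so that
       several calls are needed:

```perl
use strict;
use warnings;

# a pipe stands in for the socket; the writer sends two separate chunks
pipe (my $reader, my $writer)
   or die "pipe failed: $!";

syswrite $writer, "hello, ";
syswrite $writer, "world";
close $writer;

my $response = "";

while (1) {
   # read at most 4 octets and append them at the current end of $response
   my $len = sysread $reader, $response, 4, length $response;

   defined $len or die "sysread failed: $!";
   last if $len == 0; # end-of-file
}

print "$response\n";
```

       Because the offset is always the current length of $response, every
       chunk ends up right behind the previous one.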
710
711       When "sysread" indicates we are done, the callback "undef"ines the
712       watcher and then "send"s the response data to the condition variable.
713       All this has the following effects:
714
715       Undefining the watcher destroys it, as our callback was the only one
716       still having a reference to it. When the watcher gets destroyed, it
717       destroys the callback, which in turn means the $fh handle is no longer
718       used, so that gets destroyed as well. The result is that all resources
719       will be nicely cleaned up by perl for us.
720
721       Using the finger client
722
723       Now, we could probably write the same finger client in a simpler way if
724       we used "IO::Socket::INET", ignored the problem of multiple hosts and
725       ignored IPv6 and a few other things that "tcp_connect" handles for us.
726
727       But the main advantage is that we can not only run this finger function
728       in the background, we can even run multiple sessions in parallel, like
729       this:
730
731          my $f1 = finger "kuriyama", "freebsd.org";
732          my $f2 = finger "icculus?listarchives=1", "icculus.org";
733          my $f3 = finger "mikachu", "icculus.org";
734
735          print "kuriyama's gpg key\n"    , $f1->recv, "\n";
736          print "icculus' plan archive\n" , $f2->recv, "\n";
737          print "mikachu's plan zomgn\n"  , $f3->recv, "\n";
738
739       It doesn't look like it, but in fact all three requests run in
740       parallel. The code waits for the first finger request to finish first,
741       but that doesn't keep it from executing them in parallel: when the first
742       "recv" call sees that the data isn't ready yet, it serves events for
743       all three requests automatically, until the first request has finished.
744
745       The second "recv" call might either find the data is already there, or
746       it will continue handling events until that is the case, and so on.
747
748       By taking advantage of network latencies, which allow us to serve
749       other requests and events while we wait for an event on one socket, the
750       overall time to do these three requests will be greatly reduced:
751       typically, all three are done in the same time as the slowest of the
752       three requests.
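
       The effect can be reproduced without any network access. The sketch
       below assumes a hypothetical "slow_request" function that uses a timer
       in place of a real finger request; the names and delays are made up
       for illustration:

```perl
use strict;
use warnings;
use AnyEvent;

# hypothetical stand-in for finger(): "responds" after $delay seconds
sub slow_request {
   my ($name, $delay) = @_;

   my $cv = AnyEvent->condvar;

   my $timer; $timer = AnyEvent->timer (after => $delay, cb => sub {
      undef $timer;
      $cv->send ("$name done");
   });

   $cv
}

my $start = AnyEvent->time;

# all three "requests" are started before we wait for any of them
my $r1 = slow_request "first",  0.3;
my $r2 = slow_request "second", 0.2;
my $r3 = slow_request "third",  0.1;

my @results = ($r1->recv, $r2->recv, $r3->recv);
my $elapsed = AnyEvent->time - $start;

print "$_\n" for @results;
printf "elapsed: %.2f seconds\n", $elapsed;
```

       The elapsed time should be close to the longest delay rather than
       the sum of all three delays.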
753
754       By the way, you do not actually have to wait in the "recv" method on an
755       AnyEvent condition variable - after all, waiting is evil - you can also
756       register a callback:
757
758          $f1->cb (sub {
759             my $response = shift->recv;
760             # ...
761          });
762
763       The callback will be invoked only when "send" is called. In fact,
764       instead of returning a condition variable you could also pass a third
765       parameter to your finger function, the callback to invoke with the
766       response:
767
768          sub finger($$$) {
769             my ($user, $host, $cb) = @_;
770
771       How you implement it is a matter of taste - if you expect your function
772       to be used mainly in an event-based program you would normally prefer
773       to pass a callback directly. If you write a module and expect your
774       users to use it "synchronously" often (for example, a simple http-get
775       script would not really care much for events), then you would use a
776       condition variable and tell them "simply "->recv" the data".
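
       Both styles can also be offered by the same function. The following
       is only a sketch under assumptions: "deliver_later" is a hypothetical
       helper that uses a timer in place of real I/O, accepts an optional
       callback and always returns the condition variable:

```perl
use strict;
use warnings;
use AnyEvent;

# hypothetical helper: delivers $value from the event loop, either via
# the optional callback or via the returned condition variable
sub deliver_later {
   my ($value, $cb) = @_;

   my $cv = AnyEvent->condvar;
   $cv->cb (sub { $cb->(shift->recv) }) if $cb;

   my $timer; $timer = AnyEvent->timer (after => 0, cb => sub {
      undef $timer;
      $cv->send ($value);
   });

   $cv
}

# callback style, for event-based callers
my $seen;
my $cv1 = deliver_later "via callback", sub { $seen = shift };

# condition variable style, for "synchronous" callers
my $got = deliver_later ("via recv")->recv;

$cv1->recv; # make sure the first request has finished as well

print "$seen\n$got\n";
```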
777
778       Problems with the implementation and how to fix them
779
780       To make this example more real-world-ready, we would not only implement
781       some write buffering (for the paranoid, or maybe denial-of-service
782       aware security expert), but we would also have to handle timeouts and
783       maybe protocol errors.
784
785       Doing this quickly gets unwieldy, which is why we introduce
786       AnyEvent::Handle in the next section, which takes care of all these
787       details for you and lets you concentrate on the actual protocol.
788
789   Implementing simple HTTP and HTTPS GET requests with AnyEvent::Handle
790       The AnyEvent::Handle module has been hyped quite a bit in this document
791       so far, so let's see what it really offers.
792
793       As finger is such a simple protocol, let's try something slightly more
794       complicated: HTTP/1.0.
795
796       An HTTP GET request works by sending a single request line that
797       indicates what you want the server to do and the URI you want it to act
798       on, followed by as many "header" lines ("Header: data", same as e-mail
799       headers) as required for the request, followed by an empty line.
800
801       The response is formatted very similarly, first a line with the
802       response status, then again as many header lines as required, then an
803       empty line, followed by any data that the server might send.
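
       To make the format concrete, here is a small sketch in plain Perl
       that builds such a request and takes a response apart. The "Host"
       header, the host name and the canned response text are assumptions
       chosen for illustration only:

```perl
use strict;
use warnings;

my $CRLF = "\015\012";

# request: request line, any number of header lines, then an empty line
my $request = join $CRLF,
   "GET /test HTTP/1.0",
   "Host: www.example.com",
   "", "";

# a canned response, as a server might send it
my $raw = join $CRLF,
   "HTTP/1.0 404 Not Found",
   "Content-Type: text/html; charset=UTF-8",
   "",
   "<html><head>[...]";

# the first empty line separates status line and headers from the data
my ($head, $body)      = split /$CRLF$CRLF/, $raw, 2;
my ($status, @headers) = split /$CRLF/, $head;

print "status: $status\n";
print "header: $_\n" for @headers;
print "body:   $body\n";
```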
804
805       Again, let's try it out with "telnet" (I condensed the output a bit -
806       if you want to see the full response, do it yourself).
807
808          # telnet www.google.com 80
809          Trying 209.85.135.99...
810          Connected to www.google.com (209.85.135.99).
811          Escape character is '^]'.
812          GET /test HTTP/1.0
813
814          HTTP/1.0 404 Not Found
815          Date: Mon, 02 Jun 2008 07:05:54 GMT
816          Content-Type: text/html; charset=UTF-8
817
818          <html><head>
819          [...]
820          Connection closed by foreign host.
821
822       The "GET ..." and the empty line were entered manually, the rest of the
823       telnet output is Google's response, in this case a "404 not found" one.
824
825       So, here is how you would do it with "AnyEvent::Handle":
826
827          sub http_get {
828             my ($host, $uri, $cb) = @_;
829
830             # store results here
831             my ($response, $header, $body);
832
833             my $handle; $handle = new AnyEvent::Handle
834                connect  => [$host => 'http'],
835                on_error => sub {
836                   $cb->("HTTP/1.0 500 $!");
837                   $handle->destroy; # explicitly destroy handle
838                },
839                on_eof   => sub {
840                   $cb->($response, $header, $body);
841                   $handle->destroy; # explicitly destroy handle
842                };
843
844             $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");
845
846             # now fetch response status line
847             $handle->push_read (line => sub {
848                my ($handle, $line) = @_;
849                $response = $line;
850             });
851
852             # then the headers
853             $handle->push_read (line => "\015\012\015\012", sub {
854                my ($handle, $line) = @_;
855                $header = $line;
856             });
857
858             # and finally handle any remaining data as body
859             $handle->on_read (sub {
860                $body .= $_[0]->rbuf;
861                $_[0]->rbuf = "";
862             });
863          }
864
865       And now let's go through it step by step. First, as usual, the overall
866       "http_get" function structure:
867
868          sub http_get {
869             my ($host, $uri, $cb) = @_;
870
871             # store results here
872             my ($response, $header, $body);
873
874             my $handle; $handle = new AnyEvent::Handle
875                ... create handle object
876
877             ... push data to write
878
879             ... push what to expect to read queue
880          }
881
882       Unlike in the finger example, this time the caller has to pass a
883       callback to "http_get". Also, instead of passing a URL as one would
884       expect, the caller has to provide the hostname and URI - normally you
885       would use the "URI" module to parse a URL and separate it into those
886       parts, but that is left to the inspired reader :)
887
888       Since everything else is left to the caller, all "http_get" does is
889       initiate the connection by creating the AnyEvent::Handle object (which
890       calls "tcp_connect" for us) and leave everything else to its callback.
891
892       The handle object is created, unsurprisingly, by calling the "new"
893       method of AnyEvent::Handle:
894
895             my $handle; $handle = new AnyEvent::Handle
896                connect  => [$host => 'http'],
897                on_error => sub {
898                   $cb->("HTTP/1.0 500 $!");
899                   $handle->destroy; # explicitly destroy handle
900                },
901                on_eof   => sub {
902                   $cb->($response, $header, $body);
903                   $handle->destroy; # explicitly destroy handle
904                };
905
906       The "connect" argument tells AnyEvent::Handle to call "tcp_connect" for
907       the specified host and service/port.
908
909       The "on_error" callback will be called on any unexpected error, such as
910       a refused connection, or unexpected end-of-file while reading headers.
911
912       Instead of having an extra mechanism to signal errors, connection
913       errors are signalled by crafting a special "response status line", like
914       this:
915
916          HTTP/1.0 500 Connection refused
917
918       This means the caller cannot distinguish (easily) between locally-
919       generated errors and server errors, but it simplifies error handling
920       for the caller a lot.
921
922       The error callback also destroys the handle explicitly, because we are
923       not interested in continuing after any errors. In AnyEvent::Handle
924       callbacks you have to call "destroy" explicitly to destroy a handle.
925       Outside of those callbacks you can just forget the object reference and
926       it will be automatically cleaned up.
927
928       Last but not least, we set an "on_eof" callback that is called when the
929       other side indicates it has stopped writing data, which we will use to
930       gracefully shut down the handle and report the results. This callback
931       is only called when the read queue is empty - if the read queue expects
932       some data and the handle gets an EOF from the other side this will be
933       an error - after all, you did expect more to come.
934
935       If you wanted to write a server using AnyEvent::Handle, you would use
936       "tcp_server" and then create the AnyEvent::Handle with the "fh"
937       argument.
938
939       The write queue
940
941       The next line sends the actual request:
942
943          $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");
944
945       No headers will be sent (this is fine for simple requests), so the
946       whole request is just a single line followed by an empty line to signal
947       the end of the headers to the server.
948
949       The more interesting question is why the method is called "push_write"
950       and not just "write". The reason is that you can always add some write
951       data without blocking, and to do this, AnyEvent::Handle needs some
952       write queue internally - and "push_write" pushes some data onto the end
953       of that queue, just like Perl's "push" pushes data onto the end of an
954       array.
955
956       The deeper reason is that at some point in the future, there might be
957       "unshift_write" as well, and in any case, we will shortly meet
958       "push_read" and "unshift_read", and it's usually easiest to remember if
959       all those functions have some symmetry in their name. So "push" is used
960       as the opposite of "unshift" in AnyEvent::Handle, not as the opposite
961       of "pull" - just like in Perl.
962
963       Note that we call "push_write" right after creating the
964       AnyEvent::Handle object, before it has had time to actually connect to
965       the server. This is fine: pushing the read and write requests will
966       queue them in the handle object until the connection has been
967       established. Alternatively, we could do this "on demand" in the
968       "on_connect" callback.
969
970       If "push_write" is called with more than one argument, then you can do
971       formatted I/O. For example, this would JSON-encode your data before
972       pushing it to the write queue:
973
974          $handle->push_write (json => [1, 2, 3]);
975
976       This pretty much summarises the write queue; there is little else to
977       it.
978
979       Reading the response is far more interesting, because it involves the
980       more powerful and complex read queue:
981
982       The read queue
983
984       The response consists of three parts: a single line with the response
985       status, a single paragraph of headers ended by an empty line, and the
986       response body, which is the remaining data on the connection.
987
988       For the first two, we push two read requests onto the read queue:
989
990          # now fetch response status line
991          $handle->push_read (line => sub {
992             my ($handle, $line) = @_;
993             $response = $line;
994          });
995
996          # then the headers
997          $handle->push_read (line => "\015\012\015\012", sub {
998             my ($handle, $line) = @_;
999             $header = $line;
1000          });
1001
1002       While one can just push a single callback to parse all the data on the
1003       queue, formatted I/O really comes to our aid here, since there is a
1004       ready-made "read line" read type. The first read expects a single line,
1005       ended by "\015\012" (the standard end-of-line marker in internet
1006       protocols).
1007
1008       The second "line" is actually a single paragraph - instead of reading
1009       it line by line we tell "push_read" that the end-of-line marker is
1010       really "\015\012\015\012", which is an empty line. The result is that
1011       the whole header paragraph will be treated as a single line and read.
1012       The word "line" is interpreted very freely, much like Perl itself does
1013       it.
1014
1015       Note that the read requests are pushed immediately after creating the
1016       handle object - since AnyEvent::Handle provides a queue we can push as
1017       many requests as we want, and AnyEvent::Handle will handle them in
1018       order.
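
       The ordering can be tried locally. This is a minimal sketch, assuming
       a socket pair from AnyEvent::Util as a stand-in for the network
       connection and a fake "server" handle that sends a canned status line
       and headers (the "X-Demo" header is made up):

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Handle;
use AnyEvent::Util qw(portable_socketpair);

# a local socket pair stands in for the network connection
my ($fh1, $fh2) = portable_socketpair
   or die "unable to create socket pair: $!";

my $done = AnyEvent->condvar;
my ($status, $header);

my $server = AnyEvent::Handle->new (fh => $fh1);
my $client = AnyEvent::Handle->new (
   fh       => $fh2,
   on_error => sub {
      my ($handle, $fatal, $message) = @_;
      $done->croak ($message);
   },
);

# queue both read requests up front, they are served in this order
$client->push_read (line => sub {
   my ($handle, $line) = @_;
   $status = $line;
});

$client->push_read (line => "\015\012\015\012", sub {
   my ($handle, $line) = @_;
   $header = $line;
   $done->send;
});

# the fake server sends a canned status line and two headers
$server->push_write (
   "HTTP/1.0 200 OK\015\012"
   . "Content-Type: text/plain\015\012"
   . "X-Demo: 1\015\012"
   . "\015\012"
);

$done->recv;

print "status: $status\n";
print "header: $_\n" for split /\015\012/, $header;
```

       The first callback receives only the status line, the second one
       the whole header paragraph without the terminating empty line.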
1019
1020       There is, however, no read type for "the remaining data". For that, we
1021       install our own "on_read" callback:
1022
1023          # and finally handle any remaining data as body
1024          $handle->on_read (sub {
1025             $body .= $_[0]->rbuf;
1026             $_[0]->rbuf = "";
1027          });
1028
1029       This callback is invoked every time data arrives and the read queue is
1030       empty - which in this example will only be the case when both response
1031       and header have been read. The "on_read" callback could actually have
1032       been specified when constructing the object, but doing it this way
1033       preserves logical ordering.
1034
1035       The read callback adds the current read buffer to its $body variable
1036       and, most importantly, empties the buffer by assigning the empty string
1037       to it.
1038
1039       Given these instructions, AnyEvent::Handle will handle incoming data -
1040       if all goes well, the callback will be invoked with the response data;
1041       if not, it will get an error.
1042
1043       In general, you can implement pipelining (a semi-advanced feature of
1044       many protocols) very easily with AnyEvent::Handle: If you have a
1045       protocol with a request/response structure, your request
1046       methods/functions will all look like this (simplified):
1047
1048          sub request {
1049
1050             # send the request to the server
1051             $handle->push_write (...);
1052
1053             # push some response handlers
1054             $handle->push_read (...);
1055          }
1056
1057       This means you can queue as many requests as you want, and while
1058       AnyEvent::Handle goes through its read queue to handle the response
1059       data, the other side can work on the next request - queueing the
1060       request just appends some data to the write queue and installs a
1061       handler to be called later.
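
       A complete, if artificial, instance of this pattern might look like
       the sketch below. It assumes a toy line-based protocol invented for
       this example (the "server" answers every line with its upper-case
       version) and a local socket pair instead of a real connection:

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Handle;
use AnyEvent::Util qw(portable_socketpair);

my ($fh1, $fh2) = portable_socketpair
   or die "unable to create socket pair: $!";

my $done = AnyEvent->condvar;

# toy server: answers every request line with its upper-case version
my $server = AnyEvent::Handle->new (
   fh       => $fh1,
   on_error => sub { $done->croak ($_[2]) },
   on_read  => sub {
      $_[0]->unshift_read (line => sub {
         my ($handle, $line) = @_;
         $handle->push_write (uc ($line) . "\015\012");
      });
   },
);

my $client = AnyEvent::Handle->new (
   fh       => $fh2,
   on_error => sub { $done->croak ($_[2]) },
);

# one request: append to the write queue, install a response handler
sub request {
   my ($text, $cb) = @_;

   $client->push_write ("$text\015\012");
   $client->push_read (line => sub { $cb->($_[1]) });
}

my @replies;

# queue all three requests before any response has arrived
for my $text (qw(alpha beta gamma)) {
   request $text, sub {
      push @replies, shift;
      $done->send if @replies == 3;
   };
}

$done->recv;

print "$_\n" for @replies;
```

       As the read queue is served in order, each response ends up in the
       callback of the request it belongs to.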
1062
1063       You might ask yourself how to handle decisions you can only make after
1064       you have received some data (such as handling a short error response or
1065       a long and differently-formatted response). The answer to this problem
1066       is "unshift_read", which we will introduce together with an example in
1067       the coming sections.
1068
1069       Using "http_get"
1070
1071       Finally, here is how you would use "http_get":
1072
1073          http_get "www.google.com", "/", sub {
1074             my ($response, $header, $body) = @_;
1075
1076             print
1077                $response, "\n",
1078                $body;
1079          };
1080
1081       And of course, you can run as many of these requests in parallel as you
1082       want (and your memory supports).
1083
1084       HTTPS
1085
1086       Now, as promised, let's implement the same thing for HTTPS, or more
1087       correctly, let's change our "http_get" function into a function that
1088       speaks HTTPS instead.
1089
1090       HTTPS is a standard TLS connection (Transport Layer Security is the
1091       official name for what most people refer to as "SSL") that contains
1092       standard HTTP protocol exchanges. The only other difference to HTTP is
1093       that by default it uses port 443 instead of port 80.
1094
1095       To implement these two differences we need two tiny changes, first, in
1096       the "connect" parameter, we replace "http" by "https" to connect to the
1097       https port:
1098
1099                connect  => [$host => 'https'],
1100
1101       The other change deals with TLS, which is something AnyEvent::Handle
1102       does for us if the Net::SSLeay module is available. To enable TLS with
1103       AnyEvent::Handle, we pass an additional "tls" parameter to the call to
1104       "AnyEvent::Handle::new":
1105
1106                tls      => "connect",
1107
1108       Specifying "tls" enables TLS, and the argument specifies whether
1109       AnyEvent::Handle is the server side ("accept") or the client side
1110       ("connect") for the TLS connection, as unlike TCP, there is a clear
1111       server/client relationship in TLS.
1112
1113       That's all.
1114
1115       Of course, all this should be handled transparently by "http_get" after
1116       parsing the URL. If you need this, see the part about exercising your
1117       inspiration earlier in this document. You could also use the
1118       AnyEvent::HTTP module from CPAN, which implements all this and works
1119       around a lot of quirks for you too.
1120
1121       The read queue - revisited
1122
1123       HTTP always uses the same structure in its responses, but many
1124       protocols require parsing responses differently depending on the
1125       response itself.
1126
1127       For example, in SMTP, you normally get a single response line:
1128
1129          220 mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
1130
1131       But SMTP also supports multi-line responses:
1132
1133          220-mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
1134          220-hey guys
1135          220 my response is longer than yours
1136
1137       To handle this, we need "unshift_read". As the name (we hope) implies,
1138       "unshift_read" will not append your read request to the end of the read
1139       queue, but will prepend it to the queue instead.
1140
1141       This is useful in the situation above: Just push your response-line
1142       read request when sending the SMTP command, and when handling it, you
1143       look at the line to see if more is to come, and "unshift_read" another
1144       reader callback if required, like this:
1145
1146          my $response; # response lines end up in here
1147
1148          my $read_response; $read_response = sub {
1149             my ($handle, $line) = @_;
1150
1151             $response .= "$line\n";
1152
1153             # check for continuation lines ("-" as 4th character)
1154             if ($line =~ /^...-/) {
1155                # if yes, then unshift another line read
1156                $handle->unshift_read (line => $read_response);
1157
1158             } else {
1159                # otherwise we are done
1160
1161                # free callback
1162                undef $read_response;
1163
1164                print "we are done reading: $response\n";
1165             }
1166          };
1167
1168          $handle->push_read (line => $read_response);
1169
1170       This recipe can be used for all similar parsing problems, for example
1171       in NNTP, the response code to some commands indicates that more data
1172       will be sent:
1173
1174          $handle->push_write ("article 42\015\012");
1175
1176          # read response line
1177          $handle->push_read (line => sub {
1178             my ($handle, $status) = @_;
1179
1180             # article data following?
1181             if ($status =~ /^2/) {
1182                # yes, read article body
1183
1184                $handle->unshift_read (line => "\012.\015\012", sub {
1185                   my ($handle, $body) = @_;
1186
1187                   $finish->($status, $body);
1188                });
1189
1190             } else {
1191                # some error occurred, no article data
1192
1193                $finish->($status);
1194             }
1195          });
1196
1197       Your own read queue handler
1198
1199       Sometimes your protocol doesn't play nice, and uses lines or chunks of
1200       data not formatted in a way handled out of the box by AnyEvent::Handle.
1201       In this case you have to implement your own read parser.
1202
1203       To make up a contorted example, imagine you are looking for an even
1204       number of characters followed by a colon (":"). Also imagine that
1205       AnyEvent::Handle has no "regex" read type which could be used, so you'd
1206       have to do it manually.
1207
1208       To implement a read handler for this, you would "push_read" (or
1209       "unshift_read") a single code reference.
1210
1211       This code reference will then be called each time there is (new) data
1212       available in the read buffer, and is expected to either successfully
1213       eat/consume some of that data (and return true) or to return false to
1214       indicate that it wants to be called again.
1215
1216       If the code reference returns true, then it will be removed from the
1217       read queue (because it has parsed/consumed whatever it was supposed to
1218       consume), otherwise it stays in the front of it.
1219
1220       The example above could be coded like this:
1221
1222          $handle->push_read (sub {
1223             my ($handle) = @_;
1224
1225             # check for even number of characters + ":"
1226             # and remove the data if a match is found.
1227             # if not, return false (actually nothing)
1228
1229             $handle->{rbuf} =~ s/^( (?:..)* ) ://x
1230                or return;
1231
1232             # we got some data in $1, pass it to whoever wants it
1233             $finish->($1);
1234
1235             # and return true to indicate we are done
1236             1
1237          });
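
       The return value protocol can be observed without a handle. The
       following sketch in plain Perl assumes a scalar $rbuf in place of
       "$handle->{rbuf}" and feeds it three made-up chunks, calling the
       parser after each one just as AnyEvent::Handle would:

```perl
use strict;
use warnings;

my $rbuf = "";   # stands in for $handle->{rbuf}
my @found;

# same logic as the read handler above, minus the handle
my $parser = sub {
   $rbuf =~ s/^( (?:..)* ) ://x
      or return;

   push @found, $1;
   1
};

my @results;

# the data arrives in three chunks
for my $chunk ("ab", "cd", ":rest") {
   $rbuf .= $chunk;
   push @results, $parser->() ? "done" : "call again";
}

print "@results\n";
print "found: @found, left in buffer: $rbuf\n";
```

       The parser returns false until the colon has arrived, then consumes
       exactly the matched data and leaves the rest in the buffer for
       whatever is next in the read queue.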
1238

Debugging aids

1240       Now that you have seen how to use AnyEvent, here's what to use when you
1241       don't use it correctly, or simply hit a bug somewhere and want to debug
1242       it:
1243
1244       Enable strict argument checking during development
1245           AnyEvent does not, by default, do any argument checking. This can
1246           lead to strange and unexpected results especially if you are just
1247           trying to find your way with AnyEvent.
1248
1249           AnyEvent supports a special "strict" mode - off by default - which
1250           does very strict argument checking, at the expense of slowing down
1251           your program. During development, however, this mode is very useful
1252           because it quickly catches the most common errors.
1253
1254           You can enable this strict mode either by having an environment
1255           variable "AE_STRICT" with a true value in your environment:
1256
1257              AE_STRICT=1 perl myprog
1258
1259           Or you can write "use AnyEvent::Strict" in your program, which has
1260           the same effect (do not do this in production, however).
1261
1262       Increase verbosity, configure logging
1263           AnyEvent, by default, only logs critical messages. If something
1264           doesn't work, maybe there was a warning about it that you didn't
1265           see because it was suppressed.
1266
1267           So during development it is recommended to push up the logging
1268           level to at least warn level (5):
1269
1270              AE_VERBOSE=5 perl myprog
1271
1272           Other levels that might be helpful are debug (8) or even trace (9).
1273
1274           AnyEvent's logging is quite versatile - the AnyEvent::Log manpage
1275           has all the details.
1276
1277       Watcher wrapping, tracing, the shell
1278           For even more debugging, you can enable watcher wrapping:
1279
1280             AE_DEBUG_WRAP=2 perl myprog
1281
1282           This will have the effect of wrapping every watcher into a special
1283           object that stores a backtrace of when it was created, stores a
1284           backtrace when an exception occurs during watcher execution, and
1285           stores a lot of other information. If that slows down your program
1286           too much, then "AE_DEBUG_WRAP=1" avoids the costly backtraces.
1287
1288           Here is an example of what kind of information is stored:
1289
1290              59148536 DC::DB:472(Server::run)>io>DC::DB::Server::fh_read
1291              type:    io watcher
1292              args:    poll r fh GLOB(0x35283f0)
1293              created: 2011-09-01 23:13:46.597336 +0200 (1314911626.59734)
1294              file:    ./blib/lib/Deliantra/Client/private/DC/DB.pm
1295              line:    472
1296              subname: DC::DB::Server::run
1297              context:
1298              tracing: enabled
1299              cb:      CODE(0x2d1fb98) (DC::DB::Server::fh_read)
1300              invoked: 0 times
1301              created
1302              (eval 25) line 6        AnyEvent::Debug::Wrap::__ANON__('AnyEvent','fh',GLOB(0x35283f0),'poll','r','cb',CODE(0x2d1fb98)=DC::DB::Server::fh_read)
1303              DC::DB line 472         AE::io(GLOB(0x35283f0),'0',CODE(0x2d1fb98)=DC::DB::Server::fh_read)
1304              bin/deliantra line 2776 DC::DB::Server::run()
1305              bin/deliantra line 2941 main::main()
1306
1307           There are many ways to get at this data - see the AnyEvent::Debug
1308           and AnyEvent::Log manpages for more details.
1309
1310           The most interesting and interactive way is to create a debug
1311           shell, for example by setting "AE_DEBUG_SHELL":
1312
1313             AE_DEBUG_WRAP=2 AE_DEBUG_SHELL=$HOME/myshell ./myprog
1314
1315             # while myprog is running:
1316             socat readline $HOME/myshell
1317
1318           Note that anybody who can access $HOME/myshell can make your
1319           program do anything he or she wants, so if you are not the only
1320           user on your machine, better put it into a secure location ($HOME
1321           might not be secure enough).
1322
1323           If you don't have "socat" (a shame!) and care even less about
1324           security, you can also use TCP and "telnet":
1325
1326             AE_DEBUG_WRAP=2 AE_DEBUG_SHELL=127.0.0.1:1234 ./myprog
1327
1328             telnet 127.0.0.1 1234
1329
1330           The debug shell can enable and disable tracing of watcher
1331           invocations, can display the trace output, give you a list of
1332           watchers and lets you investigate watchers in detail.
1333
1334       This concludes our little tutorial.
1335

Where to go from here?

1337       This introduction should have explained the key concepts of AnyEvent -
1338       event watchers and condition variables, AnyEvent::Socket - basic
1339       networking utilities, and AnyEvent::Handle - a nice wrapper around
1340       sockets.
1341
1342       You could either start coding stuff right away, look at those manual
1343       pages for the gory details, or roam CPAN for other AnyEvent modules
1344       (such as AnyEvent::IRC or AnyEvent::HTTP) to see more code examples (or
1345       simply to use them).
1346
1347       If you need a protocol that doesn't have an implementation using
1348       AnyEvent, remember that you can mix AnyEvent with one other event
1349       framework, such as POE, so you can always use AnyEvent for your own
1350       tasks plus modules of one other event framework to fill any gaps.
1351
1352       And last but not least, you could also look at Coro, especially
1353       Coro::AnyEvent, to see how you can turn event-based programming from
1354       callback style back to the usual imperative style (also called
1355       "inversion of control" - AnyEvent calls you, but Coro lets you call
1356       AnyEvent).
1357

Authors

1359       Robin Redeker "<elmex at ta-sa.org>", Marc Lehmann
1360       <schmorp@schmorp.de>.
1361
1362
1363
1364perl v5.28.1                      2014-11-19                AnyEvent::Intro(3)