AnyEvent::Intro(3)    User Contributed Perl Documentation    AnyEvent::Intro(3)

AnyEvent::Intro - an introductory tutorial to AnyEvent

This is a tutorial that will introduce you to the features of AnyEvent.

The first part introduces the core AnyEvent module (after swamping you
a bit in evangelism), which might already provide all you ever need: If
you are only interested in AnyEvent's event handling capabilities, read
no further.

The second part focuses on network programming using sockets, for which
AnyEvent offers a lot of support you can use, and a lot of workarounds
around portability quirks.

If you don't care for the whys and want to see code, skip this section!

AnyEvent is first of all just a framework to do event-based
programming. Typically such frameworks are an all-or-nothing thing: If
you use one such framework, you can't (easily, or even at all) use
another in the same program.

AnyEvent is different - it is a thin abstraction layer on top of other
event loops, just like DBI is an abstraction of many different database
APIs. Its main purpose is to move the choice of the underlying
framework (the event loop) from the module author to the program author
using the module.

That means you can write code that uses events to control what it does,
without forcing other code in the same program to use the same
underlying framework as you do - i.e. you can create a Perl module that
is event-based using AnyEvent, and users of that module can still
choose between using Gtk2, Tk, Event (or run inside Irssi or
rxvt-unicode) or any other supported event loop. AnyEvent even comes
with its own pure-perl event loop implementation, so your code works
regardless of other modules that might or might not be installed. The
latter is important, as AnyEvent does not have any hard dependencies on
other modules, which makes it easy to install, for example, when you
lack a C compiler. No matter what environment, AnyEvent will just cope
with it.

A typical limitation of existing Perl modules such as Net::IRC is that
they come with their own event loop: In Net::IRC, a program which uses
it needs to start the event loop of Net::IRC. That means that one
cannot integrate this module into a Gtk2 GUI for instance, as that
module, too, enforces the use of its own event loop (namely Glib).

Another example is LWP: it provides no event interface at all. It's a
pure blocking HTTP (and FTP etc.) client library, which usually means
that you either have to start another process or have to fork for an
HTTP request, or use threads (e.g. Coro::LWP), if you want to do
something else while waiting for the request to finish.

The motivation behind these designs is often that a module doesn't want
to depend on some complicated XS-module (Net::IRC), or that it doesn't
want to force the user to use some specific event loop at all (LWP),
out of fear of severely limiting the usefulness of the module: If your
module requires Glib, it will not run in a Tk program.

AnyEvent solves this dilemma, by not forcing module authors to either:

- write their own event loop (because it guarantees the availability of
  an event loop everywhere - even on Windows with no extra modules
  installed).
- choose one specific event loop (because AnyEvent works with most
  event loops available for Perl).

If the module author uses AnyEvent for all his (or her) event needs (IO
events, timers, signals, ...) then all other modules can just use his
module and don't have to choose an event loop or adapt to his event
loop. The choice of the event loop is ultimately made by the program
author who uses all the modules and writes the main program. And even
there he doesn't have to choose, he can just let AnyEvent choose the
most efficient event loop available on the system.

Read more about this in the main documentation of the AnyEvent module.

So what exactly is programming using events? It quite simply means that
instead of your code actively waiting for something, such as the user
entering something on STDIN:

   $| = 1; print "enter your name> ";

   my $name = <STDIN>;

You instead tell your event framework to notify you in the event of
some data being available on STDIN, by using a callback mechanism:

   use AnyEvent;

   $| = 1; print "enter your name> ";

   my $name;

   my $wait_for_input = AnyEvent->io (
      fh   => \*STDIN, # which file handle to check
      poll => "r",     # which event to wait for ("r"ead data)
      cb   => sub {    # what callback to execute
         $name = <STDIN>; # read it
      }
   );

   # do something else here

Looks more complicated, and surely is, but the advantage of using
events is that your program can do something else instead of waiting
for input (side note: combining AnyEvent with a thread package such as
Coro can recoup much of the simplicity, effectively getting the best of
two worlds).

Waiting as done in the first example is also called "blocking" the
process because you "block"/keep your process from executing anything
else while you do so.

The second example avoids blocking by only registering interest in a
read event, which is fast and doesn't block your process. The callback
will be called only when data is available and can be read without
blocking.

The "interest" is represented by an object returned by "AnyEvent->io"
called a "watcher" object - thus named because it "watches" your file
handle (or other event sources) for the event you are interested in.

In the example above, we create an I/O watcher by calling the
"AnyEvent->io" method. A lack of further interest in some event is
expressed by simply forgetting about its watcher, for example by
"undef"-ing the only variable it is stored in. AnyEvent will
automatically clean up the watcher if it is no longer used, much like
Perl closes your file handles if you no longer use them anywhere.
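
To see the "forgetting cancels the watcher" rule in action, here is a
minimal sketch. It uses timer watchers and a condition variable (both
explained later in this tutorial) only because they are easy to run
without typing anything; the variable names are made up for this
illustration.

```perl
use strict;
use warnings;
use AnyEvent;

my $done  = AnyEvent->condvar;
my $fired = 0;

# this watcher would invoke its callback after 0.1 seconds ...
my $cancelled = AnyEvent->timer (after => 0.1, cb => sub { $fired++ });

# ... but forgetting the only reference cancels it before it can fire
undef $cancelled;

# a second timer ends the wait, well after the first one was due
my $stop = AnyEvent->timer (after => 0.3, cb => sub { $done->send });

$done->recv;

print "cancelled timer fired $fired time(s)\n";
```

The same applies to I/O watchers: once nothing refers to the watcher
object anymore, its callback will not be invoked again.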

A short note on callbacks

A common issue that hits people is the problem of passing parameters to
callbacks. Programmers used to languages such as C or C++ are often
used to a style where one passes the address of a function (a function
reference) and some data value, e.g.:

   sub callback {
      my ($arg) = @_;

      $arg->method;
   }

   my $arg = ...;

   call_me_back_later \&callback, $arg;

This is clumsy, as the place where behaviour is specified (when the
callback is registered) is often far away from the place where
behaviour is implemented. It also doesn't use Perl syntax to invoke the
code. There is also an abstraction penalty to pay as one has to name
the callback, which often is unnecessary and leads to nonsensical or
duplicated names.

In Perl, one can specify behaviour much more directly by using
closures. Closures are code blocks that take a reference to the
enclosing scope(s) when they are created. This means lexical variables
in scope when a closure is created can be used inside the closure:

   my $arg = ...;

   call_me_back_later sub { $arg->method };

Under most circumstances, closures are faster, use fewer resources and
result in much clearer code than the traditional approach. Faster,
because parameter passing and storing them in local variables in Perl
is relatively slow. Fewer resources, because closures take references
to existing variables without having to create new ones, and clearer
code because it is immediately obvious that the second example calls
the "method" method when the callback is invoked.

Apart from these, the strongest argument for using closures with
AnyEvent is that AnyEvent does not allow passing parameters to the
callback, so closures are the only way to achieve that in most cases
:->
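
If closures are new to you, the following sketch shows the idea in
plain Perl, without any event loop. The "call_me_back_later" function
here is only a hypothetical stand-in that stores callbacks in an array;
it is not part of AnyEvent.

```perl
use strict;
use warnings;

# hypothetical stand-in for an event framework: it only stores callbacks
my @pending;
sub call_me_back_later { push @pending, $_[0] }

for my $name (qw(alice bob)) {
   # each closure captures its own $name - no extra argument is passed
   call_me_back_later (sub { "hello, $name" });
}

# later, the "framework" invokes the callbacks without any parameters
my @greetings = map { $_->() } @pending;

print "$_\n" for @greetings;
```

Each callback still knows "its" $name when it is finally invoked, even
though the code invoking it passes nothing at all.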

A little hint to catch mistakes

AnyEvent does not check the parameters you pass in, at least not by
default. To enable checking, simply start your program with
"AE_STRICT=1" in the environment, or put "use AnyEvent::Strict" near
the top of your program:

   AE_STRICT=1 perl myprogram

You can find more info on this and additional debugging aids later in
this introduction.

Condition Variables
Back to the I/O watcher example: The code is not yet a fully working
program, and will not work as-is. The reason is that your callback will
not be invoked out of the blue; you have to run the event loop first.
Also, event-based programs need to block sometimes too, such as when
there is nothing to do, and everything is waiting for new events to
arrive.

In AnyEvent, this is done using condition variables. Condition
variables are named "condition variables" because they represent a
condition that is initially false and needs to be fulfilled.

You can also call them "merge points", "sync points", "rendezvous
ports" or even callbacks and many other things (and they are often
called these names in other frameworks). The important point is that
you can create them freely and later wait for them to become true.

Condition variables have two sides - one side is the "producer" of the
condition (whatever code detects and flags the condition), the other
side is the "consumer" (the code that waits for that condition).

In our example in the previous section, the producer is the event
callback and there is no consumer yet - let's change that right now:

   use AnyEvent;

   $| = 1; print "enter your name> ";

   my $name;

   my $name_ready = AnyEvent->condvar;

   my $wait_for_input = AnyEvent->io (
      fh   => \*STDIN,
      poll => "r",
      cb   => sub {
         $name = <STDIN>;
         $name_ready->send;
      }
   );

   # do something else here

   # now wait until the name is available:
   $name_ready->recv;

   undef $wait_for_input; # watcher no longer needed

   print "your name is $name\n";

This program creates an AnyEvent condvar by calling the
"AnyEvent->condvar" method. It then creates a watcher as usual, but
inside the callback it "send"s the $name_ready condition variable,
which causes whoever is waiting on it to continue.

The "whoever" in this case is the code that follows, which calls
"$name_ready->recv": The producer calls "send", the consumer calls
"recv".

If there is no $name available yet, then the call to
"$name_ready->recv" will halt your program until the condition becomes
true.

As the names "send" and "recv" imply, you can actually send and receive
data using this, for example, the above code could also be written like
this, without an extra variable to store the name in:

   use AnyEvent;

   $| = 1; print "enter your name> ";

   my $name_ready = AnyEvent->condvar;

   my $wait_for_input = AnyEvent->io (
      fh => \*STDIN, poll => "r",
      cb => sub { $name_ready->send (scalar <STDIN>) }
   );

   # do something else here

   # now wait and fetch the name
   my $name = $name_ready->recv;

   undef $wait_for_input; # watcher no longer needed

   print "your name is $name\n";

You can pass any number of arguments to "send", and every subsequent
call to "recv" will return them.
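
As a small sketch of that last point (the values are made up for
illustration), the producer below sends two values before anybody
waits, and the consumer fetches them twice:

```perl
use strict;
use warnings;
use AnyEvent;

my $cv = AnyEvent->condvar;

# producer side: flag the condition and attach two values to it
$cv->send ("Larry", 42);

# consumer side: the condition is already true, so recv returns at once
my ($name, $answer) = $cv->recv;

# calling recv again simply returns the same values
my ($name_again) = $cv->recv;

print "$name, $answer, $name_again\n";
```

In scalar context, "recv" returns only the first value, which is why
the earlier example can write "my $name = $name_ready->recv".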

The "main loop"
Most event-based frameworks have something called a "main loop" or
"event loop run function" or something similar.

Just like "recv" in AnyEvent, these functions need to be called
eventually so that your event loop has a chance of actually looking for
the events you are interested in.

For example, in a Gtk2 program, the above example could also be written
like this:

   use Gtk2 -init;
   use AnyEvent;

   ############################################
   # create a window and some label

   my $window = new Gtk2::Window "toplevel";
   $window->add (my $label = new Gtk2::Label "soon replaced by name");

   $window->show_all;

   ############################################
   # do our AnyEvent stuff

   $| = 1; print "enter your name> ";

   my $name_ready = AnyEvent->condvar;

   my $wait_for_input = AnyEvent->io (
      fh => \*STDIN, poll => "r",
      cb => sub {
         # set the label
         $label->set_text (scalar <STDIN>);
         print "enter another name> ";
      }
   );

   ############################################
   # Now enter Gtk2's event loop

   main Gtk2;

No condition variable anywhere in sight - instead, we just read a line
from STDIN and replace the text in the label. In fact, since nobody
"undef"s $wait_for_input you can enter multiple lines.

Instead of waiting for a condition variable, the program enters the
Gtk2 main loop by calling "Gtk2->main", which will block the program
and wait for events to arrive.

This also shows that AnyEvent is quite flexible - you didn't have to do
anything to make the AnyEvent watcher use Gtk2 (actually Glib) - it
just worked.

Admittedly, the example is a bit silly - who would want to read names
from standard input in a Gtk+ application? But imagine that instead of
doing that, you make an HTTP request in the background and display its
results. In fact, with event-based programming you can make many HTTP
requests in parallel in your program and still provide feedback to the
user and stay interactive.

And in the next part you will see how to do just that - by implementing
an HTTP request, on our own, with the utility modules AnyEvent comes
with.

Before that, however, let's briefly look at how you would write your
program using only AnyEvent, without ever calling some other event
loop's run function.

In the example using condition variables, we used those to start
waiting for events, and in fact, condition variables are the solution:

   my $quit_program = AnyEvent->condvar;

   # create AnyEvent watchers (or not) here

   $quit_program->recv;

If any of your watcher callbacks decide to quit (this is often called
an "unloop" in other frameworks), they can just call
"$quit_program->send". Of course, they could also decide not to and
call "exit" instead, or they could decide never to quit (e.g. in a
long-running daemon program).

If you don't need some clean quit functionality and just want to run
the event loop, you can do this:

   AnyEvent->condvar->recv;

And this is, in fact, the closest to the idea of a main loop run
function that AnyEvent offers.
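
Here is a sketch of such a "clean quit" on a Unix-like system: a signal
watcher (one of the watcher types mentioned later) ends the main wait
when the process receives SIGTERM. So that the example terminates by
itself, a timer sends the signal to our own process; the safety timeout
and all variable names are assumptions made for this illustration.

```perl
use strict;
use warnings;
use AnyEvent;

my $quit_program = AnyEvent->condvar;

# quit cleanly when somebody sends us SIGTERM
my $on_term = AnyEvent->signal (
   signal => "TERM",
   cb     => sub { $quit_program->send ("got SIGTERM") },
);

# for demonstration only: send the signal to ourselves shortly
my $self_kill = AnyEvent->timer (
   after => 0.1,
   cb    => sub { kill TERM => $$ },
);

# safety net, so the sketch can never hang forever
my $timeout = AnyEvent->timer (
   after => 5,
   cb    => sub { $quit_program->send ("timeout") },
);

# this is our "main loop": handle events until somebody calls send
my $reason = $quit_program->recv;

print "quitting: $reason\n";
```

Any number of callbacks can share the same $quit_program variable;
whichever calls "send" first decides why the program stops.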

Timers and other event sources
So far, we have used only I/O watchers. These are useful mainly to find
out whether a socket has data to read, or space to write more data. On
sane operating systems this also works for console windows/terminals
(typically on standard input), serial lines, all sorts of other
devices, basically almost everything that has a file descriptor but
isn't a file itself. (As usual, "sane" excludes Windows - on that
platform you would need different functions for all of these,
complicating code immensely - think "socket only" on Windows).

However, I/O is not everything - the second most important event source
is the clock. For example when doing an HTTP request you might want to
time out when the server doesn't answer within some predefined amount
of time.

In AnyEvent, timer event watchers are created by calling the
"AnyEvent->timer" method:

   use AnyEvent;

   my $cv = AnyEvent->condvar;

   my $wait_one_and_a_half_seconds = AnyEvent->timer (
      after => 1.5,  # after how many seconds to invoke the cb?
      cb    => sub { # the callback to invoke
         $cv->send;
      },
   );

   # can do something else here

   # now wait till our time has come
   $cv->recv;

Unlike I/O watchers, timers are only interested in the number of
seconds they have to wait. When (at least) that amount of time has
passed, AnyEvent will invoke your callback.

Unlike I/O watchers, which will call your callback as many times as
there is data available, timers are normally one-shot: after they have
"fired" once and invoked your callback, they are dead and no longer do
anything.

To get a repeating timer, such as a timer firing roughly once per
second, you can specify an "interval" parameter:

   my $once_per_second = AnyEvent->timer (
      after    => 0,    # first invoke ASAP
      interval => 1,    # then invoke every second
      cb       => sub { # the callback to invoke
         $cv->send;
      },
   );
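
A repeating timer keeps firing until its watcher is destroyed. The
following sketch (with a shorter interval than above, and a made-up
counter) lets the timer fire three times, then stops it from inside its
own callback and wakes up the main program:

```perl
use strict;
use warnings;
use AnyEvent;

my $cv    = AnyEvent->condvar;
my $count = 0;

# declared separately so the callback can refer to the watcher
my $ticker; $ticker = AnyEvent->timer (
   after    => 0,   # first invoke ASAP
   interval => 0.1, # then invoke every 0.1 seconds
   cb       => sub {
      $count++;

      if ($count == 3) {
         undef $ticker; # stop the repeating timer
         $cv->send;     # and let the main program continue
      }
   },
);

$cv->recv;

print "timer fired $count times\n";
```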

More esoteric sources

AnyEvent also has some other, more esoteric event sources you can tap
into: signal, child and idle watchers.

Signal watchers can be used to wait for "signal events", which means
your process was sent a signal (such as "SIGTERM" or "SIGUSR1").

Child-process watchers wait for a child process to exit. They are
useful when you fork a separate process and need to know when it exits,
but you do not want to wait for that by blocking.

Idle watchers invoke their callback when the event loop has handled all
outstanding events, polled for new events and didn't find any, i.e.,
when your process is otherwise idle. They are useful if you want to do
some non-trivial data processing that can be done when your program
doesn't have anything better to do.

All these watcher types are described in detail in the main AnyEvent
manual page.
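
As a taste of these, here is a sketch of a child watcher on a system
with a working "fork". The exit code 5 is arbitrary, and the condition
variable is created before forking so that AnyEvent is already
initialised when the child exits; treat this as an illustration under
those assumptions, and see the main manual page for the exact rules.

```perl
use strict;
use warnings;
use AnyEvent;
use POSIX ();

# create the condvar first, so AnyEvent is set up before we fork
my $done = AnyEvent->condvar;

my $pid = fork;
defined $pid
   or die "fork failed: $!";

if ($pid == 0) {
   # child process: exit immediately with an arbitrary exit code
   POSIX::_exit (5);
}

# parent process: get notified when the child exits, without blocking
my $child_watcher = AnyEvent->child (
   pid => $pid,
   cb  => sub {
      my ($pid, $status) = @_;

      # $status is in the same format as $?, so shift to get the code
      $done->send ($status >> 8);
   },
);

my $exit_code = $done->recv;

print "child exited with code $exit_code\n";
```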

Sometimes you also need to know what the current time is:
"AnyEvent->now" returns the time the event toolkit uses to schedule
relative timers, and is usually what you want. It is often cached
(which means it can be a bit outdated). In that case, you can use the
more costly "AnyEvent->time" method which will ask your operating
system for the current time, which is slower, but also more up to date.

So far you have seen how to register event watchers and handle events.

This is a great foundation to write network clients and servers, and
might be all that your module (or program) ever requires, but writing
your own I/O buffering again and again becomes tedious, not to mention
that it attracts errors.

While the core AnyEvent module is still small and self-contained, the
distribution comes with some very useful utility modules such as
AnyEvent::Handle, AnyEvent::DNS and AnyEvent::Socket. These can make
your life as a non-blocking network programmer a lot easier.

Here is a quick overview of these three modules:

AnyEvent::DNS
This module allows fully asynchronous DNS resolution. It is used mainly
by AnyEvent::Socket to resolve hostnames and service ports for you, but
is a great way to do other DNS resolution tasks, such as reverse
lookups of IP addresses for log files.

AnyEvent::Handle
This module handles non-blocking IO on (socket-, pipe- etc.) file
handles in an event based manner. It provides a wrapper object around
your file handle that provides queueing and buffering of incoming and
outgoing data for you.

It also implements the most common data formats, such as text lines, or
fixed and variable-width data blocks.

AnyEvent::Socket
This module provides you with functions that handle socket creation and
IP address magic. The two main functions are "tcp_connect" and
"tcp_server". The former will connect a (streaming) socket to an
internet host for you and the latter will make a server socket for you,
to accept connections.

This module also comes with transparent IPv6 support, this means: If
you write your programs with this module, you will be IPv6 ready
without doing anything special.

It also works around a lot of portability quirks (especially on the
Windows platform), which makes it even easier to write your programs in
a portable way (did you know that Windows uses different error codes
for all socket functions and that Perl does not know about these? That
"Unknown error 10022" (which is "WSAEINVAL") can mean that our
"connect" call was successful? That unsuccessful TCP connects might
never be reported back to your program? That "WSAEINPROGRESS" means
your "connect" call was ignored instead of being in progress?
AnyEvent::Socket works around all of these Windows/Perl bugs for you).
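
To give a first impression of the two main functions, here is a sketch
that runs a tiny server and a client in the same process, talking over
the loopback interface. It assumes that passing "undef" as the service
lets the operating system pick a free port, which the prepare callback
then reports; the greeting text and variable names are made up.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Socket;

my $done = AnyEvent->condvar;
my $port;

# server: send a greeting to every client, then close the connection
my $server_guard = tcp_server "127.0.0.1", undef, sub {
   my ($fh, $peer_host, $peer_port) = @_;

   syswrite $fh, "hello\015\012";
   # $fh goes out of scope here, which closes the connection
}, sub {
   my ($listen_fh, $this_host, $this_port) = @_;

   $port = $this_port; # remember which port the OS chose
   0                   # use the default listen queue length
};

# client: connect, then read until the server closes the connection
my $reply = "";

tcp_connect "127.0.0.1", $port, sub {
   my ($fh) = @_
      or return $done->send;

   my $watcher; $watcher = AnyEvent->io (
      fh   => $fh,
      poll => "r",
      cb   => sub {
         my $len = sysread $fh, $reply, 1024, length $reply;

         if (!defined $len || $len == 0) {
            undef $watcher; # end-of-file (or error): stop reading
            $done->send ($reply);
         }
      },
   );
};

my $got = $done->recv;

print "server said: $got";
```

The server keeps running for as long as $server_guard exists, so
forgetting that variable is how you would shut it down again.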

Implementing a parallel finger client with non-blocking connects and
AnyEvent::Socket
The finger protocol is one of the simplest protocols in use on the
internet. Or in use in the past, as almost nobody uses it anymore.

It works by connecting to the finger port on another host, writing a
single line with a user name and then reading the finger response, as
specified by that user. OK, RFC 1288 specifies a vastly more complex
protocol, but it basically boils down to this:

   # telnet freebsd.org finger
   Trying 8.8.178.135...
   Connected to freebsd.org (8.8.178.135).
   Escape character is '^]'.
   larry
   Login: lile                             Name: Larry Lile
   Directory: /home/lile                   Shell: /usr/local/bin/bash
   No Mail.
   Mail forwarded to: lile@stdio.com
   No Plan.

So let's write a little AnyEvent function that makes a finger request:

   use AnyEvent;
   use AnyEvent::Socket;

   sub finger($$) {
      my ($user, $host) = @_;

      # use a condvar to return results
      my $cv = AnyEvent->condvar;

      # first, connect to the host
      tcp_connect $host, "finger", sub {
         # the callback receives the socket handle - or nothing
         my ($fh) = @_
            or return $cv->send;

         # now write the username
         syswrite $fh, "$user\015\012";

         my $response;

         # register a read watcher
         my $read_watcher; $read_watcher = AnyEvent->io (
            fh   => $fh,
            poll => "r",
            cb   => sub {
               my $len = sysread $fh, $response, 1024, length $response;

               if ($len <= 0) {
                  # we are done, or an error occurred, let's ignore the latter
                  undef $read_watcher; # no longer interested
                  $cv->send ($response); # send results
               }
            },
         );
      };

      # pass $cv to the caller
      $cv
   }

That's a mouthful! Let's dissect this function a bit, first the overall
function and execution flow:

   sub finger($$) {
      my ($user, $host) = @_;

      # use a condvar to return results
      my $cv = AnyEvent->condvar;

      # first, connect to the host
      tcp_connect $host, "finger", sub {
         ...
      };

      $cv
   }

This isn't too complicated, just a function with two parameters that
creates a condition variable $cv, initiates a TCP connect to $host, and
returns $cv. The caller can use the returned $cv to receive the finger
response, but one could equally well pass a third argument, a callback,
to the function.

Since we are programming event'ish, we do not wait for the connect to
finish - it could block the program for a minute or longer!

Instead, we pass "tcp_connect" a callback to invoke when the connect is
done. The callback is called with the socket handle as its first
argument if the connect succeeds, and no arguments otherwise. The
important point is that it will always be called as soon as the outcome
of the TCP connect is known.

This style of programming is also called "continuation style": the
"continuation" is simply the way the program continues - normally at
the next line after some statement (the exception is loops or things
like "return"). When we are interested in events, however, we instead
specify the "continuation" of our program by passing a closure, which
makes that closure the "continuation" of the program.

The "tcp_connect" call is like saying "return now, and when the
connection is established or the attempt failed, continue there".

Now let's look at the callback/closure in more detail:

   # the callback receives the socket handle - or nothing
   my ($fh) = @_
      or return $cv->send;

The first thing the callback does is to save the socket handle in $fh.
When there was an error (no arguments), then our instinct as expert
Perl programmers would tell us to "die":

   my ($fh) = @_
      or die "$host: $!";

While this would give good feedback to the user (if he happens to watch
standard error), our program would probably stop working here, as we
never report the results to anybody, certainly not the caller of our
"finger" function, and most event loops continue even after a "die"!

This is why we instead "return", but also call "$cv->send" without any
arguments to signal to the condvar consumer that something bad has
happened. The return value of "$cv->send" is irrelevant, as is the
return value of our callback. The "return" statement is used for the
side effect of, well, returning immediately from the callback.
Checking for errors and handling them this way is very common, which is
why this compact idiom is so handy.

As the next step in the finger protocol, we send the username to the
finger daemon on the other side of our connection (the kernel.org
finger service doesn't actually wait for a username, but the net is
running out of finger servers fast):

   syswrite $fh, "$user\015\012";

Note that this isn't 100% clean socket programming - the socket could,
for whatever reasons, not accept our data. When writing a small amount
of data like in this example it doesn't matter, as a socket buffer is
almost always big enough for a mere "username", but for real-world
cases you might need to implement some kind of write buffering - or use
AnyEvent::Handle, which handles these matters for you, as shown in the
next section.

What we do have to do is implement our own read buffer - the response
data could arrive late or in multiple chunks, and we cannot just wait
for it (event-based programming, you know?).

To do that, we register a read watcher on the socket which waits for
data:

   my $read_watcher; $read_watcher = AnyEvent->io (
      fh   => $fh,
      poll => "r",

There is a trick here, however: the read watcher isn't stored in a
global variable, but in a local one - if the callback returns, it would
normally destroy the variable and its contents, which would in turn
unregister our watcher.

To avoid that, we refer to the watcher variable in the watcher
callback. This means that, when the "tcp_connect" callback returns,
perl thinks (quite correctly) that the read watcher is still in use -
namely inside the inner callback - and thus keeps it alive even if
nothing else in the program refers to it anymore (it is much like Baron
Münchhausen keeping himself from dying by pulling himself out of a
swamp).

The trick, however, is that instead of:

   my $read_watcher = AnyEvent->io (...

The program does:

   my $read_watcher; $read_watcher = AnyEvent->io (...

The reason for this is a quirk in the way Perl works: variable names
declared with "my" are only visible in the next statement. If the whole
"AnyEvent->io" call, including the callback, were done in a single
statement, the callback could not refer to the $read_watcher variable
to "undef"ine it, so it is done in two statements.
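
You can convince yourself of this quirk with plain Perl, no event loop
required. The sketch below compiles both variants with a string "eval"
so that the failing one doesn't abort the whole program; the variable
names are made up for this demonstration.

```perl
use strict;
use warnings;

# two statements: $w is already declared when the closure is compiled,
# so the closure can see (and later undef) its own variable
my $two_statements = eval q{
   my $w; $w = sub { defined $w ? "sees itself" : "sees nothing" };
   $w->()
};

# one statement: inside the closure, the new $v is not yet visible,
# so under "use strict" this does not even compile
my $one_statement = eval q{
   my $v = sub { $v };
   1
};
my $error = $@;

print "two statements: $two_statements\n";
print "one statement failed to compile\n" if $error;
```

Without "use strict", the one-statement variant would compile, but the
closure would silently refer to an unrelated global variable, which is
arguably worse.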

Whether you'd want to format it like this is of course a matter of
style. This way emphasizes that the declaration and assignment really
are one logical statement.

The callback itself calls "sysread" as many times as necessary, until
"sysread" returns either an error or end-of-file:

   cb => sub {
      my $len = sysread $fh, $response, 1024, length $response;

      if ($len <= 0) {

Note that "sysread" has the ability to append data it reads to a scalar
if we specify an offset, a feature which we make use of in this
example.

When "sysread" indicates we are done, the callback "undef"ines the
watcher and then "send"s the response data to the condition variable.
All this has the following effects:

Undefining the watcher destroys it, as our callback was the only one
still having a reference to it. When the watcher gets destroyed, it
destroys the callback, which in turn means the $fh handle is no longer
used, so that gets destroyed as well. The result is that all resources
will be nicely cleaned up by perl for us.

720
721 Using the finger client
722
723 Now, we could probably write the same finger client in a simpler way if
724 we used "IO::Socket::INET", ignored the problem of multiple hosts and
725 ignored IPv6 and a few other things that "tcp_connect" handles for us.
726
727 But the main advantage is that we can not only run this finger function
728 in the background, we even can run multiple sessions in parallel, like
729 this:
730
731 my $f1 = finger "kuriyama", "freebsd.org";
732 my $f2 = finger "icculus?listarchives=1", "icculus.org";
733 my $f3 = finger "mikachu", "icculus.org";
734
735 print "kuriyama's gpg key\n" , $f1->recv, "\n";
736 print "icculus' plan archive\n" , $f2->recv, "\n";
737 print "mikachu's plan zomgn\n" , $f3->recv, "\n";
738
739 It doesn't look like it, but in fact all three requests run in
740 parallel. The code waits for the first finger request to finish first,
741 but that doesn't keep it from executing them parallel: when the first
742 "recv" call sees that the data isn't ready yet, it serves events for
743 all three requests automatically, until the first request has finished.
744
745 The second "recv" call might either find the data is already there, or
746 it will continue handling events until that is the case, and so on.
747
748 By taking advantage of network latencies, which allows us to serve
749 other requests and events while we wait for an event on one socket, the
750 overall time to do these three requests will be greatly reduced,
751 typically all three are done in the same time as the slowest of the
752 three requests.
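
Since working finger servers are hard to come by, here is a sketch that
shows the same effect without any network. The hypothetical
"slow_task" function stands in for "finger": it returns a condition
variable immediately and fulfils it after the given number of seconds.

```perl
use strict;
use warnings;
use AnyEvent;

# hypothetical stand-in for a network request that takes $seconds
sub slow_task {
   my ($seconds) = @_;

   my $cv = AnyEvent->condvar;

   my $timer; $timer = AnyEvent->timer (
      after => $seconds,
      cb    => sub {
         undef $timer; # no longer needed
         $cv->send ("finished after ${seconds}s");
      },
   );

   $cv
}

my $start = AnyEvent->time;

# start all three "requests" - none of these calls blocks
my @tasks = map { slow_task ($_) } 0.2, 0.2, 0.2;

# wait for each result in turn; events for all tasks are served meanwhile
my @results = map { $_->recv } @tasks;

my $elapsed = AnyEvent->time - $start;

printf "%d tasks done after %.2f seconds\n", scalar @results, $elapsed;
```

Run one after the other, the three tasks would need about 0.6 seconds;
run this way, the total should be close to 0.2 seconds, the duration of
the slowest one.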

By the way, you do not actually have to wait in the "recv" method on an
AnyEvent condition variable - after all, waiting is evil - you can also
register a callback:

   $f1->cb (sub {
      my $response = shift->recv;
      # ...
   });

The callback will be invoked only when "send" is called. In fact,
instead of returning a condition variable you could also pass a third
parameter to your finger function, the callback to invoke with the
response:

   sub finger($$$) {
      my ($user, $host, $cb) = @_;

How you implement it is a matter of taste - if you expect your function
to be used mainly in an event-based program you would normally prefer
to pass a callback directly. If you write a module and expect your
users to use it "synchronously" often (for example, a simple http-get
script would not really care much for events), then you would use a
condition variable and tell them "simply "->recv" the data".
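
If you cannot decide, one possible design is to support both styles at
once. The sketch below is not the finger function itself but a
hypothetical "delayed_answer" helper that uses a timer instead of the
network: it always returns a condition variable, and additionally
invokes a callback if one was passed.

```perl
use strict;
use warnings;
use AnyEvent;

# hypothetical helper: delivers $value after a short delay, either via
# the returned condvar or via the optional callback $cb
sub delayed_answer {
   my ($value, $cb) = @_;

   my $cv = AnyEvent->condvar;

   # the condvar callback receives the condvar itself, so unwrap it
   $cv->cb (sub { $cb->(shift->recv) }) if $cb;

   my $timer; $timer = AnyEvent->timer (
      after => 0.1,
      cb    => sub {
         undef $timer;
         $cv->send ($value);
      },
   );

   $cv
}

# "synchronous" use: simply ->recv the data
my $sync = delayed_answer (21)->recv;

# event-based use: pass a callback and carry on
my $all_done = AnyEvent->condvar;
my $async;

delayed_answer (42, sub {
   $async = shift;
   $all_done->send;
});

$all_done->recv;

print "sync: $sync, async: $async\n";
```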
777
778 Problems with the implementation and how to fix them
779
780 To make this example more real-world-ready, we would not only implement
781 some write buffering (for the paranoid, or maybe denial-of-service
782 aware security expert), but we would also have to handle timeouts and
783 maybe protocol errors.
784
785 Doing this quickly gets unwieldy, which is why we introduce
786 AnyEvent::Handle in the next section, which takes care of all these
787 details for you and lets you concentrate on the actual protocol.
788
789 Implementing simple HTTP and HTTPS GET requests with AnyEvent::Handle
790 The AnyEvent::Handle module has been hyped quite a bit in this document
791 so far, so let's see what it really offers.
792
793 As finger is such a simple protocol, let's try something slightly more
794 complicated: HTTP/1.0.
795
An HTTP GET request works by sending a single request line that
indicates what you want the server to do and the URI you want it to act
on, followed by as many "header" lines ("Header: data", same as e-mail
headers) as required for the request, followed by an empty line.
800
801 The response is formatted very similarly, first a line with the
802 response status, then again as many header lines as required, then an
803 empty line, followed by any data that the server might send.
804
805 Again, let's try it out with "telnet" (I condensed the output a bit -
806 if you want to see the full response, do it yourself).
807
808 # telnet www.google.com 80
809 Trying 209.85.135.99...
810 Connected to www.google.com (209.85.135.99).
811 Escape character is '^]'.
812 GET /test HTTP/1.0
813
814 HTTP/1.0 404 Not Found
815 Date: Mon, 02 Jun 2008 07:05:54 GMT
816 Content-Type: text/html; charset=UTF-8
817
818 <html><head>
819 [...]
820 Connection closed by foreign host.
821
The "GET ..." line and the empty line were entered manually; the rest of
the telnet output is Google's response, in this case a "404 Not Found"
one.
824
825 So, here is how you would do it with "AnyEvent::Handle":
826
   sub http_get {
      my ($host, $uri, $cb) = @_;

      # store results here
      my ($response, $header, $body);

      my $handle; $handle = new AnyEvent::Handle
         connect  => [$host => 'http'],
         on_error => sub {
            $cb->("HTTP/1.0 500 $!");
            $handle->destroy; # explicitly destroy handle
         },
         on_eof   => sub {
            $cb->($response, $header, $body);
            $handle->destroy; # explicitly destroy handle
         };

      $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");

      # now fetch response status line
      $handle->push_read (line => sub {
         my ($handle, $line) = @_;
         $response = $line;
      });

      # then the headers
      $handle->push_read (line => "\015\012\015\012", sub {
         my ($handle, $line) = @_;
         $header = $line;
      });

      # and finally handle any remaining data as body
      $handle->on_read (sub {
         $body .= $_[0]->rbuf;
         $_[0]->rbuf = "";
      });
   }
864
865 And now let's go through it step by step. First, as usual, the overall
866 "http_get" function structure:
867
868 sub http_get {
869 my ($host, $uri, $cb) = @_;
870
871 # store results here
872 my ($response, $header, $body);
873
874 my $handle; $handle = new AnyEvent::Handle
875 ... create handle object
876
877 ... push data to write
878
879 ... push what to expect to read queue
880 }
881
882 Unlike in the finger example, this time the caller has to pass a
883 callback to "http_get". Also, instead of passing a URL as one would
884 expect, the caller has to provide the hostname and URI - normally you
885 would use the "URI" module to parse a URL and separate it into those
886 parts, but that is left to the inspired reader :)
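
For readers who do not feel inspired right now, here is a minimal
sketch of such a split using a plain regex instead of the "URI" module.
The name "split_url" is made up for this example, and the pattern only
covers simple http and https URLs (no user info, no IPv6 literals), so
treat it as an illustration rather than a real URL parser:

```perl
use strict;
use warnings;

# Hypothetical helper: split a simple http/https URL into the pieces
# that http_get expects. Not a full URL parser.
sub split_url {
    my ($url) = @_;

    my ($scheme, $host, $port, $path) =
        $url =~ m{^(https?)://([^/:?#]+)(?::(\d+))?([^#]*)}i
        or return;

    $path = "/$path" unless $path =~ m{^/};   # also covers the empty path

    # fall back to the service name when no explicit port is given
    return (lc $scheme, $host, $port || lc $scheme, $path);
}

my ($scheme, $host, $service, $uri) = split_url ("http://www.google.com/test?q=1");
print "$scheme $host $service $uri\n";
```

The host and URI parts can then be passed on to "http_get" unchanged.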
887
888 Since everything else is left to the caller, all "http_get" does is
889 initiate the connection by creating the AnyEvent::Handle object (which
890 calls "tcp_connect" for us) and leave everything else to its callback.
891
892 The handle object is created, unsurprisingly, by calling the "new"
893 method of AnyEvent::Handle:
894
895 my $handle; $handle = new AnyEvent::Handle
896 connect => [$host => 'http'],
897 on_error => sub {
898 $cb->("HTTP/1.0 500 $!");
899 $handle->destroy; # explicitly destroy handle
900 },
901 on_eof => sub {
902 $cb->($response, $header, $body);
903 $handle->destroy; # explicitly destroy handle
904 };
905
906 The "connect" argument tells AnyEvent::Handle to call "tcp_connect" for
907 the specified host and service/port.
908
909 The "on_error" callback will be called on any unexpected error, such as
910 a refused connection, or unexpected end-of-file while reading headers.
911
912 Instead of having an extra mechanism to signal errors, connection
913 errors are signalled by crafting a special "response status line", like
914 this:
915
916 HTTP/1.0 500 Connection refused
917
918 This means the caller cannot distinguish (easily) between locally-
919 generated errors and server errors, but it simplifies error handling
920 for the caller a lot.
921
922 The error callback also destroys the handle explicitly, because we are
923 not interested in continuing after any errors. In AnyEvent::Handle
924 callbacks you have to call "destroy" explicitly to destroy a handle.
925 Outside of those callbacks you can just forget the object reference and
926 it will be automatically cleaned up.
927
928 Last but not least, we set an "on_eof" callback that is called when the
929 other side indicates it has stopped writing data, which we will use to
930 gracefully shut down the handle and report the results. This callback
931 is only called when the read queue is empty - if the read queue expects
932 some data and the handle gets an EOF from the other side this will be
933 an error - after all, you did expect more to come.
934
935 If you wanted to write a server using AnyEvent::Handle, you would use
936 "tcp_accept" and then create the AnyEvent::Handle with the "fh"
937 argument.
938
939 The write queue
940
941 The next line sends the actual request:
942
943 $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");
944
945 No headers will be sent (this is fine for simple requests), so the
946 whole request is just a single line followed by an empty line to signal
947 the end of the headers to the server.
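
If you do want to send headers, the request string simply grows by one
line per header before the empty line. The following is only a sketch:
"build_get_request" is a name invented for this example, and the "Host"
header is just a sample value.

```perl
use strict;
use warnings;

# Hypothetical helper: build an HTTP/1.0 GET request with optional headers.
sub build_get_request {
    my ($uri, @headers) = @_;    # @headers is a flat list: name, value, ...

    my $request = "GET $uri HTTP/1.0\015\012";

    # append one "Name: value" line per header pair
    while (my ($name, $value) = splice @headers, 0, 2) {
        $request .= "$name: $value\015\012";
    }

    # the empty line ends the header section
    return $request . "\015\012";
}

my $request = build_get_request ("/test", Host => "www.example.com");
print $request;
```

The resulting string would be handed to "push_write" exactly like the
hard-coded request above.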
948
949 The more interesting question is why the method is called "push_write"
950 and not just write. The reason is that you can always add some write
951 data without blocking, and to do this, AnyEvent::Handle needs some
952 write queue internally - and "push_write" pushes some data onto the end
953 of that queue, just like Perl's "push" pushes data onto the end of an
954 array.
955
956 The deeper reason is that at some point in the future, there might be
957 "unshift_write" as well, and in any case, we will shortly meet
958 "push_read" and "unshift_read", and it's usually easiest to remember if
959 all those functions have some symmetry in their name. So "push" is used
960 as the opposite of "unshift" in AnyEvent::Handle, not as the opposite
961 of "pull" - just like in Perl.
962
Note that we call "push_write" right after creating the
AnyEvent::Handle object, before it has had time to actually connect to
the server. This is fine: pushing the read and write requests will
queue them in the handle object until the connection has been
established. Alternatively, we could do this "on demand" in the
"on_connect" callback.
969
970 If "push_write" is called with more than one argument, then you can do
971 formatted I/O. For example, this would JSON-encode your data before
972 pushing it to the write queue:
973
974 $handle->push_write (json => [1, 2, 3]);
975
This pretty much summarises the write queue; there is little else to
it.

Reading the response is far more interesting, because it involves the
more powerful and complex read queue.
981
982 The read queue
983
The response consists of three parts: a single line with the response
status, a single paragraph of headers ended by an empty line, and the
response body, which is the remaining data on the connection.
987
988 For the first two, we push two read requests onto the read queue:
989
990 # now fetch response status line
991 $handle->push_read (line => sub {
992 my ($handle, $line) = @_;
993 $response = $line;
994 });
995
996 # then the headers
997 $handle->push_read (line => "\015\012\015\012", sub {
998 my ($handle, $line) = @_;
999 $header = $line;
1000 });
1001
1002 While one can just push a single callback to parse all the data on the
1003 queue, formatted I/O really comes to our aid here, since there is a
1004 ready-made "read line" read type. The first read expects a single line,
1005 ended by "\015\012" (the standard end-of-line marker in internet
1006 protocols).
1007
1008 The second "line" is actually a single paragraph - instead of reading
1009 it line by line we tell "push_read" that the end-of-line marker is
1010 really "\015\012\015\012", which is an empty line. The result is that
1011 the whole header paragraph will be treated as a single line and read.
1012 The word "line" is interpreted very freely, much like Perl itself does
1013 it.
1014
Note that the read requests are pushed immediately after creating the
handle object - since AnyEvent::Handle provides a queue, we can push as
many requests as we want, and AnyEvent::Handle will handle them in
order.
1019
1020 There is, however, no read type for "the remaining data". For that, we
1021 install our own "on_read" callback:
1022
1023 # and finally handle any remaining data as body
1024 $handle->on_read (sub {
1025 $body .= $_[0]->rbuf;
1026 $_[0]->rbuf = "";
1027 });
1028
1029 This callback is invoked every time data arrives and the read queue is
1030 empty - which in this example will only be the case when both response
1031 and header have been read. The "on_read" callback could actually have
1032 been specified when constructing the object, but doing it this way
1033 preserves logical ordering.
1034
1035 The read callback adds the current read buffer to its $body variable
1036 and, most importantly, empties the buffer by assigning the empty string
1037 to it.
1038
1039 Given these instructions, AnyEvent::Handle will handle incoming data -
1040 if all goes well, the callback will be invoked with the response data;
1041 if not, it will get an error.
1042
1043 In general, you can implement pipelining (a semi-advanced feature of
1044 many protocols) very easily with AnyEvent::Handle: If you have a
1045 protocol with a request/response structure, your request
1046 methods/functions will all look like this (simplified):
1047
1048 sub request {
1049
1050 # send the request to the server
1051 $handle->push_write (...);
1052
1053 # push some response handlers
1054 $handle->push_read (...);
1055 }
1056
1057 This means you can queue as many requests as you want, and while
1058 AnyEvent::Handle goes through its read queue to handle the response
1059 data, the other side can work on the next request - queueing the
1060 request just appends some data to the write queue and installs a
1061 handler to be called later.
1062
1063 You might ask yourself how to handle decisions you can only make after
1064 you have received some data (such as handling a short error response or
1065 a long and differently-formatted response). The answer to this problem
1066 is "unshift_read", which we will introduce together with an example in
1067 the coming sections.
1068
1069 Using "http_get"
1070
1071 Finally, here is how you would use "http_get":
1072
   http_get "www.google.com", "/", sub {
      my ($response, $header, $body) = @_;

      print
         $response, "\n",
         $body;
   };
1080
1081 And of course, you can run as many of these requests in parallel as you
1082 want (and your memory supports).
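
The $response and $header values passed to the callback are raw text.
If you want a status code and individual headers, you have to split
them yourself. Here is a minimal sketch: "parse_response_head" is a
name made up for this example, and it ignores folded (continued) header
lines. The sample input is taken from the telnet session above.

```perl
use strict;
use warnings;

# Hypothetical helper: split the raw status line and header paragraph
# into a status code, a reason phrase and a hash of headers.
sub parse_response_head {
    my ($response, $header) = @_;

    my ($code, $reason) = $response =~ m{^HTTP/\d+\.\d+ \s+ (\d{3}) \s* (.*)}x
        or return;

    my %headers;
    # $header is undef for locally generated errors, so default to ""
    for my $line (split /\015?\012/, $header // "") {
        my ($name, $value) = $line =~ /^([^:\s]+)\s*:\s*(.*)/
            or next;                     # skip anything that is not "Name: value"
        $headers{lc $name} = $value;     # header names are case-insensitive
    }

    return ($code, $reason, \%headers);
}

my ($code, $reason, $headers) = parse_response_head (
    "HTTP/1.0 404 Not Found",
    "Date: Mon, 02 Jun 2008 07:05:54 GMT\015\012Content-Type: text/html; charset=UTF-8",
);
print "$code $reason ($headers->{'content-type'})\n";
```

Because locally generated errors are reported as a crafted status line
without headers, the same helper handles them too.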
1083
1084 HTTPS
1085
1086 Now, as promised, let's implement the same thing for HTTPS, or more
1087 correctly, let's change our "http_get" function into a function that
1088 speaks HTTPS instead.
1089
1090 HTTPS is a standard TLS connection (Transport Layer Security is the
1091 official name for what most people refer to as "SSL") that contains
1092 standard HTTP protocol exchanges. The only other difference to HTTP is
1093 that by default it uses port 443 instead of port 80.
1094
To implement these two differences, we need two tiny changes. First, in
the "connect" parameter, we replace "http" by "https" to connect to the
https port:
1098
1099 connect => [$host => 'https'],
1100
1101 The other change deals with TLS, which is something AnyEvent::Handle
1102 does for us if the Net::SSLeay module is available. To enable TLS with
1103 AnyEvent::Handle, we pass an additional "tls" parameter to the call to
1104 "AnyEvent::Handle::new":
1105
1106 tls => "connect",
1107
1108 Specifying "tls" enables TLS, and the argument specifies whether
1109 AnyEvent::Handle is the server side ("accept") or the client side
1110 ("connect") for the TLS connection, as unlike TCP, there is a clear
1111 server/client relationship in TLS.
1112
1113 That's all.
1114
1115 Of course, all this should be handled transparently by "http_get" after
1116 parsing the URL. If you need this, see the part about exercising your
1117 inspiration earlier in this document. You could also use the
1118 AnyEvent::HTTP module from CPAN, which implements all this and works
1119 around a lot of quirks for you too.
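
As a rough idea of what "handled transparently" could look like, the two
changes can be derived from the URL scheme. This is only a sketch, and
"connect_args" is a name invented for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: choose the connection-related constructor arguments
# for AnyEvent::Handle from the URL scheme ("http" or "https").
sub connect_args {
    my ($scheme, $host) = @_;

    # the scheme doubles as the service name for the connect parameter
    my @args = (connect => [$host => $scheme]);

    # for https we are the client side of the TLS connection
    push @args, tls => "connect" if $scheme eq "https";

    return @args;
}

my %plain  = connect_args ("http",  "www.google.com");
my %secure = connect_args ("https", "www.google.com");

print join (", ", sort keys %secure), "\n";
```

The returned list would be passed to the AnyEvent::Handle constructor
together with the "on_error" and "on_eof" callbacks shown earlier.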
1120
1121 The read queue - revisited
1122
1123 HTTP always uses the same structure in its responses, but many
1124 protocols require parsing responses differently depending on the
1125 response itself.
1126
1127 For example, in SMTP, you normally get a single response line:
1128
1129 220 mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
1130
1131 But SMTP also supports multi-line responses:
1132
1133 220-mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
1134 220-hey guys
1135 220 my response is longer than yours
1136
1137 To handle this, we need "unshift_read". As the name (we hope) implies,
1138 "unshift_read" will not append your read request to the end of the read
1139 queue, but will prepend it to the queue instead.
1140
1141 This is useful in the situation above: Just push your response-line
1142 read request when sending the SMTP command, and when handling it, you
1143 look at the line to see if more is to come, and "unshift_read" another
1144 reader callback if required, like this:
1145
   my $response; # response lines end up in here

   my $read_response; $read_response = sub {
      my ($handle, $line) = @_;

      $response .= "$line\n";

      # check for continuation lines ("-" as 4th character)
      if ($line =~ /^...-/) {
         # if yes, then unshift another line read
         $handle->unshift_read (line => $read_response);

      } else {
         # otherwise we are done

         # free callback
         undef $read_response;

         print "we are done reading: $response\n";
      }
   };

   $handle->push_read (line => $read_response);
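
To see why the extra reader has to go to the front of the queue and not
to the end, it helps to play through the ordering with two pipelined
commands. The following toy model uses a plain array as the "read
queue". It is not how AnyEvent::Handle is implemented, and the names
"push_reader", "unshift_reader" and "make_reply_reader" as well as the
sample reply lines are invented for this illustration:

```perl
use strict;
use warnings;

# Toy model of a read queue: an array of callbacks, each consuming one line.
my @queue;
my @log;

sub push_reader    { push    @queue, $_[0] }
sub unshift_reader { unshift @queue, $_[0] }

# Reader for an SMTP-style reply: re-queues itself at the FRONT of the
# queue as long as it sees continuation lines ("-" as 4th character).
sub make_reply_reader {
    my ($name) = @_;
    my $reader;
    $reader = sub {
        my ($line) = @_;
        push @log, "$name: $line";
        unshift_reader ($reader) if $line =~ /^...-/;
    };
    return $reader;
}

# Two pipelined commands, so two readers are queued before any data arrives.
push_reader (make_reply_reader ("first"));
push_reader (make_reply_reader ("second"));

# Feed four incoming lines: a three-line reply followed by a one-line reply.
for my $line ("250-mail.example.net", "250-SIZE 1000", "250 OK", "221 bye") {
    my $reader = shift @queue or last;
    $reader->($line);
}

print "$_\n" for @log;
```

All three lines of the first reply end up with the first reader. Had the
reader been pushed to the end of the queue instead, the second line of
the first reply would have been handed to the reader of the second
command.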
1169
1170 This recipe can be used for all similar parsing problems, for example
1171 in NNTP, the response code to some commands indicates that more data
1172 will be sent:
1173
   $handle->push_write ("article 42\015\012");

   # read response line
   $handle->push_read (line => sub {
      my ($handle, $status) = @_;

      # article data following?
      if ($status =~ /^2/) {
         # yes, read article body

         $handle->unshift_read (line => "\012.\015\012", sub {
            my ($handle, $body) = @_;

            $finish->($status, $body);
         });

      } else {
         # some error occurred, no article data

         $finish->($status);
      }
   });
1196
1197 Your own read queue handler
1198
1199 Sometimes your protocol doesn't play nice, and uses lines or chunks of
1200 data not formatted in a way handled out of the box by AnyEvent::Handle.
1201 In this case you have to implement your own read parser.
1202
1203 To make up a contorted example, imagine you are looking for an even
1204 number of characters followed by a colon (":"). Also imagine that
1205 AnyEvent::Handle has no "regex" read type which could be used, so you'd
1206 have to do it manually.
1207
1208 To implement a read handler for this, you would "push_read" (or
1209 "unshift_read") a single code reference.
1210
1211 This code reference will then be called each time there is (new) data
1212 available in the read buffer, and is expected to either successfully
1213 eat/consume some of that data (and return true) or to return false to
1214 indicate that it wants to be called again.
1215
1216 If the code reference returns true, then it will be removed from the
1217 read queue (because it has parsed/consumed whatever it was supposed to
1218 consume), otherwise it stays in the front of it.
1219
1220 The example above could be coded like this:
1221
   $handle->push_read (sub {
      my ($handle) = @_;

      # check for even number of characters + ":"
      # and remove the data if a match is found.
      # if not, return false (actually nothing)

      $handle->{rbuf} =~ s/^( (?:..)* ) ://x
         or return;

      # we got some data in $1, pass it to whoever wants it
      $finish->($1);

      # and return true to indicate we are done
      1
   });
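
The "return false until enough data is there" behaviour is easiest to
see when the data arrives in pieces. The sketch below applies the same
substitution to a plain string that stands in for the read buffer. No
AnyEvent::Handle is involved, and the chunk boundaries and the name
"try_parse" are made up for the example:

```perl
use strict;
use warnings;

my $rbuf = "";      # stands in for $handle->{rbuf}
my @found;          # stands in for whatever $finish would do with the data

# Same test as the read callback above, applied to a plain string.
sub try_parse {
    $rbuf =~ s/^( (?:..)* ) ://x
        or return;       # no match yet, ask to be called again
    push @found, $1;
    1                    # done, the reader would now leave the queue
}

# Simulate the data arriving in three chunks.
my @results;
for my $chunk ("ab", "cd", ":rest") {
    $rbuf .= $chunk;
    push @results, try_parse () ? 1 : 0;
}

print "results: @results, found: @found, left in buffer: $rbuf\n";
```

The first two calls return false because the colon has not arrived yet.
The third call consumes the four characters and the colon, and leaves
the remaining data in the buffer for the next reader in the queue.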
1238
1240 Now that you have seen how to use AnyEvent, here's what to use when you
1241 don't use it correctly, or simply hit a bug somewhere and want to debug
1242 it:
1243
1244 Enable strict argument checking during development
1245 AnyEvent does not, by default, do any argument checking. This can
1246 lead to strange and unexpected results especially if you are just
1247 trying to find your way with AnyEvent.
1248
AnyEvent supports a special "strict" mode - off by default - which
does very strict argument checking, at the expense of slowing down
your program. During development, however, this mode is very useful
because it quickly catches the most common errors.
1253
1254 You can enable this strict mode either by having an environment
1255 variable "AE_STRICT" with a true value in your environment:
1256
1257 AE_STRICT=1 perl myprog
1258
1259 Or you can write "use AnyEvent::Strict" in your program, which has
1260 the same effect (do not do this in production, however).
1261
1262 Increase verbosity, configure logging
1263 AnyEvent, by default, only logs critical messages. If something
1264 doesn't work, maybe there was a warning about it that you didn't
1265 see because it was suppressed.
1266
1267 So during development it is recommended to push up the logging
1268 level to at least warn level (5):
1269
1270 AE_VERBOSE=5 perl myprog
1271
1272 Other levels that might be helpful are debug (8) or even trace (9).
1273
1274 AnyEvent's logging is quite versatile - the AnyEvent::Log manpage
1275 has all the details.
1276
1277 Watcher wrapping, tracing, the shell
1278 For even more debugging, you can enable watcher wrapping:
1279
1280 AE_DEBUG_WRAP=2 perl myprog
1281
1282 This will have the effect of wrapping every watcher into a special
1283 object that stores a backtrace of when it was created, stores a
1284 backtrace when an exception occurs during watcher execution, and
1285 stores a lot of other information. If that slows down your program
1286 too much, then "AE_DEBUG_WRAP=1" avoids the costly backtraces.
1287
Here is an example of what kind of information is stored:
1289
1290 59148536 DC::DB:472(Server::run)>io>DC::DB::Server::fh_read
1291 type: io watcher
1292 args: poll r fh GLOB(0x35283f0)
1293 created: 2011-09-01 23:13:46.597336 +0200 (1314911626.59734)
1294 file: ./blib/lib/Deliantra/Client/private/DC/DB.pm
1295 line: 472
1296 subname: DC::DB::Server::run
1297 context:
1298 tracing: enabled
1299 cb: CODE(0x2d1fb98) (DC::DB::Server::fh_read)
1300 invoked: 0 times
1301 created
1302 (eval 25) line 6 AnyEvent::Debug::Wrap::__ANON__('AnyEvent','fh',GLOB(0x35283f0),'poll','r','cb',CODE(0x2d1fb98)=DC::DB::Server::fh_read)
1303 DC::DB line 472 AE::io(GLOB(0x35283f0),'0',CODE(0x2d1fb98)=DC::DB::Server::fh_read)
1304 bin/deliantra line 2776 DC::DB::Server::run()
1305 bin/deliantra line 2941 main::main()
1306
1307 There are many ways to get at this data - see the AnyEvent::Debug
1308 and AnyEvent::Log manpages for more details.
1309
1310 The most interesting and interactive way is to create a debug
1311 shell, for example by setting "AE_DEBUG_SHELL":
1312
1313 AE_DEBUG_WRAP=2 AE_DEBUG_SHELL=$HOME/myshell ./myprog
1314
1315 # while myprog is running:
1316 socat readline $HOME/myshell
1317
1318 Note that anybody who can access $HOME/myshell can make your
1319 program do anything he or she wants, so if you are not the only
1320 user on your machine, better put it into a secure location ($HOME
1321 might not be secure enough).
1322
1323 If you don't have "socat" (a shame!) and care even less about
1324 security, you can also use TCP and "telnet":
1325
1326 AE_DEBUG_WRAP=2 AE_DEBUG_SHELL=127.0.0.1:1234 ./myprog
1327
1328 telnet 127.0.0.1 1234
1329
1330 The debug shell can enable and disable tracing of watcher
1331 invocations, can display the trace output, give you a list of
1332 watchers and lets you investigate watchers in detail.
1333
1334 This concludes our little tutorial.
1335
1337 This introduction should have explained the key concepts of AnyEvent -
1338 event watchers and condition variables, AnyEvent::Socket - basic
1339 networking utilities, and AnyEvent::Handle - a nice wrapper around
1340 sockets.
1341
1342 You could either start coding stuff right away, look at those manual
1343 pages for the gory details, or roam CPAN for other AnyEvent modules
1344 (such as AnyEvent::IRC or AnyEvent::HTTP) to see more code examples (or
1345 simply to use them).
1346
1347 If you need a protocol that doesn't have an implementation using
1348 AnyEvent, remember that you can mix AnyEvent with one other event
1349 framework, such as POE, so you can always use AnyEvent for your own
1350 tasks plus modules of one other event framework to fill any gaps.
1351
Last but not least, you could also look at Coro, especially
Coro::AnyEvent, to see how you can turn event-based programming from
callback style back to the usual imperative style (also called
"inversion of control" - AnyEvent calls you, but Coro lets you call
AnyEvent).
1357
1359 Robin Redeker "<elmex at ta-sa.org>", Marc Lehmann
1360 <schmorp@schmorp.de>.
1361
1362
1363
1364perl v5.28.1 2014-11-19 AnyEvent::Intro(3)